An alternative strategy partitions the Hough Transform so each tile sees the full image. Each tile computes only a partial transform over a \(\theta\) subset. This achieves linear compute reduction but no input bandwidth reduction. This scheme benefits from linear reduction in tile histogram storage. Collect histogram outputs from each tile without extra compute workload.