In a dataflow system like the one created with this approach, the slowest task is the bottleneck: the kernel can never run faster than its slowest task.
Throughput(Kernel) = min(Throughput(Task1), Throughput(Task2), …, Throughput(TaskN))
Therefore, during the decomposition process, always have the kernel throughput goal in mind and assess whether each sub-function will be able to satisfy this throughput goal.
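The min relation above can be checked on paper during decomposition, before any synthesis run. The following is a minimal sketch in plain C++ (host-side arithmetic only, not Vitis HLS kernel code). The `TaskEstimate` struct and the functions `bottleneck_index`, `kernel_throughput`, and `meets_goal` are hypothetical names introduced for illustration, and the throughput values fed into them are assumed to be the developer's own estimates, all in the same unit.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record: one task of the dataflow region and its estimated
// throughput. All tasks must use the same unit (for example, samples/second).
struct TaskEstimate {
    std::string name;
    double throughput;
};

// Index of the slowest task, i.e. the bottleneck of the dataflow region.
std::size_t bottleneck_index(const std::vector<TaskEstimate>& tasks) {
    assert(!tasks.empty());  // the min over zero tasks is undefined
    auto slowest = std::min_element(
        tasks.begin(), tasks.end(),
        [](const TaskEstimate& a, const TaskEstimate& b) {
            return a.throughput < b.throughput;
        });
    return static_cast<std::size_t>(slowest - tasks.begin());
}

// Throughput(Kernel) = min over all tasks of Throughput(Task).
double kernel_throughput(const std::vector<TaskEstimate>& tasks) {
    return tasks[bottleneck_index(tasks)].throughput;
}

// True if the estimated kernel throughput satisfies the throughput goal.
bool meets_goal(const std::vector<TaskEstimate>& tasks, double goal) {
    return kernel_throughput(tasks) >= goal;
}
```

If `meets_goal` returns false, `bottleneck_index` identifies the sub-function to optimize or decompose further; speeding up any other task leaves the kernel throughput unchanged.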
In the following steps of this methodology, the developer will get actual throughput numbers from running the Vitis HLS compiler. If these results fall short of the throughput goal and cannot be improved, the developer will have to iterate and further decompose the compute stages.