Pipelining for Throughput - 2023.1 English

Vitis Tutorials: Hardware Acceleration (XD099)

High-level synthesis can be very conservative by default: for example, the instructions of a loop body are executed in full at each iteration rather than in a staggered fashion. The staggered style of execution is explicitly enabled by the PIPELINE pragma, which reduces the initiation interval (II) of a function or loop (in this tutorial, it is applied to loops) by allowing the different operations to execute concurrently. A pipelined function or loop can then process new inputs every N clock cycles, where N is the II of the loop or function. The default II for the PIPELINE pragma is 1, which processes a new input every clock cycle. You can also specify the initiation interval with the II option.
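As a minimal sketch of how the pragma is typically placed, the loop below requests an II of 1 from inside the loop body. The function name `vadd_const`, its arguments, and the trip count `N` are hypothetical and chosen only for illustration; they are not taken from the tutorial sources. A standard C++ compiler ignores the pragma, so the code also runs in plain software simulation.

```cpp
#include <cstddef>

// Hypothetical fixed trip count, chosen only for this sketch.
constexpr std::size_t N = 4;

// Adds a constant to every element of the input array.
// vadd_const, in, out, and bias are illustrative names (assumptions).
void vadd_const(const int in[N], int out[N], int bias) {
    for (std::size_t i = 0; i < N; ++i) {
// Ask the HLS tool to start a new loop iteration every clock cycle.
// II=1 is also the default when the II option is omitted.
#pragma HLS PIPELINE II=1
        // Read, add, write: these operations can overlap across iterations
        // once the loop is pipelined.
        out[i] = in[i] + bias;
    }
}
```

Whether the tool actually reaches the requested II depends on the design, for example on memory port availability and loop-carried dependencies, so the synthesis report should be checked rather than assuming the request was met.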

Pipelining a loop allows its operations to be implemented so that they execute concurrently, as shown in the following animated figure. In that example, by default there are three clock cycles between each input read (so II=3), and it takes 12 clock cycles to fully execute the loop, compared to 6 when the pragma is used.
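The cycle counts quoted above can be reproduced with a simplified model in which the first iteration takes the full iteration latency and each later iteration starts II cycles after the previous one. The sketch below assumes a loop of 4 iterations with an iteration latency of 3 cycles; these values are not stated in the text and are chosen because they match the 12 and 6 cycle figures. The helper `loop_cycles` is a hypothetical name, and the latency reported by the tool may differ slightly from this idealized count.

```cpp
#include <cstdint>

// Simplified cycle model (an assumption, not the tool's exact report):
// the first iteration takes iteration_latency cycles, and every later
// iteration starts ii cycles after the one before it.
constexpr std::uint64_t loop_cycles(std::uint64_t trip_count,
                                    std::uint64_t iteration_latency,
                                    std::uint64_t ii) {
    return trip_count == 0 ? 0
                           : iteration_latency + ii * (trip_count - 1);
}

// Assumed example: 4 iterations, 3 cycles per iteration.
// Without pipelining, a new iteration starts every 3 cycles (II=3).
static_assert(loop_cycles(4, 3, 3) == 12, "non-pipelined loop");
// With the PIPELINE pragma at its default II=1.
static_assert(loop_cycles(4, 3, 1) == 6, "pipelined loop");
```

Under this model the saving grows with the trip count: the pipelined loop costs roughly one cycle per extra iteration, whereas the non-pipelined loop costs three.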