Pipelining loop3 and new analysis

Vitis Tutorials: Vitis HLS (XD261)

Document ID: XD261
Release Date: 2024-06-19
Version: 2024.1 English

Let’s try to improve the performance of loop3 by applying a pipeline pragma to the innermost loop. On the graph, right-click the process and select “goto source”; this brings you to the call site.

Code Analyzer process loop3 goto source

Insert a new line after the for statement of the innermost loop and type #pragma HLS PIPELINE II=1. The editor suggests possible completions as you type. Save the file (it should be saved automatically), run C SIMULATION again, then open Code Analyzer in the report section. Notice the updated Transaction Interval estimate, TI=290. Expand the code by clicking the down-pointing arrow.

Code Analyzer process loop3 after pipeline
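
As a minimal sketch of where the pragma goes, the following nested loop places it on the first line of the innermost loop body. The function name, array names, and loop body are hypothetical stand-ins, not the tutorial's actual loop3 source; only the pragma line and the trip count of 16 come from the text. A standard C++ compiler ignores the HLS pragma (possibly with a warning), so the code also runs in plain C simulation.

```cpp
constexpr int N = 16;  // trip count of both loops, matching the 16*16 case in the text

// Hypothetical stand-in for loop3: the names and the arithmetic are assumptions.
void scale_rows(const int in[N][N], int out[N][N]) {
    for (int i = 0; i < N; ++i) {        // outermost loop, left unpipelined
        for (int j = 0; j < N; ++j) {    // innermost loop
#pragma HLS PIPELINE II=1
            // Request one new iteration of the innermost loop per clock cycle.
            out[i][j] = in[i][j] * 2 + i;
        }
    }
}
```

The pragma applies to the loop whose body contains it, which is why it sits directly after the innermost for statement rather than before it.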

We can observe that Code Analyzer took the pipeline pragma into account and estimated that II=1 is now achievable. As before, this means the innermost loop achieves TI=TRIPCOUNT*II=16*1=16.

Because the innermost loop is pipelined, the hardware module generated for it is independent of the outermost loop’s module, and each loop has its own finite state machine (FSM). Entering and exiting the inner loop’s FSM costs the outermost loop 2 extra clock cycles. For this reason, the outermost loop’s II is no longer just the sum of the TIs of its statements, as in the previous situation; we must add those 2 cycles: II=16+2=18.

You don’t need to remember all the details, but it is useful to understand how the estimates are computed and how they influence their parent hierarchy.

The Transaction Interval of the outermost loop is then TI=TRIPCOUNT*II=16*18=288.

For the extracted process itself, we add 2 more cycles for the process’s own finite state machine, which gives an overall Transaction Interval for the pipelined loop3 of TI=288+2=290.
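
The arithmetic above can be replayed as a small compile-time calculation. This is only a sketch that mirrors the numbers given in the text; the helper name loop_ti and the constant kFsmOverhead are made up for illustration and are not part of Vitis HLS or of Code Analyzer's actual estimation model.

```cpp
// Extra cycles to enter and exit a nested FSM, as stated in the text.
constexpr int kFsmOverhead = 2;

// Transaction Interval of a loop: TI = TRIPCOUNT * II.
constexpr int loop_ti(int trip_count, int ii) { return trip_count * ii; }

constexpr int inner_ti   = loop_ti(16, 1);           // pipelined innermost loop: 16
constexpr int outer_ii   = inner_ti + kFsmOverhead;  // one outer iteration: 16 + 2 = 18
constexpr int outer_ti   = loop_ti(16, outer_ii);    // outermost loop: 16 * 18 = 288
constexpr int process_ti = outer_ti + kFsmOverhead;  // extracted process: 288 + 2 = 290

static_assert(process_ti == 290, "matches the TI reported for the pipelined loop3");
```

Each level of the hierarchy feeds the next: the inner TI becomes part of the outer II, and the outer TI becomes part of the process TI.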

We can check and analyze the other processes, but there are some simplifications and optimizations that we can perform. Let’s look at the Channel table to see how it can help us.