By default, each iteration of a loop starts only after the previous iteration
has finished. In the loop example below, a single iteration adds two
variables and stores the result in a third variable. Assume that in hardware this loop
takes three cycles to finish one iteration. Also assume that the loop bound len
is 20, so the vadd loop runs for 20 iterations in the kernel. The loop therefore
requires a total of 60 clock cycles (20 iterations * 3 cycles) to complete all of
its operations.
vadd: for(int i = 0; i < len; i++) {
    c[i] = a[i] + b[i];
}
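It is good practice to label loops, as shown in the code example above (vadd:…). This practice helps with debugging when working in the Vitis core development kit. Note that the labels
generate warnings during compilation, which can be safely ignored.
To pipeline the loop, place the HLS PIPELINE pragma inside the loop body:
vadd: for(int i = 0; i < len; i++) {
#pragma HLS PIPELINE
    c[i] = a[i] + b[i];
}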
In the example above, it is assumed that every iteration of the loop takes three cycles: read, add, and write. Without pipelining, each successive iteration starts every third cycle. With pipelining, the loop can start subsequent iterations in fewer than three cycles, such as every second cycle or every cycle.
The number of cycles between the start of one iteration and the start of the next
is called the initiation interval (II) of the pipelined loop. So II = 2 means each successive
iteration of the loop starts every two cycles. II = 1 is the ideal case, where each
iteration of the loop starts in the very next cycle. When you use pragma HLS PIPELINE,
the compiler always tries to achieve II = 1 performance.
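The effect of the initiation interval on total loop latency can be estimated with a simple cycle-count model. The sketch below is an idealized approximation, not output from the tool: it assumes a fixed per-iteration latency and ignores loop entry and exit overhead and memory stalls. The function names are hypothetical helpers and are not part of any Vitis API.

```cpp
#include <cassert>

// Idealized model: each iteration waits for the previous one to finish,
// so the total is simply iterations * latency of one iteration.
long sequential_cycles(long iterations, long iteration_latency) {
    assert(iterations >= 0 && iteration_latency >= 1);
    return iterations * iteration_latency;
}

// Idealized model: a new iteration starts every `ii` cycles, and the last
// iteration still needs `iteration_latency` cycles to complete.
long pipelined_cycles(long iterations, long iteration_latency, long ii) {
    assert(iterations >= 0 && iteration_latency >= 1 && ii >= 1);
    if (iterations == 0) {
        return 0;
    }
    return (iterations - 1) * ii + iteration_latency;
}
```

Under these assumptions, the 20-iteration vadd loop with a three-cycle iteration needs 60 cycles without pipelining, about 41 cycles at II = 2, and about 22 cycles at II = 1. With II equal to the iteration latency (II = 3), the model gives the same 60 cycles as the sequential case.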
The following figure illustrates the difference in execution between pipelined and non-pipelined loops. In this figure, (A) shows the default sequential operation where there are three clock cycles between each input read (II = 3), and it requires eight clock cycles before the last output write is performed.
If there are data dependencies inside a loop, as discussed in Loop Dependencies, it might not be possible to achieve II = 1, and a larger initiation interval might be the result.
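As an illustration of such a dependency, the sketch below shows a running sum in which each iteration reads the element written by the previous iteration. The function name running_sum and its signature are assumptions made for this example. Whether the tool reaches II = 1 for a loop of this shape depends on the target device, the clock constraint, and how the arrays are mapped to memory, so treat it as a pattern to watch for rather than a guaranteed outcome. A standard C++ compiler ignores the HLS pragma, so the function can also be checked in software.

```cpp
#include <cstddef>

// Running sum with a loop-carried dependency: iteration i reads c[i - 1],
// which iteration i - 1 has just written. The next iteration cannot read
// its input until that write is available, which can force II > 1.
void running_sum(const int* a, int* c, std::size_t len) {
    if (len == 0) {
        return;
    }
    c[0] = a[0];
    rsum: for (std::size_t i = 1; i < len; i++) {
#pragma HLS PIPELINE
        c[i] = c[i - 1] + a[i];
    }
}
```

In contrast, the vadd loop has no such dependency, because c[i] depends only on a[i] and b[i].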