Nested Loops - 2020.2 English

Vitis Unified Software Platform Documentation: Application Acceleration Development (UG1393)

Document ID: UG1393
Release Date: 2021-03-22
Version: 2020.2 English

Coding with nested loops is a common practice. Understanding how loops are pipelined in a nested loop structure is key to achieving the desired performance.

If the HLS PIPELINE pragma is applied to a loop nested inside another loop, the v++ compiler attempts to flatten the loops into a single loop and applies the PIPELINE pragma to the resulting loop. Flattening improves kernel performance because the hardware no longer spends extra clock cycles entering and exiting the inner loop on every iteration of the outer loop.
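The effect of flattening can be pictured with a hand-written equivalent. The following is only a conceptual sketch, not the code the compiler actually generates: the function names scale_nested and scale_flat, the small array sizes, and the divide-by-two computation are all hypothetical and chosen for illustration.

```cpp
// Hypothetical sizes, kept small for illustration.
constexpr int MAX_HEIGHT = 4;
constexpr int MAX_WIDTH  = 8;

// Nested form: PIPELINE is applied to the inner loop.
void scale_nested(const unsigned char in[MAX_HEIGHT][MAX_WIDTH],
                  unsigned char out[MAX_HEIGHT][MAX_WIDTH]) {
  ROW_LOOP: for (int i = 0; i < MAX_HEIGHT; i++) {
    COL_LOOP: for (int j = 0; j < MAX_WIDTH; j++) {
      #pragma HLS PIPELINE
      out[i][j] = in[i][j] / 2;
    }
  }
}

// Conceptual flattened form: one loop covers every (i, j) pair,
// and the row/column indices are updated inside the loop body.
void scale_flat(const unsigned char in[MAX_HEIGHT][MAX_WIDTH],
                unsigned char out[MAX_HEIGHT][MAX_WIDTH]) {
  int i = 0;
  int j = 0;
  FLAT_LOOP: for (int k = 0; k < MAX_HEIGHT * MAX_WIDTH; k++) {
    #pragma HLS PIPELINE
    out[i][j] = in[i][j] / 2;
    if (j == MAX_WIDTH - 1) {  // end of a row: wrap to the next row
      j = 0;
      i++;
    } else {
      j++;
    }
  }
}
```

Both functions produce the same output. The flattened form has a single loop to pipeline, so there is no separate inner loop to enter and exit for each row.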

The compiler is able to flatten the following types of nested loops:

  1. Perfect nested loop:
    • Only the inner loop has a loop body.
    • There is no logic or operations specified between the loop declarations.
    • All the loop bounds are constant.
  2. Semi-perfect nested loop:
    • Only the inner loop has a loop body.
    • There is no logic or operations specified between the loop declarations.
    • The inner loop bound must be a constant, but the outer loop bound can be a variable.
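A loop nest that has logic between the loop declarations does not meet either set of criteria above. One common way to restructure such a nest is to move that logic into the inner loop body and guard it with conditions on the inner loop index. The following is a minimal sketch under that assumption; the function names, the ROWS and COLS constants, and the row-sum computation are hypothetical, and the result for a real design should be confirmed in the compiler reports.

```cpp
// Hypothetical sizes for illustration.
constexpr int ROWS = 3;
constexpr int COLS = 4;

// Imperfect nest: statements sit between the two loop declarations
// and after the inner loop.
void row_sums_imperfect(const int in[ROWS][COLS], int sums[ROWS]) {
  ROW_LOOP: for (int i = 0; i < ROWS; i++) {
    int acc = 0;                      // logic before the inner loop
    COL_LOOP: for (int j = 0; j < COLS; j++) {
      #pragma HLS PIPELINE
      acc += in[i][j];
    }
    sums[i] = acc;                    // logic after the inner loop
  }
}

// Restructured nest: only the inner loop has a body, and both
// loop bounds are constant.
void row_sums_perfect(const int in[ROWS][COLS], int sums[ROWS]) {
  int acc = 0;
  ROW_LOOP: for (int i = 0; i < ROWS; i++) {
    COL_LOOP: for (int j = 0; j < COLS; j++) {
      #pragma HLS PIPELINE
      if (j == 0) {
        acc = 0;                      // reset at the start of each row
      }
      acc += in[i][j];
      if (j == COLS - 1) {
        sums[i] = acc;                // write back at the end of each row
      }
    }
  }
}
```

The second version computes the same sums, with the per-row setup and write-back expressed as conditional statements inside COL_LOOP.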

The following code example illustrates the structure of a perfect nested loop:

ROW_LOOP: for(int i=0; i< MAX_HEIGHT; i++) {
  COL_LOOP: for(int j=0; j< MAX_WIDTH; j++) {
    #pragma HLS PIPELINE
    // Main computation per pixel
  }
}

The above example shows a nested loop structure with two loops that performs some computation on incoming pixel data. In most cases, you want to process one pixel every clock cycle, so PIPELINE is applied to the body of the inner loop. The compiler is able to flatten the nested loop structure in the example because it is a perfect nested loop.

The nested loop in the preceding example contains no logic between the ROW_LOOP and COL_LOOP declarations; all of the processing logic is inside COL_LOOP. Both loops also have a fixed number of iterations. These two criteria allow the v++ compiler to flatten the loops and apply the PIPELINE constraint.

Recommended: Always try to give the inner loop a constant bound. If only the outer loop has a variable bound, the compiler can still flatten the loop nest.
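The following sketch illustrates a semi-perfect nested loop that follows this recommendation: the outer bound is a run-time argument and the inner bound is a compile-time constant. The function name invert_rows, the height argument, the flat buffer layout, and the inversion computation are hypothetical and used only for illustration.

```cpp
// Hypothetical constant row width.
constexpr int MAX_WIDTH = 8;

// Inverts the first `height` rows of a row-major image buffer.
void invert_rows(const unsigned char *in, unsigned char *out, int height) {
  ROW_LOOP: for (int i = 0; i < height; i++) {        // variable outer bound
    COL_LOOP: for (int j = 0; j < MAX_WIDTH; j++) {   // constant inner bound
      #pragma HLS PIPELINE
      // Main computation per pixel
      out[i * MAX_WIDTH + j] = 255 - in[i * MAX_WIDTH + j];
    }
  }
}
```

There is still no logic between the two loop declarations, and only the inner loop has a body, so the nest matches the semi-perfect criteria listed earlier.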