As with the consecutive loops discussed in the previous section, moving between rolled nested loops costs additional clock cycles: one cycle to move from an outer loop into an inner loop, and one cycle to move from an inner loop back to an outer loop.
In the small example shown here, this implies 200 extra clock cycles to execute loop Outer.
void foo_top(a, b, c, d) {
  ...
  Outer: while(j<100) {
    Inner: while(i<6) { // 1 cycle to enter inner
      ...
      LOOP_BODY
      ...
    } // 1 cycle to exit inner
  }
  ...
}
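The 200-cycle figure follows from the 100 iterations of loop Outer, each paying one cycle to enter Inner and one cycle to exit it. A minimal sketch of that arithmetic is shown below; `transition_overhead_cycles` and the two constants are hypothetical names used for illustration only, and the model deliberately ignores the latency of the loop body itself.

```cpp
#include <cstdint>

// Assumption taken from the text: 1 cycle to enter and 1 cycle to exit
// the inner loop on every iteration of the outer loop.
constexpr std::uint64_t kEnterInnerCycles = 1;
constexpr std::uint64_t kExitInnerCycles = 1;

// Hypothetical helper (not a Vitis HLS API): estimates only the
// loop-transition overhead of a rolled two-level loop nest.
constexpr std::uint64_t transition_overhead_cycles(std::uint64_t outer_trip_count) {
    return outer_trip_count * (kEnterInnerCycles + kExitInnerCycles);
}

// 100 outer iterations * 2 transition cycles = 200 extra cycles.
static_assert(transition_overhead_cycles(100) == 200,
              "overhead for the Outer/Inner example");
```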
Vitis HLS provides the set_directive_loop_flatten command to allow labeled perfect and semi-perfect nested loops to be flattened, removing the need to re-code for optimal hardware performance and reducing the number of cycles it takes to perform the operations in the loop.
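Conceptually, flattening turns the loop nest into a single loop whose trip count is the product of the individual trip counts, so the per-iteration enter and exit cycles disappear. The sketch below shows the equivalent transformation done by hand, purely to illustrate the idea; the tool performs flattening on the generated hardware rather than on the source. The function names, the array shape, and the summation body are assumptions made for this example.

```cpp
constexpr int kOuter = 100;  // trip count of the outer loop (assumed)
constexpr int kInner = 6;    // trip count of the inner loop (assumed)

// Rolled nested form, mirroring the Outer/Inner structure.
int sum_nested(const int data[kOuter][kInner]) {
    int acc = 0;
    Outer: for (int j = 0; j < kOuter; ++j) {
        Inner: for (int i = 0; i < kInner; ++i) {
            acc += data[j][i];
        }
    }
    return acc;
}

// Hand-flattened form: one loop of kOuter * kInner iterations.
// The original indices are recovered from the single counter k.
int sum_flattened(const int data[kOuter][kInner]) {
    int acc = 0;
    Flat: for (int k = 0; k < kOuter * kInner; ++k) {
        acc += data[k / kInner][k % kInner];
    }
    return acc;
}
```

Both functions visit the same 600 elements in the same order; only the loop structure differs.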
- Perfect loop nest
  - Only the innermost loop has loop body content, there is no logic specified between the loop statements, and all the loop bounds are constant.
- Semi-perfect loop nest
  - Only the innermost loop has loop body content and there is no logic specified between the loop statements, but the outermost loop bound can be a variable.
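The two definitions above can be illustrated with a pair of small functions. This is a sketch under assumed names (`scale_perfect`, `scale_semi_perfect`) and an assumed 4x8 buffer; the only difference between the two nests is whether the outermost bound is a compile-time constant or a run-time argument.

```cpp
constexpr int kRows = 4;  // assumed constant outer bound
constexpr int kCols = 8;  // assumed constant inner bound

// Perfect loop nest: all bounds are constants, and only the
// innermost loop contains body logic.
void scale_perfect(int buf[kRows][kCols], int factor) {
    Row: for (int r = 0; r < kRows; ++r) {
        Col: for (int c = 0; c < kCols; ++c) {
            buf[r][c] *= factor;
        }
    }
}

// Semi-perfect loop nest: same structure, but the outermost bound
// (rows) is a variable. The inner bound is still constant.
void scale_semi_perfect(int buf[][kCols], int rows, int factor) {
    Row: for (int r = 0; r < rows; ++r) {
        Col: for (int c = 0; c < kCols; ++c) {
            buf[r][c] *= factor;
        }
    }
}
```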
For imperfect loop nests, where the inner loop has variable bounds or the loop body is not exclusively inside the inner loop, designers should try to restructure the code, or unroll the loops in the loop body, to create a perfect loop nest.
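One common way to restructure an imperfect nest is to move the logic that sits between the loop statements into the innermost loop and guard it with a condition on the inner index. The sketch below shows this for a row-sum computation; the function names and the 4x8 array shape are assumptions for illustration, and whether this particular rewrite is the best choice depends on the design.

```cpp
constexpr int kRows = 4;  // assumed array shape for the example
constexpr int kCols = 8;

// Imperfect nest: the reset of row_sum[r] sits between the two
// loop statements, outside the innermost loop.
void row_sums_imperfect(const int in[kRows][kCols], int row_sum[kRows]) {
    Row: for (int r = 0; r < kRows; ++r) {
        row_sum[r] = 0;  // logic outside the inner loop
        Col: for (int c = 0; c < kCols; ++c) {
            row_sum[r] += in[r][c];
        }
    }
}

// Restructured as a perfect nest: the reset is folded into the
// innermost loop and applied on the first column of each row.
void row_sums_perfect(const int in[kRows][kCols], int row_sum[kRows]) {
    Row: for (int r = 0; r < kRows; ++r) {
        Col: for (int c = 0; c < kCols; ++c) {
            const int previous = (c == 0) ? 0 : row_sum[r];
            row_sum[r] = previous + in[r][c];
        }
    }
}
```

Both versions produce the same row sums; the second simply has no statements between the Row and Col loop headers.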
When the directive is applied to a set of nested loops, it should be applied to the innermost loop that contains the loop body.
set_directive_loop_flatten foo_top/Inner
Loop flattening can also be performed using the directive tab in the IDE, either on individual loops or, by applying the directive at the function level, on all loops in a function.
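The same optimization can also be requested in the source code with the loop_flatten pragma, placed inside the body of the innermost loop. The sketch below assumes a top function named `foo_top` with a 100x6 array interface and a trivial loop body; these details are illustrative only. A standard C++ compiler ignores the unknown pragma, so the function can still be compiled and simulated in software.

```cpp
constexpr int kOuterTrips = 100;  // matches the Outer bound in the example
constexpr int kInnerTrips = 6;    // matches the Inner bound in the example

// Hypothetical top function; the array interface and the "+ 1"
// body are assumptions made for this sketch.
void foo_top(const int in[kOuterTrips][kInnerTrips],
             int out[kOuterTrips][kInnerTrips]) {
    Outer: for (int j = 0; j < kOuterTrips; ++j) {
        Inner: for (int i = 0; i < kInnerTrips; ++i) {
// Requests flattening of Inner with the loops above it in the nest.
#pragma HLS loop_flatten
            out[j][i] = in[j][i] + 1;
        }
    }
}
```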