Loops can be flattened completely with the chess_flatten_loop pragma. This can be useful for small loops that the AI Engine compiler does not optimize well automatically. The loop count can be stated explicitly with the chess_loop_count pragma. For example:
for(int i=0;i<6;i++) chess_flatten_loop {...}
for(...) chess_loop_count(6) chess_flatten_loop {...}
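The two forms above can be sketched as complete functions. This is only an illustration under stated assumptions: the helper names scale6 and scale_n are hypothetical, and the pragmas are stubbed out as empty macros so that the sketch compiles with an ordinary host compiler. The stubs are not part of the AI Engine tools and would be removed when building with the AI Engine compiler.

```cpp
#include <cassert>

// Assumption: a host compiler does not know the Chess annotations, so this
// sketch defines them as empty macros. Remove these two stubs when building
// with the AI Engine compiler, which interprets the annotations itself.
#define chess_flatten_loop
#define chess_loop_count(n)

// Hypothetical helper: scales a buffer of exactly 6 elements in place.
// The trip count is a visible constant, so only the flatten pragma is used.
void scale6(int* buf, int factor) {
    for (int i = 0; i < 6; i++) chess_flatten_loop {
        buf[i] *= factor;
    }
}

// Hypothetical helper: the trip count arrives as a run-time argument, so the
// loop count is stated explicitly next to the flatten pragma.
void scale_n(int* buf, int n, int factor) {
    assert(n == 6);  // the stated loop count must match the actual one
    for (int i = 0; i < n; i++) chess_loop_count(6) chess_flatten_loop {
        buf[i] *= factor;
    }
}
```

In the second function the stated count is a promise made by the programmer, which is why the sketch guards it with an assertion.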
With chess_unroll_loop(N), the loop body can be duplicated N-1 times, and the loop count is divided by N. The loop can also be completely unrolled by chess_unroll_loop(*). The loop is unrolled and rewritten as a repeated sequence of similar independent statements.
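The effect of unrolling by N can be pictured by writing the transformation out by hand for N = 2. The sketch below is plain C++ and does not use the pragma. The function names are hypothetical, and it assumes the loop count is a multiple of the unroll factor. It shows only the shape of the rewritten loop, not the code the compiler actually emits.

```cpp
#include <cassert>

// Reference loop: one element per iteration, n iterations.
int sum_rolled(const int* data, int n) {
    int acc = 0;
    for (int i = 0; i < n; ++i) {
        acc += data[i];
    }
    return acc;
}

// Hand-written picture of unrolling by N = 2: the body appears twice
// (duplicated N-1 = 1 time) and the loop runs n / 2 iterations.
int sum_unrolled_by_2(const int* data, int n) {
    assert(n % 2 == 0);  // assumption: the count divides evenly by N
    int acc = 0;
    for (int i = 0; i < n; i += 2) {
        acc += data[i];      // original body
        acc += data[i + 1];  // duplicated body
    }
    return acc;
}
```

Both functions return the same result for any even n. The unrolled version runs half as many loop iterations and does twice the work in each.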
Loop flattening is done in the final scheduling phase, so code generation is still based on the loop construct. Loop unrolling, by contrast, duplicates the loop body, and each duplicated copy can be compiled differently. This can allow better software pipelining of loops, but it can also burden the scheduler when the unrolled loop count is large.