The AI Engine compiler can reorder data movements and expressions within a basic block. When the compiler cannot see a data dependence, or when you want to intentionally insert a sequential point in the code, use the chess_separator_scheduler() pragma to separate parts of the code, as follows:
func(...){
    //code block 1
    chess_separator_scheduler();
    //code block 2
}
The chess_separator_scheduler(N) pragma is another form of the chess_separator_scheduler() pragma, where N specifies an additional N cycles between the two blocks. N can be positive or negative. With a negative offset, the two blocks are allowed to overlap partially, by up to |N| cycles.
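The effect of the offset can be pictured with a small host-side calculation. This is only a toy reading of the description above, not the compiler's scheduling algorithm: the function name earliest_block2_start and its parameters are hypothetical, and clamping the result to the start of block 1 is an assumption made to keep the model well defined.

```cpp
#include <algorithm>
#include <cassert>

// Toy model of the separator offset (an illustration, not the real scheduler).
// block1_start: cycle in which block 1 starts.
// block1_end:   first cycle after the last operation of block 1.
// n:            the offset N passed to chess_separator_scheduler(N).
// Returns the earliest cycle in which block 2 may start in this model.
int earliest_block2_start(int block1_start, int block1_end, int n) {
    assert(block1_end >= block1_start);
    // n > 0 leaves n extra cycles between the blocks, n == 0 behaves like the
    // plain separator, and n < 0 lets block 2 overlap the last |n| cycles of
    // block 1. Assumption: block 2 never starts before block 1 does.
    return std::max(block1_start, block1_end + n);
}
```

For a block 1 that occupies cycles 0 through 9, this model places block 2 at cycle 10 for N = 0, at cycle 12 for N = 2, and at cycle 7 for N = -3.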
For example, the compiler has no knowledge of dependences between the different streams of a kernel, so it can schedule reads or writes on different streams in the same cycle. If any stream read or write stalls the kernel, the kernel relies on an external source to supply or consume the data before it can continue. In the following example code, the stream write (to out) and the stream read (from receive_back) can be scheduled in the same cycle.
void producer(output_stream<int32> *out, input_stream<int32> *receive_back){
    int32 data;
    ...
    writeincr(out,data); //schedule in the same cycle
    readincr(receive_back); //schedule in the same cycle
    ...
}
The kernel above stalls if no data is available to read from the receive_back stream, and while it is stalled, no data is sent to the out stream. If the external source must receive data from producer before it sends data to receive_back, the kernel stalls and cannot recover. To schedule the stream operations in different cycles, add chess_separator_scheduler(N), as follows:
void producer(output_stream<int32> *out, input_stream<int32> *receive_back){
    int32 data;
    ...
    writeincr(out,data);
    //Make sure read occurs after write and data is sent out before a possible stall
    chess_separator_scheduler(1);
    readincr(receive_back);
    ...
}
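The difference between the two schedules can be reproduced with a small host-side model. This is a sketch under stated assumptions, not AI Engine code: the types Op and Schedule and the functions runs_to_completion, same_cycle_schedule, separated_schedule, and check_model are hypothetical names, the streams are modeled as unbounded word counters, and all operations placed in one cycle are assumed to issue together or not at all.

```cpp
#include <cassert>
#include <vector>

// Stream operations issued by the modeled kernel.
enum class Op { WriteOut, ReadBack };

// One inner vector per scheduled cycle. In this model every operation in a
// cycle issues together, so a stalled read also holds back a write that was
// scheduled in the same cycle.
using Schedule = std::vector<std::vector<Op>>;

// Returns true if the modeled kernel runs to completion, false if it deadlocks.
// The external peer sends one word on receive_back only after it has received
// one word on out.
bool runs_to_completion(const Schedule& schedule) {
    int out_words = 0;   // written by the kernel, not yet consumed by the peer
    int back_words = 0;  // sent by the peer, not yet read by the kernel
    for (const auto& cycle : schedule) {
        // The peer answers every word it has received so far.
        back_words += out_words;
        out_words = 0;

        int reads = 0;
        int writes = 0;
        for (Op op : cycle) {
            if (op == Op::ReadBack) ++reads; else ++writes;
        }
        // The peer has nothing left to answer, so a read that cannot be
        // served now can never be served: the kernel is deadlocked.
        if (reads > back_words) return false;
        back_words -= reads;
        out_words += writes;
    }
    return true;
}

// Write and read placed in the same cycle, as the compiler is free to do.
Schedule same_cycle_schedule() {
    return Schedule{std::vector<Op>{Op::WriteOut, Op::ReadBack}};
}

// Write and read forced into different cycles by a separator.
Schedule separated_schedule() {
    return Schedule{std::vector<Op>{Op::WriteOut}, std::vector<Op>{Op::ReadBack}};
}

// Usage: the same-cycle schedule deadlocks, the separated one completes.
inline void check_model() {
    assert(!runs_to_completion(same_cycle_schedule()));
    assert(runs_to_completion(separated_schedule()));
}
```

In the same-cycle schedule the read waits for a word that the peer sends only after the write completes, and the write cannot complete because it shares a cycle with the stalled read. Moving the read to a later cycle breaks that circular wait.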