Scheduling Separator - 2022.1 English

AI Engine Kernel Coding Best Practices Guide (UG1079)

Document ID: UG1079

Release Date: 2022-05-25

Version: 2022.1 English

The AI Engine compiler can reorder data movements and expressions within a basic block. When the compiler cannot see a data dependence, or when you want to intentionally insert a sequence point in the code, use the chess_separator_scheduler() pragma to separate parts of the code, as follows:

func(...){
  //code block 1
  chess_separator_scheduler();
  //code block 2
}

The chess_separator_scheduler(N) pragma is another form of the chess_separator_scheduler() pragma, where the offset N requests N additional cycles between the two blocks. N can be positive or negative. A negative offset allows the two blocks to overlap partially, by up to |N| cycles.
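As a rough sketch of the offset form, the fragment below places a separator with a negative offset between two independent straight-line blocks. The function name mac_pair, the offset value -2, and the host fallback macro are assumptions made for illustration and do not come from this guide. The fallback assumes the AI Engine compiler predefines __chess__; with a regular C++ compiler the pragma expands to nothing, so the code compiles but no scheduling effect can be observed.

```cpp
#include <cassert>
#include <cstdint>

// Host-side fallback (assumption): when not building with the AI Engine
// compiler, expand the pragma to nothing so this sketch still compiles.
#ifndef __chess__
#define chess_separator_scheduler(...) ((void)0)
#endif

// Hypothetical helper: two independent blocks with a separator between them.
int32_t mac_pair(const int32_t *x, const int32_t *y) {
  assert(x != nullptr && y != nullptr);  // host-side sanity check only

  // code block 1
  int32_t p = x[0] * y[0];

  // Negative offset: block 2 may overlap the end of block 1 by up to
  // 2 cycles. A positive N would request extra cycles between them instead.
  chess_separator_scheduler(-2);

  // code block 2
  int32_t q = x[1] * y[1];

  return p + q;
}
```

The separator changes only how the two blocks are scheduled relative to each other; the value returned is the same with or without it.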

For example, the compiler has no knowledge of dependencies between the different streams of a kernel, so it can schedule reads or writes on different streams in the same cycle. When a stream read or write stalls the kernel, the kernel depends on an external source to supply or consume the data before it can continue. In the following example code, the stream write (to out) and the stream read (from receive_back) can be scheduled in the same cycle.

void producer(output_stream<int32> *out, input_stream<int32> *receive_back){
  int32 data;
  ...
  writeincr(out,data); //schedule in the same cycle
  readincr(receive_back); //schedule in the same cycle
  ...
}

The above kernel stalls if no data is available to read from stream receive_back, and while it is stalled, no data is sent to stream out. If the external source must receive data from out before it sends data to receive_back, the kernel stalls and cannot recover. To schedule the stream operations in the required order, add chess_separator_scheduler(N), as follows:

void producer(output_stream<int32> *out, input_stream<int32> *receive_back){
  int32 data;
  ...
  writeincr(out,data);
  //Ensure the write is issued before the read, so the data is sent out before the kernel can stall on the read
  chess_separator_scheduler(1);
  readincr(receive_back);
  ...
}
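The ordering problem can be illustrated with a small host-side model. This is a simplified sketch, not AI Engine code: ModelStream, peer_step, and producer_completes are hypothetical names, the streams are modeled as plain queues, and the external source is assumed to reply on receive_back only after it has received a value on out. The model treats the same-cycle schedule as its worst case, in which the stalled read also holds back the write.

```cpp
#include <cassert>
#include <cstdint>
#include <queue>

// Hypothetical host-side stand-in for a stream (not the AI Engine stream type).
using ModelStream = std::queue<int32_t>;

// Hypothetical external peer: it replies on receive_back only after
// it has received a value on out.
void peer_step(ModelStream &out, ModelStream &receive_back) {
  if (!out.empty()) {
    receive_back.push(out.front() + 1);
    out.pop();
  }
}

// Returns true if the modeled producer can complete its read,
// false if it would stay stalled on an empty receive_back.
bool producer_completes(bool write_before_read) {
  ModelStream out, receive_back;
  const int32_t data = 42;

  if (write_before_read) {
    out.push(data);                // models writeincr(out, data) issued first
    peer_step(out, receive_back);  // peer sees the data and replies
    assert(out.empty());           // the peer consumed the value
  } else {
    // Models the same-cycle schedule: the stalled read holds back the
    // write, so the peer never receives anything to reply to.
    peer_step(out, receive_back);
  }

  // Models whether readincr(receive_back) has data to return.
  return !receive_back.empty();
}
```

In this model, producer_completes(true) corresponds to the kernel with the separator and finishes its read, while producer_completes(false) corresponds to the unseparated schedule and never receives a reply.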