Asynchronous Window Access - 2022.2 English

AI Engine Kernel and Graph Programming Guide (UG1079)

In some situations, if you are not consuming a window's worth of data on every invocation of a kernel, or if you are not producing a window's worth of data on every invocation, you can control the buffer synchronization by declaring the kernel port to be async, as shown in the following.

connect< window<128, 32> > net1 (in, async(first.in[0]));
connect< window<128, 32> > net2 (async(first.out[0]), out);

This declaration tells the compiler to omit synchronization of the window buffer upon entry to the kernel. You must call the window synchronization APIs inside the kernel code before accessing the window with the read/write APIs, as shown in the following.

void super_kernel(input_window<int32> * data, output_window<int32> * result) {
  window_acquire(data);     // acquire input window unconditionally inside the kernel
  if (<somecondition>) {
    window_acquire(result); // acquire output window conditionally
  }
  ...                       // do some computation with "data" and "result"
  window_release(data);     // release input window inside the kernel
  if (<somecondition>) {
    window_release(result); // release output window conditionally
  }
}

The window_acquire API performs the appropriate synchronization and initialization to ensure that the window object is available for read or write. The API keeps track of the appropriate buffer pointers and locks to be acquired internally, even if the window is shared across AI Engine processors and can be double-buffered. This API can be called unconditionally or conditionally under dynamic control, and is potentially a blocking operation. It is your responsibility to ensure that the corresponding window_release API is executed some time later (possibly even in a subsequent kernel call) to release the lock associated with that window object. Incorrect synchronization can lead to a deadlock in your code.
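
The acquire/release pairing described above can be illustrated with a small host-side model. This is only a sketch under stated assumptions: PingPongModel, produce, try_acquire, release, and count_successful_writes are hypothetical names invented for illustration and are not part of the AI Engine API. The model also returns false where the real window_acquire would block. It shows that a consumer that acquires a window but never releases it eventually stalls the producer once both buffers of a double-buffered window are occupied.

```cpp
#include <array>
#include <cassert>

// Hypothetical host-side model of a double-buffered (ping-pong) window.
// None of these names belong to the AI Engine API.
struct PingPongModel {
    std::array<bool, 2> full{{false, false}}; // true: buffer holds unreleased data
    int prod_idx = 0;        // buffer the producer fills next
    int cons_idx = 0;        // buffer the consumer reads next
    bool cons_holds = false; // consumer currently owns a buffer

    // Producer side: fill the next buffer only if it is empty.
    // Returns false where a real producer would stall on the lock.
    bool produce() {
        if (full[prod_idx]) return false;
        full[prod_idx] = true;
        prod_idx ^= 1;       // switch to the other buffer
        return true;
    }

    // Models acquiring an input window: succeeds only if data is ready.
    // The real call would block instead of returning false.
    bool try_acquire() {
        if (cons_holds || !full[cons_idx]) return false;
        cons_holds = true;
        return true;
    }

    // Models releasing an input window: hands the buffer back to the producer.
    void release() {
        assert(cons_holds);
        full[cons_idx] = false;
        cons_holds = false;
        cons_idx ^= 1;       // next acquire targets the other buffer
    }
};

// Counts how many of `calls` producer attempts succeed when the consumer
// acquires on every call but releases only if `consumer_releases` is true.
int count_successful_writes(bool consumer_releases, int calls) {
    PingPongModel w;
    int writes = 0;
    for (int i = 0; i < calls; ++i) {
        if (w.produce()) ++writes;
        if (w.try_acquire() && consumer_releases) w.release();
    }
    return writes;
}
```

In this model, count_successful_writes(true, 6) lets all six writes through, while count_successful_writes(false, 6) allows only two: the producer fills both buffers and then can make no further progress, which is the kind of stall that an unmatched acquire can cause.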