The AI Engine graph allows memory-mapped connections to and from the global memory. The global memory can be either High Bandwidth Memory (HBM) on the device or DDR memory external to the device. Linear access to data between AI Engine and global memory through memory-mapped connections is possible through GMIO objects. To learn how to use GMIO in the AI Engine graph code, refer to Configuring input_gmio/output_gmio in the AI Engine Kernel and Graph Programming Guide (UG1079).
AI Engine-ML devices support an alternative to GMIO objects: external buffer objects, which also provide memory-mapped connections between global memory and the AI Engine in the graph. Unlike GMIOs, which access data linearly, external buffers can use more intricate data access patterns, specified through tiling parameters in the graph. For more information on external buffer usage, refer to AI Engine-ML External Memory Access in the AI Engine-ML Kernel and Graph Programming Guide (UG1603).
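To make the difference between linear and tiled access concrete, the following sketch lists the element offsets visited when a row-major 2D buffer is read front to back versus tile by tile. It is plain C++ for illustration only and does not use the graph's tiling parameters API. The helper names `linear_order` and `tiled_order` are hypothetical, and the sketch assumes that the tile dimensions divide the buffer dimensions evenly.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Offsets (in elements) visited when a rows x cols row-major buffer is read
// front to back, which is how a linear transfer walks memory.
std::vector<std::size_t> linear_order(std::size_t rows, std::size_t cols) {
    std::vector<std::size_t> order;
    order.reserve(rows * cols);
    for (std::size_t i = 0; i < rows * cols; ++i) {
        order.push_back(i);
    }
    return order;
}

// Offsets visited when the same buffer is read tile by tile: tiles are taken
// left to right, then top to bottom, and each tile is read row by row.
// Hypothetical helper; assumes the tile divides the buffer evenly.
std::vector<std::size_t> tiled_order(std::size_t rows, std::size_t cols,
                                     std::size_t tile_rows, std::size_t tile_cols) {
    assert(tile_rows > 0 && tile_cols > 0);
    assert(rows % tile_rows == 0 && cols % tile_cols == 0);
    std::vector<std::size_t> order;
    order.reserve(rows * cols);
    for (std::size_t tr = 0; tr < rows; tr += tile_rows) {
        for (std::size_t tc = 0; tc < cols; tc += tile_cols) {
            for (std::size_t r = 0; r < tile_rows; ++r) {
                for (std::size_t c = 0; c < tile_cols; ++c) {
                    order.push_back((tr + r) * cols + (tc + c));
                }
            }
        }
    }
    return order;
}
```

For a 4x4 buffer with 2x2 tiles, `tiled_order(4, 4, 2, 2)` begins 0, 1, 4, 5, 2, 3, 6, 7, whereas `linear_order(4, 4)` is simply 0 through 15. A linear transfer would deliver the data in the second order, so any reordering into tiles would have to happen elsewhere.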
The host code to control data transfers remains the same whether using GMIOs or external buffers to access data in global memory. Both methods support synchronous and asynchronous data transfer. For asynchronous data transfer, the async API of the xrt::aie::bo object manages the transfer and requires the GMIO or external buffer object name to be specified as the first parameter.
char* xclbinFilename = argv[1];
// Open xclbin
auto device = xrt::device(0); //device index=0
auto uuid = device.load_xclbin(xclbinFilename);
//Only non-cacheable buffer is supported
auto din_buffer = xrt::aie::bo (device, BLOCK_SIZE_in_Bytes, xrt::bo::flags::normal, /*memory group*/0);
int* dinArray= din_buffer.map<int*>();
//Only non-cacheable buffer is supported
auto dout_buffer = xrt::aie::bo (device, BLOCK_SIZE_in_Bytes, xrt::bo::flags::normal, /*memory group*/0);
int* doutArray= dout_buffer.map<int*>();
int ret=0;
int error=0;
//Initialization
for(int i=0;i<ITERATION*1024/4;i++){
dinArray[i]=i;
}
// Parameter "gr.gmPortIn" can be the name of a GMIO/external buffer object
din_buffer.async("gr.gmPortIn",XCL_BO_SYNC_BO_GMIO_TO_AIE,BLOCK_SIZE_in_Bytes,/*offset*/0);
auto ghdl=xrt::graph(device,uuid,"gr");
ghdl.run(ITERATION);
// Parameter "gr.gmPortOut" can be the name of a GMIO/external buffer object
auto dout_buffer_run=dout_buffer.async("gr.gmPortOut",XCL_BO_SYNC_BO_AIE_TO_GMIO,BLOCK_SIZE_in_Bytes,/*offset*/0);
ghdl.wait();//Wait for graph to complete
dout_buffer_run.wait();//Wait for gmioOut to complete
// Post-processing
...
XRT also provides the xrt::aie::buffer object to represent the buffer of the AI Engine. The following code uses this object to perform asynchronous buffer transactions similar to the xrt::aie::bo example:
//Only non-cacheable buffer is supported
auto din_buffer = xrt::aie::bo (device, BLOCK_SIZE_in_Bytes, xrt::bo::flags::normal, /*memory group*/0);
//Only non-cacheable buffer is supported
auto dout_buffer = xrt::aie::bo (device, BLOCK_SIZE_in_Bytes, xrt::bo::flags::normal, /*memory group*/0);
//"gr.gmioIn" is the name of the buffer which represents GMIO/External Buffer
xrt::aie::buffer bufIn(device, uuid, "gr.gmioIn");
bufIn.async(din_buffer, XCL_BO_SYNC_BO_GMIO_TO_AIE, BLOCK_SIZE_in_Bytes, 0);
xrt::aie::buffer bufOut(device, uuid, "gr.gmioOut");
bufOut.async(dout_buffer, XCL_BO_SYNC_BO_AIE_TO_GMIO, BLOCK_SIZE_in_Bytes, 0);
bufOut.wait();
For more details on the buffer object API support, refer to the xrt_aie.h file in the XRT Repository.
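The initialization loop in the first host-code example writes ITERATION*1024/4 integers into a buffer allocated with BLOCK_SIZE_in_Bytes bytes, which is only safe if the two sizes agree. The source does not state how BLOCK_SIZE_in_Bytes is defined, so the following sketch derives the element count from the byte size instead of hard-coding the divisor. The helper names `element_count` and `fill_with_index` are hypothetical, and `mapped` stands in for the pointer returned by `map<int*>()`.

```cpp
#include <cassert>
#include <cstddef>

// Number of whole elements of type T that fit in a buffer of size_in_bytes.
template <typename T>
constexpr std::size_t element_count(std::size_t size_in_bytes) {
    return size_in_bytes / sizeof(T);
}

// Fill a mapped buffer with its own indices without writing past its end.
// Hypothetical helper; the loop bound comes from the allocation size.
inline void fill_with_index(int* mapped, std::size_t size_in_bytes) {
    assert(mapped != nullptr);
    const std::size_t n = element_count<int>(size_in_bytes);
    for (std::size_t i = 0; i < n; ++i) {
        mapped[i] = static_cast<int>(i);
    }
}
```

In the host code, a call such as `fill_with_index(dinArray, BLOCK_SIZE_in_Bytes)` could replace the explicit loop, keeping the loop bound tied to the size passed when the buffer was created.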