HW Emulation / HW Run - 2024.2 English

Vitis Libraries

Release Date
2025-03-21
Version
2024.2 English

For x86 simulation and AIE simulation, the top-level application had simple ADF API calls to initialize, run, and end the graph. However, for actual AI Engine graph applications, the host code must do much more than those simple tasks. The top-level PS application running on the Cortex®-A72 controls the graph and the PL kernels: it manages data inputs to the graph, handles data outputs from the graph, and controls any PL kernels working with the graph. Sample code is illustrated below.

1. Open device, load xclbin, and get uuid.

xF::deviceInit(xclBinName);

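2. Allocate input and output buffer objects and map them to host memory.

void* srcData = nullptr;
xrt::bo src_hndl = xrt::bo(xF::gpDhdl, (srcImageR.total() * srcImageR.elemSize()), 0, 0);
srcData = src_hndl.map();
memcpy(srcData, srcImageR.data, (srcImageR.total() * srcImageR.elemSize())); // copy the input image into the mapped buffer

// Output buffer; its size is assumed here to match the input image
void* dstData = nullptr;
xrt::bo dst_hndl = xrt::bo(xF::gpDhdl, (srcImageR.total() * srcImageR.elemSize()), 0, 0);
dstData = dst_hndl.map();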

3. Get kernel and run handles, set arguments for kernel, and launch kernel.
xrt::kernel s2mm_khdl = xrt::kernel(xF::gpDhdl, xF::xclbin_uuid, "s2mm"); // Open kernel handle
xrt::run s2mm_rhdl = s2mm_khdl(dst_hndl, nullptr, OUTPUT_SIZE); // set kernel arguments and start the run

// Update graph parameters (RTP) and so on
auto gHndl = xrt::graph(xF::gpDhdl, xF::xclbin_uuid, "filter_graph");
gHndl.reset();
gHndl.update("filter_graph.k1.in[1]", float2fixed_coeff<10, 16>(kData));
gHndl.run(1);
gHndl.wait();

4. Wait for kernel completion.
s2mm_rhdl.wait();

5. Sync output device buffer objects to host memory.

dst_hndl.sync(XCL_BO_SYNC_BO_FROM_DEVICE);

6. Post-process the output in host memory ("host_out").

Vitis Vision AIE library functions provide optimized vector implementations of various computer vision algorithms. These functions are expected to process high-resolution images. However, because local memory in the AIE core modules is limited, an entire image cannot fit into it. Accessing DDR directly to read or write image data would also be highly inefficient in both performance and power. To overcome this limitation, the host code is expected to split the high-resolution image into smaller tiles that fit in AI Engine local memory in ping-pong fashion. Splitting a high-resolution image into tiles is a complex operation because it must account for overlap regions and borders. The tile size is also expected to be aligned with the vectorization factor of the kernel.

To facilitate this, the Vitis Vision Library provides data movers that perform smart tiling and stitching of high-resolution images and meet all of the above requirements. Two versions are available, providing data movement over either the PLIO or the GMIO interface. A high-level class abstraction with a simple API handles the data transfers and allows a seamless transition between the PLIO and GMIO transfer methods.
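To make the tiling requirements concrete, the following is a minimal sketch of how an image could be split into overlapping tiles whose nominal width is aligned to a vectorization factor. It is not the library's data mover implementation: the `Tile` and `Span` structs and the `alignDown`, `splitAxis`, and `makeTiles` functions are hypothetical names introduced only for illustration. The sketch assumes a symmetric overlap on every side and simply clamps tiles at the image border, so border tiles come out smaller and possibly unaligned, a case a real data mover would have to handle.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical descriptor for one tile; not the library's metadata format.
struct Tile {
    int x;      // left edge of the region read from the image
    int y;      // top edge of the region read from the image
    int width;  // width of the region read, overlap included
    int height; // height of the region read, overlap included
};

// One span along a single axis: start offset and length.
struct Span {
    int start;
    int length;
};

// Round value down to the nearest multiple of factor.
int alignDown(int value, int factor) {
    assert(factor > 0);
    return (value / factor) * factor;
}

// Split [0, length) into spans of at most tileSize pixels. Each span owns
// tileSize - 2 * overlap fresh pixels and is widened by overlap pixels on
// both sides, clamped at the image border.
std::vector<Span> splitAxis(int length, int tileSize, int overlap) {
    int core = tileSize - 2 * overlap;
    assert(length > 0 && overlap >= 0 && core > 0);
    std::vector<Span> spans;
    for (int coreStart = 0; coreStart < length; coreStart += core) {
        int begin = std::max(0, coreStart - overlap);
        int end = std::min(length, coreStart + core + overlap);
        spans.push_back({begin, end - begin});
    }
    return spans;
}

// Build the tile list in row-major order. The nominal tile width is rounded
// down to a multiple of vecFactor before the image is split.
std::vector<Tile> makeTiles(int imgWidth, int imgHeight, int maxTileWidth,
                            int maxTileHeight, int overlap, int vecFactor) {
    int tileWidth = alignDown(maxTileWidth, vecFactor);
    std::vector<Span> rows = splitAxis(imgHeight, maxTileHeight, overlap);
    std::vector<Span> cols = splitAxis(imgWidth, tileWidth, overlap);
    std::vector<Tile> tiles;
    for (const Span& row : rows) {
        for (const Span& col : cols) {
            tiles.push_back({col.start, row.start, col.length, row.length});
        }
    }
    return tiles;
}
```

For example, `makeTiles(64, 32, 40, 20, 1, 16)` rounds the nominal tile width down to 32 and produces a 3 x 2 grid of tiles in which neighbouring tiles share a two-pixel band. That shared band is the overlap that a stitching step would have to discard when reassembling the output image.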