Transferring Data between Software and PL Kernels

Vitis Unified Software Platform Documentation: Application Acceleration Development (UG1393)

Document ID: UG1393

Release Date: 2023-12-13

Version: 2023.2 English

Transferring data to and from the memory in the accelerator card or device uses the buffer objects (xrt::bo) created when Setting Up XRT-Managed Kernels and Kernel Arguments.

The class constructor typically allocates a regular 4 KB-aligned buffer object. The following code creates regular buffer objects, each with a host-side backing pointer allocated in user-space heap memory and a device-side buffer allocated in the memory bank associated with the kernel argument (krnl.group_id()). Optional flags in the xrt::bo constructor let you create non-standard buffer types for special circumstances, as described in Creating Special Buffers.

std::cout << "Allocate Buffer in Global Memory\n";
auto bo0 = xrt::bo(device, vector_size_bytes, krnl.group_id(0));
auto bo1 = xrt::bo(device, vector_size_bytes, krnl.group_id(1));
auto bo_out = xrt::bo(device, vector_size_bytes, krnl.group_id(2));

Important: A single buffer cannot be larger than 4 GB. To maximize throughput from the host to global memory, AMD also recommends keeping the buffer size at least 2 MB where possible.
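The size guidance above can be checked on the host before a buffer is allocated. The following is a minimal sketch, not part of the XRT API: the names check_buffer_size, chunks_needed, kMaxBufferBytes, and kRecommendedMinBytes are hypothetical, and the sketch assumes that "GB" and "MB" mean powers of two (2^30 and 2^20 bytes).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical constants derived from the guidance above (not XRT identifiers).
// Assumption: 4 GB and 2 MB are interpreted as binary units.
constexpr std::uint64_t kMaxBufferBytes      = 4ULL * 1024 * 1024 * 1024; // 4 GB upper limit
constexpr std::uint64_t kRecommendedMinBytes = 2ULL * 1024 * 1024;        // 2 MB recommended minimum

enum class BufferSizeCheck { TooLarge, BelowRecommended, Ok };

// Classify a requested size before passing it to the xrt::bo constructor.
constexpr BufferSizeCheck check_buffer_size(std::uint64_t bytes) {
    if (bytes > kMaxBufferBytes)      return BufferSizeCheck::TooLarge;
    if (bytes < kRecommendedMinBytes) return BufferSizeCheck::BelowRecommended;
    return BufferSizeCheck::Ok;
}

// Number of buffers needed to hold total_bytes when each buffer holds at most
// chunk_bytes. Written without (total + chunk - 1) to avoid integer overflow.
inline std::uint64_t chunks_needed(std::uint64_t total_bytes, std::uint64_t chunk_bytes) {
    assert(chunk_bytes > 0);
    return total_bytes / chunk_bytes + (total_bytes % chunk_bytes != 0 ? 1 : 0);
}
```

A data set classified as TooLarge would have to be split across several buffer objects; chunks_needed gives the count for a chosen per-buffer size.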

Once the buffer is established, there are several methods for transferring data between the software application and the kernel, as described below:

Using xrt::bo::sync()

Use xrt::bo::sync to synchronize data from the host to the device with the XCL_BO_SYNC_BO_TO_DEVICE flag, or from the device to the host with the XCL_BO_SYNC_BO_FROM_DEVICE flag. Pair it with xrt::bo::write to copy data from the host application into the buffer before syncing to the device, and with xrt::bo::read to copy data out of the buffer after syncing from the device.

bo0.write(buff_data);
bo0.sync(XCL_BO_SYNC_BO_TO_DEVICE);
bo1.write(buff_data);
bo1.sync(XCL_BO_SYNC_BO_TO_DEVICE);
...
bo_out.sync(XCL_BO_SYNC_BO_FROM_DEVICE);
bo_out.read(buff_data);

Note: If the buffer is created using a user-pointer as described in Creating Buffers from User Pointers, the xrt::bo::sync call is sufficient, and the xrt::bo::write or xrt::bo::read commands are not required.
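The ordering of write, sync, and read matters because a regular buffer object has two copies of the data: the host-side backing store and the device memory. The following toy model illustrates that ordering without requiring XRT or hardware. ToyBuffer, SyncDir, run_toy_kernel, and device_view are hypothetical names invented for this sketch; they only mimic the behavior described above and are not XRT types or calls.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Toy model only: stands in for the host-backing / device-memory split of a
// regular buffer object. None of these names belong to the XRT API.
enum class SyncDir { ToDevice, FromDevice };

class ToyBuffer {
public:
    explicit ToyBuffer(std::size_t bytes) : host_(bytes, 0), device_(bytes, 0) {}

    // Mimics xrt::bo::write: copies user data into the host backing store only.
    void write(const void* src) {
        assert(src != nullptr);
        std::memcpy(host_.data(), src, host_.size());
    }

    // Mimics xrt::bo::read: copies the host backing store out to user memory.
    void read(void* dst) const {
        assert(dst != nullptr);
        std::memcpy(dst, host_.data(), host_.size());
    }

    // Mimics xrt::bo::sync: moves data between host backing store and device copy.
    void sync(SyncDir dir) {
        if (dir == SyncDir::ToDevice) device_ = host_;
        else                          host_ = device_;
    }

    // Stand-in for a kernel run: increments every byte held in device memory.
    void run_toy_kernel() { for (auto& b : device_) ++b; }

    // Inspection helper for the sketch; real device memory is not visible this way.
    const std::vector<unsigned char>& device_view() const { return device_; }

private:
    std::vector<unsigned char> host_;   // host-side backing store
    std::vector<unsigned char> device_; // simulated device memory
};
```

In this model, data passed to write() does not reach device_view() until sync(SyncDir::ToDevice) is called, and read() returns stale data after run_toy_kernel() until sync(SyncDir::FromDevice) is called, which mirrors the write/sync and sync/read pairs in the snippet above.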

Using xrt::bo::map()

This method maps the host-side buffer backing pointer to a user pointer.

// Map the contents of the buffer object into host memory
auto bo0_map = bo0.map<int*>();
auto bo1_map = bo1.map<int*>();
auto bo_out_map = bo_out.map<int*>();

The software code can then use the mapped pointer for data reads and writes. However, after writing to the mapped pointer, or before reading from it, call xrt::bo::sync() with the appropriate direction flag to perform the DMA operation.

for (int i = 0; i < DATA_SIZE; ++i) {
   bo0_map[i] = i;
   bo1_map[i] = i;
}

// Synchronize buffer content with device side
bo0.sync(XCL_BO_SYNC_BO_TO_DEVICE);
bo1.sync(XCL_BO_SYNC_BO_TO_DEVICE);
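The same rule applies in the reverse direction: results produced on the device are visible through the mapped pointer only after a sync from the device. The following toy model sketches both directions without XRT or hardware. ToyMappedBuffer and its member functions are hypothetical names for this illustration and are not XRT types; the doubling "kernel" is a placeholder.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model only: a mapped pointer aliases the host backing store, while the
// simulated device memory is a separate copy. Not part of the XRT API.
template <typename T>
class ToyMappedBuffer {
public:
    explicit ToyMappedBuffer(std::size_t count) : host_(count), device_(count) {}

    // Mimics xrt::bo::map<T*>(): returns a pointer into the host backing store.
    T* map() { return host_.data(); }

    // Mimic sync in each direction; std::copy keeps the mapped pointer valid.
    void sync_to_device()   { std::copy(host_.begin(), host_.end(), device_.begin()); }
    void sync_from_device() { std::copy(device_.begin(), device_.end(), host_.begin()); }

    // Stand-in for a kernel run: doubles every element in device memory.
    void run_toy_kernel() { for (auto& v : device_) v *= 2; }

    // Inspection helper for the sketch only.
    T device_at(std::size_t i) const {
        assert(i < device_.size());
        return device_[i];
    }

private:
    std::vector<T> host_;   // host-side backing store (what map() exposes)
    std::vector<T> device_; // simulated device memory
};
```

Writes through the pointer returned by map() change only the host copy until sync_to_device() runs, and values computed by run_toy_kernel() appear behind the mapped pointer only after sync_from_device().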

There are additional buffer types and transfer scenarios supported by the XRT native API, as described in Miscellaneous Other Buffers.