Transferring Data between Software and PL Kernels - 2025.2 English - UG1700

Data Center Acceleration Using Vitis User Guide (UG1700)

Document ID
UG1700
Release Date
2025-11-20
Version
2025.2 English

Transferring data to and from the memory in the accelerator card or device uses the buffer objects (xrt::bo) created when Setting Up XRT-Managed Kernels and Kernel Arguments.

The class constructor typically allocates a regular 4K aligned buffer object. The following code creates regular buffer objects that have a software application backing pointer allocated by user space in heap memory, and a device-side buffer allocated in the memory bank associated with the kernel argument (krnl.group_id). Optional flags in the xrt::bo constructor let you create non-standard types of buffers for use in special circumstances as described in Creating Special Buffers.

std::cout << "Allocate Buffer in Global Memory\n";
auto bo0 = xrt::bo(device, vector_size_bytes, krnl.group_id(0));
auto bo1 = xrt::bo(device, vector_size_bytes, krnl.group_id(1));
auto bo_out = xrt::bo(device, vector_size_bytes, krnl.group_id(2));
Important: A single buffer cannot be bigger than 4 GB, yet to maximize throughput from the host to global memory, AMD also recommends keeping the buffer size at least 2 MB if possible.

With the buffer established, and filled with data, there are a number of methods to enable transfers between the software application and the kernel, as described below:

Using xrt::bo::sync()
Use xrt::bo::sync to sync data from the host to the device with XCL_BO_SYNC_TO_DEVICE flag, or from the device to the host with XCL_BO_SYNC_FROM_DEVICE flag using xrt::bo::write, or xrt::bo::read to write the buffer from the host application, or read the buffer from the device.
bo0.write(buff_data);
bo0.sync(XCL_BO_SYNC_BO_TO_DEVICE);
bo1.write(buff_data);
bo1.sync(XCL_BO_SYNC_BO_TO_DEVICE);
...
bo_out.sync(XCL_BO_SYNC_BO_FROM_DEVICE);
bo_out.read(buff_data);
Note: If the buffer is created using a user-pointer as described in Creating Buffers from User Pointers, the xrt::bo::sync call is sufficient, and the xrt::bo::write or xrt::bo::read commands are not required.
Using xrt::bo::map()
This method maps the host-side buffer backing pointer to a user pointer.
// Map the contents of the buffer object into host memory
auto bo0_map = bo0.map<int*>();
auto bo1_map = bo1.map<int*>();
auto bo_out_map = bo_out.map<int*>();

The software code can subsequently exercise the user pointer for data reads and writes. However, after writing to the mapped pointer (or before reading from the mapped pointer), use the xrt::bo::sync() command with the required direction flag for the DMA operation.

for (int i = 0; i < DATA_SIZE; ++i) {
   bo0_map[i] = i;
   bo1_map[i] = i;
}

// Synchronize buffer content with device side
bo0.sync(XCL_BO_SYNC_BO_TO_DEVICE);
bo1.sync(XCL_BO_SYNC_BO_TO_DEVICE);

There are additional buffer types and transfer scenarios supported by the XRT native API, as described in Miscellaneous Other Buffers.