In the Vitis core development kit, a device can contain multiple kernel instances, either of a single kernel or of different kernels.
While the OpenCL API clCreateSubDevices allows the host code to divide a device into multiple sub-devices, the Vitis core development kit supports only equally divided sub-devices (using CL_DEVICE_PARTITION_EQUALLY), each containing one kernel instance.
The following example shows how to:
- Create sub-devices by equal partition, so that each sub-device executes one kernel instance.
- Iterate over the sub-device list and use a separate context and command queue to execute the kernel on each sub-device.
Note: The API calls for kernel execution (and the corresponding buffer handling) are omitted for simplicity; they would be placed inside the function run_cu.
// `device` and `kernel` are assumed to have been created earlier
cl_uint num_devices = 0;
cl_device_partition_property props[3] = {CL_DEVICE_PARTITION_EQUALLY, 1, 0};
// First call: get the number of sub-devices
clCreateSubDevices(device, props, 0, nullptr, &num_devices);
// Container to hold the sub-devices
std::vector<cl_device_id> devices(num_devices);
// Second call: get the sub-device handles in devices.data()
clCreateSubDevices(device, props, num_devices, devices.data(), nullptr);
// Iterate over the sub-devices
std::for_each(devices.begin(), devices.end(), [kernel](cl_device_id sdev) {
    // Local error code; the lambda captures only `kernel`
    cl_int err = CL_SUCCESS;
    // Context for the sub-device
    auto context = clCreateContext(nullptr, 1, &sdev, nullptr, nullptr, &err);
    // Command queue for the sub-device
    auto queue = clCreateCommandQueue(context, sdev,
                                      CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE, &err);
    // Execute the kernel on the sub-device using the local context and queue
    run_cu(context, queue, kernel); // Function not shown
});
Important: As shown in the example, you must create a separate context for each sub-device. Though OpenCL supports a context that can hold multiple devices and sub-devices, XRT requires each device and sub-device to have a separate context.
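The example above calls clCreateSubDevices twice, first to obtain the number of sub-devices and then to fill a buffer of that size, and it ignores the return codes for brevity. The following is a minimal sketch of that count-then-fill idiom with status checks, written without any OpenCL dependency so that it can be compiled on its own. The names query_list and fake_query are hypothetical and are not part of OpenCL or XRT; fake_query only mimics the calling convention of clCreateSubDevices once the device and property list are bound, and a return value of 0 stands in for CL_SUCCESS.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper: generic "count, then fill" query.
// `query` must be callable as
//     int query(std::uint32_t capacity, T* out, std::uint32_t* count)
// and return 0 on success.
template <typename T, typename Query>
std::vector<T> query_list(Query query) {
    std::uint32_t count = 0;

    // First call: no output buffer, only ask how many entries exist.
    int status = query(0, nullptr, &count);
    if (status != 0) {
        throw std::runtime_error("count query failed");
    }

    std::vector<T> items(count);
    if (count == 0) {
        return items;  // nothing to fill
    }

    // Second call: pass a buffer of exactly `count` entries to be filled.
    status = query(count, items.data(), nullptr);
    if (status != 0) {
        throw std::runtime_error("fill query failed");
    }
    return items;
}

// Hypothetical stand-in for a runtime that exposes three handles.
// This is NOT an OpenCL function; it only imitates the two-call protocol.
inline int fake_query(std::uint32_t capacity, int* out, std::uint32_t* count) {
    // A caller must request the count, the entries, or both.
    assert(out != nullptr || count != nullptr);

    static const int handles[3] = {10, 20, 30};
    if (count != nullptr) {
        *count = 3;
    }
    if (out != nullptr) {
        if (capacity < 3) {
            return -1;  // buffer too small
        }
        for (std::uint32_t i = 0; i < 3; ++i) {
            out[i] = handles[i];
        }
    }
    return 0;
}
```

With OpenCL, the callable passed to query_list would presumably be a lambda that captures device and props and forwards its three arguments to clCreateSubDevices(device, props, capacity, out, count), with cl_device_id as the element type.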