The clCreateCommandQueue API creates a command queue on a device. Call it once for each queue you need; a device can have one or more command queues. The FPGA can contain multiple kernels, which can be either the same or different kernels. When developing the host application, there are two main programming approaches to execute kernels on a device:
- Single out-of-order command queue: Multiple kernel executions can be requested through the same command queue. XRT dispatches kernels as soon as possible, in any order, allowing concurrent kernel execution on the FPGA.
- Multiple in-order command queues: Each kernel execution is requested from a different in-order command queue. XRT dispatches kernels from the different command queues independently, so kernels enqueued on separate queues can run concurrently on the device.
Recommended: For improved performance, AMD recommends using a single out-of-order command queue and managing event dependencies and synchronization explicitly, instead of using multiple command queues.
The following is an example of standard API calls to create in-order and out-of-order command queues.
// Out-of-order command queue: enqueued commands can start in any order
commands = clCreateCommandQueue(context, device_id, CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE, &err);
// In-order command queue: a properties value of 0 selects the default in-order mode
commands = clCreateCommandQueue(context, device_id, 0, &err);