OpenCL enqueue-based API calls are asynchronous unless a blocking flag (such as the blocking_read argument of clEnqueueReadBuffer) is set to CL_TRUE: the call returns as soon as the command has been placed in the command queue, not when the command has finished executing. To pause the host program until results are available, or to resolve dependencies among commands, use an API call such as clFinish or clWaitForEvents to block execution of the host program.
The following code shows examples of clFinish and clWaitForEvents.
err = clEnqueueTask(command_queue, kernel, 0, NULL, NULL);
// Execution will wait here until all commands in the command queue are finished
clFinish(command_queue);
// Create event, read memory from device, wait for read to complete, verify results
cl_event readevent;
// host memory for output vector
int host_mem_output_ptr[MAX_LENGTH];
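// Enqueue a non-blocking read (CL_FALSE) with an associated event object.
// With CL_TRUE the call itself would block until the read completes,
// which would make the clWaitForEvents call below redundant.
clEnqueueReadBuffer(command_queue, dev_mem_ptr, CL_FALSE, 0, sizeof(int) * number_of_words,
                    host_mem_output_ptr, 0, NULL, &readevent);
// Wait for the clEnqueueReadBuffer event to finish
clWaitForEvents(1, &readevent);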
// After read is complete, verify results
...
Note how the commands are used in the example above:
- The clFinish API is used explicitly to block host execution until the kernel has finished executing. This is necessary because otherwise the host could read back from the FPGA buffer too early and get garbage data.
- The data transfer from FPGA memory to the local host machine is done through clEnqueueReadBuffer. The last argument of clEnqueueReadBuffer returns an event object that identifies this particular read command and can be used to query the event or to wait for this command to complete. The clWaitForEvents command specifies a single event (readevent) and waits to ensure the data transfer is finished before the data is verified.