Because the kernel runs in programmable logic on the target platform, optimizing it for that
environment is an important part of application design. Most of the optimization techniques
discussed in C/C++ Kernels can also be applied to OpenCL kernels. Instead of the HLS pragmas
used for C/C++ kernels, you use the __attribute__ keyword described in OpenCL Attributes.
Following is an example:
// Process the whole image
__attribute__((xcl_pipeline_loop))
image_traverse: for (uint idx = 0, x = 0 , y = 0 ; idx < size ; ++idx, x+= DATA_SIZE)
{
...
}
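The example above specifies that the for loop, image_traverse, should be pipelined to improve
the performance of the kernel. Because no argument is given to the attribute, the target
initiation interval (II) is 1, meaning the compiler tries to start a new loop iteration every
clock cycle. For more information, refer to xcl_pipeline_loop.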
In the following code example, the watermark loop uses the opencl_unroll_hint attribute to ask
the Vitis compiler to unroll the loop, which reduces latency and improves performance. In this
case, however, the attribute is only a hint that the compiler can ignore if needed.
For details, refer to opencl_unroll_hint.
// Unroll the loop below to process all 16 pixels concurrently
__attribute__((opencl_unroll_hint))
watermark: for ( int i = 0 ; i < DATA_SIZE ; i++)
{
...
}
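The two attributes shown above can be combined in a single kernel. The following is a minimal sketch, not taken from the Vitis examples: the function name add_offset and the macros KERNEL, GLOBAL, PIPELINE_LOOP, and UNROLL_LOOP are hypothetical, and DATA_SIZE is assumed to be 16 to match the "16 pixels" comment. The macros expand to the OpenCL qualifiers and attributes only when the file is built by an OpenCL compiler, so the same source can also be compiled as plain C on the host for a quick functional check.

```c
/* Sketch only: builds as an OpenCL kernel, or as plain C for host-side checks. */
#ifdef __OPENCL_VERSION__
#define KERNEL        __kernel
#define GLOBAL        __global
#define PIPELINE_LOOP __attribute__((xcl_pipeline_loop(1)))  /* request II = 1 */
#define UNROLL_LOOP   __attribute__((opencl_unroll_hint))    /* hint: full unroll */
#else
/* Host build (assumption for illustration): drop OpenCL-only syntax. */
#define KERNEL
#define GLOBAL
#define PIPELINE_LOOP
#define UNROLL_LOOP
#endif

#define DATA_SIZE 16  /* assumed number of elements handled per outer iteration */

/* Hypothetical kernel: adds a constant offset to every element of the input. */
KERNEL void add_offset(GLOBAL const int *in, GLOBAL int *out, int size, int offset)
{
    /* Outer loop: pipelined so that a new chunk can start each cycle. */
    PIPELINE_LOOP
    for (int idx = 0; idx < size; idx += DATA_SIZE) {
        /* Inner loop: unrolled so that the DATA_SIZE elements are processed together. */
        UNROLL_LOOP
        for (int i = 0; i < DATA_SIZE; i++) {
            /* Guard the last chunk when size is not a multiple of DATA_SIZE. */
            if (idx + i < size) {
                out[idx + i] = in[idx + i] + offset;
            }
        }
    }
}
```

Keeping the attributes behind macros is only one possible arrangement. In a kernel that is never built on the host, the __attribute__ lines can be written directly above each loop, as in the examples above.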
For more information, review the OpenCL Attributes topics to see which optimizations are supported for OpenCL kernels, and review the C/C++ Kernels content to see how these optimizations can be applied in your kernel design.