WebP Input Arguments:
Usage:
    cwebp -[-use_ocl -q -o]
        -xclbin:   the kernel file
        list.rst:  the input list
        -use_ocl:  should be kept
        -q:        compression quality
        -o:        output directory
Compared to the original cwebp command-line parameters, there are three differences. First, ‘-xclbin’ specifies the kernel file. Second, the input image file is replaced by a list file in which one or more input images are listed, one per line. Third, ‘-use_ocl’ enables the Vitis flow.
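As an illustration of the list-file convention described above, the following Python sketch writes a list file with one image path per line and assembles the matching command line. The helper names (`write_image_list`, `build_cwebp_command`) and the image names `1.png` and `2.png` are hypothetical and not part of the library; only the argument order is taken from the usage shown here.

```python
import shlex
import tempfile
from pathlib import Path


def write_image_list(images, list_path):
    """Write one image path per line (the layout described for the list file)."""
    list_path = Path(list_path)
    list_path.write_text("".join(f"{image}\n" for image in images))
    return list_path


def build_cwebp_command(xclbin, list_path, quality, out_dir, binary="./cwebp"):
    """Assemble the arguments in the same order as the documented usage."""
    return [
        binary,
        "-xclbin", str(xclbin),   # kernel file
        str(list_path),           # list file replacing the single input image
        "-use_ocl",               # must be kept to enable the Vitis flow
        "-q", str(quality),       # compression quality
        "-o", str(out_dir),       # output directory
    ]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical image names; replace them with real input files.
        list_file = write_image_list(["1.png", "2.png"], Path(tmp) / "list.rst")
        command = build_cwebp_command("kernel.xclbin", list_file, 80, "./images")
        print(shlex.join(command))
```

The command is built as an argument list rather than a single string so that it can be passed to `subprocess.run` without shell quoting issues.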
The following listing shows the host output when running on the board. The times reported in this output are not accurate.
    ./cwebp -xclbin kernel.xclbin list.rst -use_ocl -q 80 -o ./images
    INFO: CreateKernel start.
    INFO: Number of Platforms: 1
    INFO: Selected Platform: Xilinx
    INFO: Number of devices for platform 0: 2
    INFO: target_device found: xilinx_u200_gen3x16_xdma_base_2
    INFO: target_device chosen: xilinx_u200_gen3x16_xdma_base_2
    Info: Context created
    Info: Command queue created
    INFO: OpenCL Version: 1.-48
    INFO: Loading kernel.xclbin
    INFO: Loading kernel.xclbin Finished
    Info: Program created
    Info: Kernel created
    Info: Kernel created
    INFO: CreateKernel finished. Computation time is 328.504000 (ms)
    INFO: Create buffers started.
    INFO: Create buffers finished. Computation time is 48.225000 (ms)
    INFO: WebPEncodeAsync Starts...
    INFO: Nloop = 1
    INFO: VP8EncTokenLoopAsync starts ...
    *** Picture: 1 - 1, Buffer: 0, Instance: 0, Event: 0 ***
    HtoD webpen.c
    INFO: Host2Device finished. Computation time is 0.874000 (ms)
    INFO: PredKernel Finished. Computation time is 0.258000 (ms)
    INFO: ACKernel Finished. Computation time is 0.155000 (ms)
    INFO: Device2Host finished. Computation time is 0.118000 (ms)
    INFO: Loop of Pictures Finished. Computation time is 17.825000 (ms)
    INFO: VP8EncTokenLoopAsync Finished. Computation time is 24.683000 (ms)
    INFO: WebPEncodeAsync Finished. Computation time is 31.885000 (ms)
    INFO: Release Kernel.
    Info: Test passed
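The host output reports each stage as `INFO: <stage> finished. Computation time is <value> (ms)`. A minimal sketch for collecting these host-side figures into a dictionary is given below. The function name `parse_stage_times` is hypothetical, the pattern is inferred only from the output shown here, and the extracted values are no more accurate than the host output itself.

```python
import re

# Matches lines such as
#   "INFO: PredKernel Finished. Computation time is 0.258000 (ms)"
# The stage name is limited to word characters and spaces so that a match
# cannot run across neighbouring log entries.
TIMING_RE = re.compile(
    r"INFO:\s+(?P<stage>[\w ]+?)\s+[Ff]inished\.\s+"
    r"Computation time is\s+(?P<ms>\d+(?:\.\d+)?)\s+\(ms\)"
)

# Excerpt copied from the host output in this section.
SAMPLE_LOG = """\
INFO: CreateKernel finished. Computation time is 328.504000 (ms)
INFO: Create buffers finished. Computation time is 48.225000 (ms)
INFO: Host2Device finished. Computation time is 0.874000 (ms)
INFO: PredKernel Finished. Computation time is 0.258000 (ms)
INFO: ACKernel Finished. Computation time is 0.155000 (ms)
INFO: Device2Host finished. Computation time is 0.118000 (ms)
"""


def parse_stage_times(log_text):
    """Return {stage name: host-reported time in ms} for every timing entry."""
    return {
        match.group("stage"): float(match.group("ms"))
        for match in TIMING_RE.finditer(log_text)
    }


if __name__ == "__main__":
    for stage, ms in parse_stage_times(SAMPLE_LOG).items():
        print(f"{stage}: {ms} ms")
```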
To get accurate kernel execution times, add a file named “xrt.ini” and fill it with the following directives.
    #Start of Debug group
    [Debug]
    profile=true
    timeline_trace=true
    data_transfer_trace=fine
    app_debug=true
    opencl_summary=true
    opencl_trace=true

    #Start of Runtime group
    [Runtime]
    runtime_log = console
    Kernel Execution
    Kernel,Number Of Enqueues,Total Time (ms),Minimum Time (ms),Average Time (ms),Maximum Time (ms),
    webp_2_ArithmeticCoding_1,1,2.95381,2.95381,2.95381,2.95381,
    webp_IntraPredLoop2_NoOut_1,1,3.61861,3.61861,3.61861,3.61861,
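The Kernel Execution table above is comma-separated, with a trailing comma on every row. The sketch below shows one way to read it with Python's standard `csv` module. The function name `read_kernel_times` is hypothetical, and the input is assumed to start at the header row, that is, after the "Kernel Execution" title line.

```python
import csv
import io

# Rows copied from the Kernel Execution table in this section.
SAMPLE_SUMMARY = """\
Kernel,Number Of Enqueues,Total Time (ms),Minimum Time (ms),Average Time (ms),Maximum Time (ms),
webp_2_ArithmeticCoding_1,1,2.95381,2.95381,2.95381,2.95381,
webp_IntraPredLoop2_NoOut_1,1,3.61861,3.61861,3.61861,3.61861,
"""


def _trim(cells):
    """Strip whitespace and drop the empty cells left by trailing commas."""
    cells = [cell.strip() for cell in cells]
    while cells and cells[-1] == "":
        cells.pop()
    return cells


def read_kernel_times(csv_text):
    """Return {kernel name: {"enqueues", "total_ms", "average_ms"}}."""
    reader = csv.reader(io.StringIO(csv_text))
    header = _trim(next(reader))
    kernels = {}
    for raw in reader:
        cells = _trim(raw)
        if len(cells) != len(header):
            continue  # skip blank lines or rows with a different column count
        row = dict(zip(header, cells))
        kernels[row["Kernel"]] = {
            "enqueues": int(row["Number Of Enqueues"]),
            "total_ms": float(row["Total Time (ms)"]),
            "average_ms": float(row["Average Time (ms)"]),
        }
    return kernels


if __name__ == "__main__":
    for name, stats in read_kernel_times(SAMPLE_SUMMARY).items():
        print(f"{name}: {stats['total_ms']:.5f} ms over {stats['enqueues']} enqueue(s)")
```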
For more information about how to analyze performance, refer to Application Acceleration Development (UG1393).