When profiling is enabled in xrt.ini with opencl_trace, operational overhead is added and the timeline can show longer delays between events. However, XRT provides a no-overhead option that dumps the OpenCL events to the timeline without affecting host execution.
Tip: Do not enable any host-side profiling together with this option, because host-side profiling overrides the no-overhead approach. You can, however, combine this approach with device_trace, which profiles the device but does not add overhead to the host application.
Because the goal is to provide visibility into events with zero overhead, the number of events that can be logged is limited. Additionally, this view contains no command queue information, so it is not intended as a replacement for the more detailed Timeline Trace.
You can use the "no overhead" view to confirm OpenCL command dependencies and to observe actual event overhead for the command execution from the host application.
Add the following switch in xrt.ini to enable OpenCL events. The beginning and end of event capturing can be controlled as shown below:
[Debug]
xocl_debug=true
#xocl_event_begin=0 (default)
#xocl_event_end=1000 (default)
By default, only 1000 events can be visualized. To capture a different range of events, uncomment and adjust xocl_event_begin and xocl_event_end.
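If the capture window changes often between runs, the [Debug] section can be generated rather than edited by hand. The following is a minimal sketch using Python's standard configparser module. The helper name make_xrt_ini and its parameters are hypothetical, and the range check reflects an assumption that the end index must be greater than the begin index; the key names come from the listing above.

```python
import configparser
import io


def make_xrt_ini(event_begin=0, event_end=1000):
    """Return xrt.ini text that enables the no-overhead OpenCL event log.

    Hypothetical helper: the key names follow the [Debug] listing above,
    and the defaults mirror the documented defaults (0 and 1000).
    """
    # Assumption: a valid window starts at 0 or later and ends after it starts.
    if event_begin < 0 or event_end <= event_begin:
        raise ValueError("expected 0 <= event_begin < event_end")

    config = configparser.ConfigParser()
    config["Debug"] = {
        "xocl_debug": "true",
        "xocl_event_begin": str(event_begin),
        "xocl_event_end": str(event_end),
    }

    # Write key=value without spaces, matching the style of the listing.
    buffer = io.StringIO()
    config.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


if __name__ == "__main__":
    # Move the capture window to start at event 2000 and end at event 3000.
    print(make_xrt_ini(event_begin=2000, event_end=3000))
```

The function only returns text, so the result can be reviewed before it is saved as xrt.ini next to the host executable.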
After the run, if no xrt.run_summary is generated, you can use the following steps to generate a .wdb file to view in Vitis Analyzer:
vp_analyze xocl -i xocl.log          # generates debug_log.csv
vp_analyze trace -i debug_log.csv    # generates debug_log.wdb
vitis_analyzer debug_log.wdb         # loads the .wdb file in Vitis Analyzer
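The check for a missing run summary and the three conversion steps can be scripted. The sketch below only builds the command lines and does not execute them. The function name conversion_commands is hypothetical, and it assumes that xocl.log and xrt.run_summary are written to the run directory; the tool names, arguments, and file names are taken from the steps above.

```python
from pathlib import Path


def conversion_commands(run_dir=".", log_name="xocl.log"):
    """Return the manual conversion commands, or [] if a run summary exists.

    Hypothetical helper: it assumes xrt.run_summary, when generated,
    is located in run_dir.
    """
    if (Path(run_dir) / "xrt.run_summary").exists():
        # A run summary was generated, so no manual conversion is needed.
        return []
    return [
        ["vp_analyze", "xocl", "-i", log_name],          # generates debug_log.csv
        ["vp_analyze", "trace", "-i", "debug_log.csv"],  # generates debug_log.wdb
        ["vitis_analyzer", "debug_log.wdb"],             # opens the .wdb file
    ]


if __name__ == "__main__":
    # Print the commands for review instead of running them.
    for command in conversion_commands():
        print(" ".join(command))
```

Each returned entry is an argument list, so it can be passed to subprocess.run(command, check=True) once the printed commands look right for the environment.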
Figure 1. No Overhead Timeline
Trace Information Captured | Profile Overhead | Use Case | xrt.ini Switches |
---|---|---|---|
Complete | High | For debug purposes like the early stage of application development. | opencl_trace = true |
Partial | Low | When profile overhead is High and unexpected delay is experienced. | lop_trace = true |
Minimum | No | Only to confirm the cause of delay between events. | xocl_debug = true |