To collect trace data during a hardware run, dedicated routes must carry the trace data from the AI Engine array to the PL or to the DDR. For this reason, event trace and the interface it uses must be specified during the graph compilation phase, as in the following command.
v++ -c --mode aie --verbose --pl-freq=100 --workdir=./myWork \
--event-trace-port=gmio --event-trace=runtime \
--num-trace-streams=8 --xlopt=0 --include="./" \
--include="./src" --include="./src/kernels" --include="./data" \
./src/graph.cpp
- For --event-trace=runtime: runtime is the only possible value. It indicates that the signals to trace are selected at run time.
- For --event-trace-port=plio/gmio: selects the interface used to carry the trace data. With gmio, trace data goes through GMIO and the NoC pathway instead of the PLIO/PL pathway. PLIO uses PL logic, which can make timing closure more difficult.
- For --num-trace-streams=8: up to 16 streams can be used within the AI Engine array to drive the trace events to the GMIO/PLIO.
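The constraints listed above can be checked before the compiler is invoked. The following is a minimal sketch and not part of the tool flow: trace_flags is a hypothetical helper name, and the lower bound of one stream is an assumption, since the text only gives the upper bound of 16.

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints the trace-related v++ options after basic checks.
# Usage: trace_flags <plio|gmio> <num_streams>
trace_flags() {
    local port="$1" streams="$2"

    # Only the two port types named in the text are accepted.
    case "$port" in
        plio|gmio) ;;
        *) echo "error: trace port must be plio or gmio" >&2; return 1 ;;
    esac

    # The upper bound of 16 comes from the text; the lower bound of 1 is an assumption.
    # 10# forces base 10 so that a value such as 08 is not parsed as octal.
    if ! [[ "$streams" =~ ^[0-9]+$ ]] || (( 10#$streams < 1 || 10#$streams > 16 )); then
        echo "error: number of trace streams must be between 1 and 16" >&2
        return 1
    fi

    # runtime is the only value listed for --event-trace.
    echo "--event-trace=runtime --event-trace-port=${port} --num-trace-streams=${streams}"
}

# Example: print the options used in the command above.
trace_flags gmio 8
```

The printed options can then be pasted into, or substituted in, the v++ command line. An invalid port or a stream count above 16 makes the helper return a non-zero status instead of printing options.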
For the profiling flow, you can perform event trace using either XSDB or XRT flow.
The metrics for the array are described in the following tables, starting with the AI Engine tiles.
| Metric Name | Description |
|---|---|
| functions | Basic time line of function activity: events generated when kernel functions are being invoked and returned |
| partial_stalls | Three types of core stalls are being registered: stream stalls (no data at input or back-pressure at output), cascade stalls and lock stalls. |
| all_stalls | Same as partial_stalls with memory_stalls (memory conflict) added. |
| all_dma | Data transfers of all 4 Memory DMA channels (2xS2MM, 2xMM2S) |
| all_stalls_dma | Core stalls and data transfers of all 4 DMA channels. All core stalls are grouped, no differentiation on the type of stall. |
| all_stalls_s2mm | Core stalls and data transfers of two S2MM channels |
| all_stalls_mm2s | Core stalls and data transfers of two MM2S channels |
| s2mm_channels | Data transfers and stalls of two S2MM channels |
| mm2s_channels | Data transfers and stalls of two MM2S channels |
| s2mm_channels_stall | Details of one S2MM channel. In AI Engine-ML v2 based devices only |
| mm2s_channels_stall | Details of one MM2S channel. In AI Engine-ML v2 based devices only |

The metrics for the interface tiles are described below.

| Metric Name | Description |
|---|---|
| input_ports | Data transfers of 4 stream inputs from the AI Engine Array |
| input_port_stalls | Data transfers and stalls of 2 inputs from the AI Engine Array |
| input_port_details | Details on one MM2S channel. For GMIOs only |
| output_port | Data transfers of 4 stream outputs to the AI Engine Array |
| output_port_stalls | Data transfers and stalls of 2 outputs to the AI Engine Array |
| output_port_details | Details on one S2MM channel. Includes Buffer Descriptors, tasks, starvation, back-pressure and lock stalls. For GMIOs only |
| input_output_ports | Data transfers of 4 inputs or outputs of the AI Engine Array |
| input_output_ports_stalls | Data transfers and stalls of 2 inputs or outputs of the AI Engine Array |

The metrics for the memory tiles are described below.

| Metric Name | Description |
|---|---|
| s2mm_channels | Buffer Descriptor and Task events for two S2MM channels |
| s2mm_channels_stalls | Details on one S2MM channel, adding lock stalls, back-pressure and stream starvation. |
| mm2s_channels | Buffer Descriptor and Task events for 2 MM2S channels |
| mm2s_channels_stalls | Details on one MM2S channel, adding lock stalls, back-pressure and stream starvation. |
| memory_conflicts1 | Memory conflicts for data memory banks 0-7 |
| memory_conflicts2 | Memory conflicts for data memory banks 8-15 |
XSDB Flow
The XSDB flow relies on the xsdb command. This command is typically used to program the device and debug bare-metal applications.
Connect your system to the hardware platform or device over JTAG, launch the xsdb command in a command shell, and run the following sequence of commands:
xsdb% connect
xsdb% ta
xsdb% ta 1
xsdb% source $::env(XILINX_VITIS)/scripts/vitis/util/aie_trace.tcl
xsdb% aietrace start -graphs mygraph -work-dir ./Work -link-summary $PROJECT/xsa.link_summary -base-address 0x900000000 -depth 0x800000 -tile-based-aie-tile-metrics "all:functions; {4,1}:{6,2}:all_stalls"
# Execute the PS host application (.elf) on Linux
## After the application completes processing.
xsdb% aietrace stop
where the source $::env(XILINX_VITIS)/scripts/vitis/util/aie_trace.tcl command loads the Tcl trace commands that set up the xsdb environment.
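The value passed to -tile-based-aie-tile-metrics in the command above combines an entry for all tiles with an entry for a range of tiles. As a sketch only, and assuming the entry layout can be read directly from that example (with {4,1}:{6,2} taken as the two corners of a tile range), the string could be assembled with a few shell helpers. The function names are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helpers that assemble the -tile-based-aie-tile-metrics string.
# all_tiles <metric>                      -> all:<metric>
# tile_range <c0> <r0> <c1> <r1> <metric> -> {c0,r0}:{c1,r1}:<metric>
all_tiles()  { printf 'all:%s' "$1"; }
tile_range() { printf '{%s,%s}:{%s,%s}:%s' "$1" "$2" "$3" "$4" "$5"; }

# Join any number of entries with "; " as in the example command.
join_metrics() {
    local out="" entry
    for entry in "$@"; do
        out="${out:+${out}; }${entry}"
    done
    printf '%s\n' "$out"
}

# Rebuild the string used in the aietrace start command above.
join_metrics "$(all_tiles functions)" "$(tile_range 4 1 6 2 all_stalls)"
```

Run as written, this prints all:functions; {4,1}:{6,2}:all_stalls, which matches the quoted argument in the example. Metric names are not validated here; they should be taken from the tables above.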
The run summary can then be opened with:
vitis -a aie_trace_profile.run_summary
For more details on this flow, see the chapters on Event Tracing in Hardware and XSDB flow in the AI Engine Tools and Flows User Guide (UG1076).
XRT Flow
This flow is configured through an xrt.ini file on the SD card. An example of such an xrt.ini file is shown below:
# Main switch to turn on aie trace
[Debug]
aie_trace = true
# Continuous trace knobs
[AIE_trace_settings]
reuse_buffer = true
periodic_offload = true
# Time to wait between trace reads
buffer_offload_interval_us = 100
# Total amount of device memory shared between trace streams
buffer_size = 16M
# granularity
graph_based_aie_tile_metrics = all:all:functions
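When several trace configurations are tried in a row, it can be convenient to generate this file from a script. The following is a minimal sketch that writes only the keys shown above. The helper name write_xrt_ini, its arguments, and the scratch output directory are assumptions made for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helper: writes an xrt.ini containing the keys shown above.
# Usage: write_xrt_ini <output_dir> [buffer_size] [metrics]
write_xrt_ini() {
    local dir="$1" size="${2:-16M}" metrics="${3:-all:all:functions}"
    mkdir -p "$dir" || return 1
    cat > "${dir}/xrt.ini" <<EOF
# Main switch to turn on aie trace
[Debug]
aie_trace = true

[AIE_trace_settings]
reuse_buffer = true
periodic_offload = true
buffer_offload_interval_us = 100
buffer_size = ${size}
graph_based_aie_tile_metrics = ${metrics}
EOF
}

# Example: write the default configuration to a scratch directory and show it.
out_dir="$(mktemp -d)"
write_xrt_ini "$out_dir"
cat "${out_dir}/xrt.ini"
```

Calling write_xrt_ini with a second and third argument overrides the buffer size and the metric selection. The generated file still has to be copied next to the host application on the SD card.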
For more details, see the chapters on Event Tracing in Hardware and XRT Flow in the AI Engine Tools and Flows User Guide (UG1076).