Event Trace Build Flow - 2025.2 English - UG1076

AI Engine Tools and Flows User Guide (UG1076)

Document ID: UG1076
Release Date: 2025-11-20
Version: 2025.2 English

The event trace build flow is as follows:
  1. Compile the graph with --event-trace and other appropriate flags.

    An example of the AI Engine compiler command for event tracing is as follows:

    v++ -c --mode aie --verbose --pl-freq=100 --workdir=./myWork \
    --event-trace-port=gmio --event-trace=runtime \
    --num-trace-streams=8 --xlopt=0 --include="./" \
    --include="./src" --include="./src/kernels" --include="./data" \
    ./src/graph.cpp

    For examples using the unified command line interface, see v++ Mode AI Engine in the Vitis Reference Guide (UG1702).

    Note:
    • The preceding example compiles the design with the --event-trace=runtime configuration. With --event-trace=runtime, you can configure at runtime the types of events that the AI Engine captures.
    • The --event-trace-port=gmio option captures event trace data through GMIO, using the AI Engine-to-NoC event trace pathway. The alternative is PLIO, which uses the AI Engine-to-PL pathway and consumes programmable logic resources to move trace data from the AI Engine to DDR.
      • gmio is the AI Engine-to-NoC event trace pathway. GMIO is the default event trace port configuration.
      • plio is the AI Engine-to-PL event trace pathway. Event data transfers to the PL, which stores it in block RAM and URAM resources. These additional PL resources can introduce timing violations in the hardware design.
      Note: Use the compiler option --graph-iterator-event to delay event trace data capture based on the graph iteration. For more information, see Table 1 in XSDB Flow, or Table 1 in XRT Flow.
    • The --num-trace-streams=8 option sets the number of streams that the AI Engine array uses to extract trace events. The default is 16 trace streams, which is also the maximum valid value.
  2. Compile and link the design using the Vitis compiler.

    After compiling the AI Engine graph application, you must build the other elements of the system. See Building and Running the System in the Data Center Acceleration using Vitis (UG1700) for more information. With --event-trace enabled in the libadf.a file from the AI Engine compiler, the system hardware generated by the Vitis compiler includes the following elements:

    • Compiled ELF file for the PS application
    • Compiled ELF files for the AI Engine processors
    • XCLBIN file for the PL

    These are the elements you need to run the system on hardware.

  3. After linking to create the device binary, run the Vitis compiler --package step to create the sd_card folder and files needed to boot the device. See Integrate System in the Embedded Design Development Using Vitis (UG1701) for more information.

    This step packages everything you need to build the BOOT.BIN file for the system. When packaging the boot files for the device, you must specify the --package.defer_aie_run option. This option loads the AI Engine application with the ELF file but defers running it until graph.run directs it. See Graph Execution Control in the AI Engine Kernel and Graph Programming Guide (UG1079) for more information.
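
The three steps above can be sketched as a shell script that assembles each v++ command line and prints it instead of running it. This is a dry-run illustration under stated assumptions, not a verified build script: the platform path, kernels.xo, design.xsa, the ./package directory, and the helper function names are hypothetical placeholders, and the lower bound of 1 trace stream is an assumption (the text above only states the maximum of 16). The event trace options and --package.defer_aie_run come from the steps above; -l, -p, -t, --platform, -o, and --package.out_dir are general v++ options assumed to apply here.

```shell
#!/bin/sh
# Dry-run sketch: builds the command lines for the three steps and prints them.
# PLATFORM, kernels.xo, design.xsa, and ./package are hypothetical placeholders.

PLATFORM="${PLATFORM:-/path/to/platform.xpfm}"

# Emit the event trace flags after checking them against the text above:
# the port is gmio or plio, and at most 16 trace streams are valid.
# The lower bound of 1 is an assumption.
trace_flags() {
    port="$1"
    streams="$2"
    case "$port" in
        gmio|plio) ;;
        *) echo "error: trace port must be gmio or plio" >&2; return 1 ;;
    esac
    case "$streams" in
        ''|*[!0-9]*) echo "error: stream count must be a number" >&2; return 1 ;;
    esac
    if [ "$streams" -lt 1 ] || [ "$streams" -gt 16 ]; then
        echo "error: stream count must be between 1 and 16" >&2
        return 1
    fi
    echo "--event-trace=runtime --event-trace-port=$port --num-trace-streams=$streams"
}

# Step 1: compile the graph with event trace enabled.
compile_cmd() {
    flags=$(trace_flags "$1" "$2") || return 1
    echo "v++ -c --mode aie --workdir=./myWork $flags ./src/graph.cpp"
}

# Step 2: link libadf.a with the PL kernels.
link_cmd() {
    echo "v++ -l -t hw --platform $PLATFORM kernels.xo libadf.a -o design.xsa"
}

# Step 3: package, deferring the AI Engine run until graph.run directs it.
package_cmd() {
    echo "v++ -p -t hw --platform $PLATFORM --package.defer_aie_run --package.out_dir ./package design.xsa libadf.a"
}

compile_cmd gmio 8
link_cmd
package_cmd
```

Printing the commands first makes it easy to review the trace configuration before committing to a long build. A real package step for an SD card image also needs boot-related options, which are omitted here.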

When compiling the AI Engine design with the option --event-trace=runtime, you must define the type of data to be traced at runtime. You can trace each element of the AI Engine Array:

  • AI Engine Tile: AI Engine and Memory Module
  • Interface Tile
  • Memory Tile
Table 1. AI Engine Metrics
Metric Name           Description
functions             Basic timeline of function activity: events generated when kernel functions are invoked and when they return.
partial_stalls        The system registers three types of core stalls: stream stalls (no data at input or back-pressure at output), cascade stalls, and lock stalls.
all_stalls            Same as partial_stalls, with memory_stalls (memory conflicts) added.
all_dma               Data transfers of all four memory DMA channels (2xS2MM, 2xMM2S).
all_stalls_dma        Core stalls and data transfers of all four DMA channels. All core stalls are grouped, with no differentiation by stall type.
all_stalls_s2mm       Core stalls and data transfers of the two S2MM channels. [1]
all_stalls_mm2s       Core stalls and data transfers of the two MM2S channels. [1]
s2mm_channels         Data transfers and stalls of the two S2MM channels.
mm2s_channels         Data transfers and stalls of the two MM2S channels.
s2mm_channels_stall   Details of one S2MM channel. [2] AI Engine-ML v2-based devices only.
mm2s_channels_stall   Details of one MM2S channel. [2] AI Engine-ML v2-based devices only.
  [1] In AI Engine-based devices, the stall events are combined into a single group stall event.
  [2] Includes buffer descriptors, tasks, starvation, back-pressure, and lock stalls.
Table 2. Interface Tiles
Metric Name                 Description
input_ports                 Data transfers of 4 stream inputs from the AI Engine Array.
input_port_stalls           Data transfers and stalls of 2 inputs from the AI Engine Array.
input_port_details          Details on one MM2S channel. For GMIOs only.
output_port                 Data transfers of 4 stream outputs to the AI Engine Array.
output_port_stalls          Data transfers and stalls of 2 outputs to the AI Engine Array.
output_port_details         Details on one S2MM channel. Includes buffer descriptors, tasks, starvation, back-pressure, and lock stalls. For GMIOs only.
input_output_ports          Data transfers of 4 inputs or outputs of the AI Engine Array.
input_output_ports_stalls   Data transfers and stalls of 2 inputs or outputs of the AI Engine Array.
Table 3. Memory Tiles (AI Engine-ML and AI Engine-ML v2)
Metric Name            Description
s2mm_channels          Buffer descriptor and task events for two S2MM channels.
s2mm_channels_stalls   Details on one S2MM channel, adding lock stalls, back-pressure, and stream starvation.
mm2s_channels          Buffer descriptor and task events for two MM2S channels.
mm2s_channels_stalls   Details on one MM2S channel, adding lock stalls, back-pressure, and stream starvation.
memory_conflicts1      Memory conflicts for data memory banks 0-7.
memory_conflicts2      Memory conflicts for data memory banks 8-15.
Note: Each of the following array elements supports only one profile metric and one trace metric:
  • AI Engine
  • AI Engine Memory
  • Interface Tile
  • Memory Tile (AI Engine-ML and AI Engine-ML v2)
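
Because each element accepts a single trace metric, it can help to check a planned selection before a run. The following shell sketch validates metric names against Tables 1 to 3 and rejects a selection that names the same element type twice. The function names and the tile_type=metric pair syntax are hypothetical, invented for this illustration; they are not a tool or configuration format. As a simplification, the sketch treats all of Table 1 as one group called aie, and it does not check device restrictions such as the AI Engine-ML v2-only metrics.

```shell
#!/bin/sh
# Metric names are copied from Tables 1-3; function names and the
# "tile_type=metric" syntax are hypothetical, for illustration only.

# Print the metric names listed for a tile type; fail on an unknown type.
metrics_for() {
    case "$1" in
        aie) echo "functions partial_stalls all_stalls all_dma all_stalls_dma all_stalls_s2mm all_stalls_mm2s s2mm_channels mm2s_channels s2mm_channels_stall mm2s_channels_stall" ;;
        interface_tile) echo "input_ports input_port_stalls input_port_details output_port output_port_stalls output_port_details input_output_ports input_output_ports_stalls" ;;
        memory_tile) echo "s2mm_channels s2mm_channels_stalls mm2s_channels mm2s_channels_stalls memory_conflicts1 memory_conflicts2" ;;
        *) return 1 ;;
    esac
}

# Succeed only if metric $2 is listed for tile type $1.
is_valid_metric() {
    list=$(metrics_for "$1") || return 1
    for m in $list; do
        [ "$m" = "$2" ] && return 0
    done
    return 1
}

# Check "tile_type=metric" pairs: every metric must be listed for its tile
# type, and each tile type may appear only once (one trace metric each).
check_selection() {
    seen=""
    for pair in "$@"; do
        tile=${pair%%=*}
        metric=${pair#*=}
        is_valid_metric "$tile" "$metric" || { echo "invalid: $pair" >&2; return 1; }
        case " $seen " in
            *" $tile "*) echo "duplicate trace metric for $tile" >&2; return 1 ;;
        esac
        seen="$seen $tile"
    done
    return 0
}

check_selection aie=functions interface_tile=input_ports memory_tile=mm2s_channels && echo "selection ok"
```

A call such as check_selection aie=functions aie=all_dma fails, which mirrors the one-trace-metric-per-element rule in the note above.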