The event trace build flow is as follows:
- Compile the graph with --event-trace and other appropriate flags. An example of the AI Engine compiler command for event tracing is as follows:

      v++ -c --mode aie --verbose --pl-freq=100 --workdir=./myWork \
        --event-trace-port=gmio --event-trace=runtime \
        --num-trace-streams=8 --xlopt=0 --include="./" \
        --include="./src" --include="./src/kernels" --include="./data" \
        ./src/graph.cpp

  For examples using the unified command line interface, see v++ Mode AI Engine in the Vitis Reference Guide (UG1702).
  Note:
  - The preceding example compiles the design with the --event-trace=runtime configuration. With this option, you can configure the type of events that the AI Engine captures at runtime.
  - The --event-trace-port=gmio option uses GMIO to capture event trace data through the AI Engine-to-NoC event trace pathway. The alternative is PLIO, which uses the AI Engine-to-PL pathway and consumes programmable logic resources to move trace data from the AI Engine to DDR.

    Recommended: AMD recommends using gmio as the event-trace-port configuration. This avoids using programmable logic resources and the timing errors that such usage can cause.
    - gmio is the AI Engine-to-NoC event trace pathway. GMIO is the default event trace port configuration.
    - plio is the AI Engine-to-PL event trace pathway. Event data is transferred to the PL, where it is stored in BRAM and URAM resources. This additional PL resource usage can induce timing errors in the hardware design.
  - The --num-trace-streams=8 option specifies the number of streams in the AI Engine array used to extract trace events. The default value is 16 trace streams, which is also the maximum valid value.
- Compile and link the design using the Vitis compiler.

  After compiling the AI Engine graph application, you must build the other elements of the system, as described in Building and Running the System in the Data Center Acceleration using Vitis (UG1700). With --event-trace enabled in the libadf.a file from the AI Engine compiler, the system hardware generated by the Vitis compiler includes the compiled ELF file for the PS application, the compiled ELF files for the AI Engine processors, and the XCLBIN file for the PL. These are the elements you need to run the system on hardware.
- After linking to create the device binary, run the Vitis compiler --package step to create the sd_card folder and files needed to boot the device, as described in Integrate System in the Embedded Design Development Using Vitis (UG1701). This step packages everything needed to build the BOOT.BIN file for the system. When packaging the boot files for the device, you must also specify the --package.defer_aie_run option to load the AI Engine application with the ELF file, but wait to run it until graph.run directs it, as described in Graph Execution Control in the AI Engine Kernel and Graph Programming Guide (UG1079).
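
The three steps above can be sketched as a dry-run shell script that only prints the commands it would run. The compile flags are taken from the example in the first step. The link and package lines are a minimal sketch, not a complete build: the platform file, kernels.xo, system.xsa, and system.xclbin are hypothetical placeholder names, and a real package step usually needs further platform-specific options (boot mode, root file system, kernel image). The lower bound of 1 on the trace stream count is an assumption; the text only states the maximum of 16.

```shell
#!/bin/sh
# Dry-run sketch: prints the three v++ commands instead of running them.
PLATFORM="${PLATFORM:-my_platform.xpfm}"  # hypothetical platform file
TRACE_PORT="${TRACE_PORT:-gmio}"          # gmio (recommended) or plio
TRACE_STREAMS="${TRACE_STREAMS:-8}"       # at most 16 per the text; minimum of 1 is assumed

# Reject trace options that fall outside the values described in the text.
check_trace_options() {
  case "$1" in
    gmio|plio) ;;
    *) echo "invalid trace port: $1" >&2; return 1 ;;
  esac
  case "$2" in
    ''|*[!0-9]*) echo "invalid stream count: $2" >&2; return 1 ;;
  esac
  if [ "$2" -ge 1 ] && [ "$2" -le 16 ]; then
    return 0
  fi
  echo "stream count out of range: $2" >&2
  return 1
}

# Step 1: compile the graph with event trace enabled.
aie_compile_cmd() {
  check_trace_options "$TRACE_PORT" "$TRACE_STREAMS" || return 1
  printf '%s\n' "v++ -c --mode aie --workdir=./myWork --event-trace=runtime --event-trace-port=$TRACE_PORT --num-trace-streams=$TRACE_STREAMS --include=./src ./src/graph.cpp"
}

# Step 2: link libadf.a with the PL kernels (kernels.xo is a placeholder).
link_cmd() {
  printf '%s\n' "v++ --link --target hw --platform $PLATFORM kernels.xo libadf.a -o system.xsa"
}

# Step 3: package, deferring the AI Engine start until graph.run is called.
package_cmd() {
  printf '%s\n' "v++ --package --target hw --platform $PLATFORM --package.defer_aie_run system.xsa libadf.a -o system.xclbin"
}

# Print the commands in build order; stop if the trace options are invalid.
aie_compile_cmd && link_cmd && package_cmd
```

Printing the commands first makes it easy to review the trace options before committing to a long hardware build. To switch to the PL pathway for comparison, set TRACE_PORT=plio in the environment.
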

When you compile the AI Engine design with the --event-trace=runtime option, you must define the type of data to trace at runtime. Each element of the AI Engine array can be traced:
- AI Engine Tile: AI Engine and Memory Module
- Interface Tile
- Memory Tile

AI Engine tile metrics:

| Metric Name | Description |
|---|---|
| functions | Basic timeline of function activity: events generated when kernel functions are invoked and when they return. |
| partial_stalls | Registers three types of core stalls: stream stalls (no data at input or back-pressure at output), cascade stalls, and lock stalls. |
| all_stalls | Same as partial_stalls, with memory_stalls (memory conflicts) added. |
| all_dma | Data transfers of all four memory DMA channels (2 x S2MM, 2 x MM2S). |
| all_stalls_dma | Core stalls and data transfers of all four DMA channels. All core stalls are grouped, with no differentiation by stall type. |
| all_stalls_s2mm | Core stalls and data transfers of the two S2MM channels. |
| all_stalls_mm2s | Core stalls and data transfers of the two MM2S channels. |
| s2mm_channels | Data transfers and stalls of the two S2MM channels. |
| mm2s_channels | Data transfers and stalls of the two MM2S channels. |
| s2mm_channels_stall | Details of one S2MM channel. AI Engine-ML Gen 2 based devices only. |
| mm2s_channels_stall | Details of one MM2S channel. AI Engine-ML Gen 2 based devices only. |

Interface tile metrics:

| Metric Name | Description |
|---|---|
| input_ports | Data transfers of four stream inputs from the AI Engine array. |
| input_port_stalls | Data transfers and stalls of two inputs from the AI Engine array. |
| input_port_details | Details on one MM2S channel. For GMIOs only. |
| output_port | Data transfers of four stream outputs to the AI Engine array. |
| output_port_stalls | Data transfers and stalls of two outputs to the AI Engine array. |
| output_port_details | Details on one S2MM channel. Includes buffer descriptors, tasks, starvation, back-pressure, and lock stalls. For GMIOs only. |
| input_output_ports | Data transfers of four inputs or outputs of the AI Engine array. |
| input_output_ports_stalls | Data transfers and stalls of two inputs or outputs of the AI Engine array. |

Memory tile metrics:

| Metric Name | Description |
|---|---|
| s2mm_channels | Buffer descriptor and task events for two S2MM channels. |
| s2mm_channels_stalls | Details on one S2MM channel, adding lock stalls, back-pressure, and stream starvation. |
| mm2s_channels | Buffer descriptor and task events for two MM2S channels. |
| mm2s_channels_stalls | Details on one MM2S channel, adding lock stalls, back-pressure, and stream starvation. |
| memory_conflicts1 | Memory conflicts for data memory banks 0-7. |
| memory_conflicts2 | Memory conflicts for data memory banks 8-15. |
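
As a rough illustration of how one metric set per tile type might be selected at runtime, the following shell sketch checks each chosen name against the tables above and then writes an xrt.ini file. The metric names come from the tables. The [Debug] and [AIE_trace_settings] sections, the three key names, and the all: value syntax are assumptions based on XRT's xrt.ini conventions and are not stated in this section, so verify them against the documentation for your XRT release.

```shell
#!/bin/sh
# Sketch: validate metric set names, then write an xrt.ini selecting them.
AIE_METRIC="${AIE_METRIC:-functions}"
INTF_METRIC="${INTF_METRIC:-input_ports}"
MEM_METRIC="${MEM_METRIC:-s2mm_channels}"

# Metric set names copied from the three tables above.
AIE_SETS="functions partial_stalls all_stalls all_dma all_stalls_dma all_stalls_s2mm all_stalls_mm2s s2mm_channels mm2s_channels s2mm_channels_stall mm2s_channels_stall"
INTF_SETS="input_ports input_port_stalls input_port_details output_port output_port_stalls output_port_details input_output_ports input_output_ports_stalls"
MEM_SETS="s2mm_channels s2mm_channels_stalls mm2s_channels mm2s_channels_stalls memory_conflicts1 memory_conflicts2"

# Succeeds when the name in $1 appears in the space-separated list in $2.
in_list() {
  case " $2 " in
    *" $1 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Writes the settings to the file named in $1; key names are assumed, see above.
write_xrt_ini() {
  in_list "$AIE_METRIC" "$AIE_SETS" || { echo "unknown AI Engine tile metric: $AIE_METRIC" >&2; return 1; }
  in_list "$INTF_METRIC" "$INTF_SETS" || { echo "unknown interface tile metric: $INTF_METRIC" >&2; return 1; }
  in_list "$MEM_METRIC" "$MEM_SETS" || { echo "unknown memory tile metric: $MEM_METRIC" >&2; return 1; }
  cat > "$1" <<EOF
[Debug]
aie_trace = true

[AIE_trace_settings]
graph_based_aie_tile_metrics = all:all:$AIE_METRIC
tile_based_interface_tile_metrics = all:$INTF_METRIC
tile_based_memory_tile_metrics = all:$MEM_METRIC
EOF
}

write_xrt_ini ./xrt.ini
```

Checking the names before writing the file catches typos such as a missing trailing s, which matters here because the AI Engine tile and memory tile tables use slightly different spellings for similar metric sets.
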