- Compile the graph with --event-trace and other appropriate flags. An example of the AI Engine compiler command for event tracing is as follows:

      v++ -c --mode aie --verbose --pl-freq=100 --workdir=./myWork \
        --event-trace-port=gmio --event-trace=runtime \
        --num-trace-streams=8 --xlopt=0 --include="./" \
        --include="./src" --include="./src/kernels" --include="./data" \
        ./src/graph.cpp

  For examples using the unified command line interface, see v++ Mode AI Engine in the Vitis Reference Guide (UG1702).

  Note:
  - The preceding example compiles the design with the --event-trace=runtime configuration. With this option, you can configure the type of events that the AI Engine captures at runtime.
  - The --event-trace-port=gmio option uses GMIO to capture event trace data over the AI Engine-to-NoC event trace pathway. The alternative is PLIO, which uses the AI Engine-to-PL pathway and consumes programmable logic resources to move trace data from the AI Engine to DDR.

    Recommended: AMD recommends using the gmio option as the event-trace-port configuration. Using the gmio option avoids consuming PL resources and the timing errors that this PL resource usage can cause.
    - gmio is the AI Engine-to-NoC event trace pathway. GMIO is the default event trace port configuration.
    - plio is the AI Engine-to-PL event trace pathway. Event data transfers to the PL, which stores it in block RAM and URAM resources. This additional PL resource usage can induce timing errors in the hardware design.
  - The --num-trace-streams=8 option specifies the number of streams that the AI Engine array uses to extract trace events. The default value is 16 trace streams, which is also the maximum valid value.
- Compile and link the design using the Vitis compiler.
  After compiling the AI Engine graph application, you must build the other elements of the system. See Building and Running the System in the Data Center Acceleration using Vitis (UG1700) for more information. With --event-trace enabled in the libadf.a file from the AI Engine compiler, the system hardware generated by the Vitis compiler includes the following elements:
  - Compiled ELF file for the PS application
  - Compiled ELF files for the AI Engine processors
  - XCLBIN file for the PL

  These are the elements you need to run the system on hardware.
- After linking to create the device binary, run the Vitis compiler --package step to create the sd_card folder and files needed to boot the device. See Integrate System in the Embedded Design Development Using Vitis (UG1701) for more information.

  This step packages everything you need to build the BOOT.BIN file for the system. When packaging the boot files for the device, you must specify the --package.defer_aie_run option, which loads the AI Engine application with the ELF file but defers running it until graph.run directs it. See Graph Execution Control in the AI Engine Kernel and Graph Programming Guide (UG1079) for more information.
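
The trace-related options from the compile and package steps above could be collected in a small wrapper script, so that the port choice and stream count are checked before a long build starts. The following is only a sketch: the variable names (TRACE_PORT, NUM_TRACE_STREAMS, AIE_TRACE_FLAGS, PACKAGE_TRACE_FLAGS) are hypothetical and not part of the Vitis tools. Only the flags themselves come from the steps above. The script prints the flag sets for review and does not invoke v++.

```shell
#!/bin/sh
# Sketch only: collects the event-trace options described above and prints them.
# Variable names are illustrative assumptions; nothing here runs v++.
set -eu

TRACE_PORT="${TRACE_PORT:-gmio}"               # gmio (recommended, default) or plio
NUM_TRACE_STREAMS="${NUM_TRACE_STREAMS:-16}"   # 16 is both the default and the maximum

# Reject a trace port other than the two documented pathways.
case "$TRACE_PORT" in
  gmio|plio) ;;
  *) echo "error: trace port must be gmio or plio" >&2; exit 1 ;;
esac

# Reject a stream count that is not a plain number or that exceeds the maximum.
case "$NUM_TRACE_STREAMS" in
  ''|*[!0-9]*) echo "error: stream count must be a number" >&2; exit 1 ;;
esac
if [ "$NUM_TRACE_STREAMS" -gt 16 ]; then
  echo "error: at most 16 trace streams are valid" >&2
  exit 1
fi

# Flags to append to the AI Engine compile command.
AIE_TRACE_FLAGS="--event-trace=runtime --event-trace-port=$TRACE_PORT --num-trace-streams=$NUM_TRACE_STREAMS"

# Flag to append to the v++ --package command so the graph waits for graph.run.
PACKAGE_TRACE_FLAGS="--package.defer_aie_run"

echo "compile: $AIE_TRACE_FLAGS"
echo "package: $PACKAGE_TRACE_FLAGS"
```

Setting TRACE_PORT=plio in the environment switches the pathway. Any other value, or a stream count above 16, stops the script before the flags are printed.
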

When you compile the AI Engine design with the --event-trace=runtime option, you must define the type of data to trace at runtime. You can trace each element of the AI Engine array:
- AI Engine Tile: AI Engine and Memory Module
- Interface Tile
- Memory Tile

AI Engine tile metrics:

| Metric Name | Description |
|---|---|
| functions | Basic timeline of function activity: events generated when kernel functions are invoked and when they return |
| partial_stalls | The system registers three types of core stalls: stream stalls (no data at input or back-pressure at output), cascade stalls, and lock stalls. |
| all_stalls | Same as partial_stalls with memory_stalls (memory conflict) added. |
| all_dma | Data transfers of all four Memory DMA channels (2xS2MM, 2xMM2S) |
| all_stalls_dma | Core stalls and data transfers of all four DMA channels. All core stalls are grouped, with no differentiation by stall type. |
| all_stalls_s2mm | Core stalls and data transfers of two S2MM channels |
| all_stalls_mm2s | Core stalls and data transfers of two MM2S channels |
| s2mm_channels | Data transfers and stalls of two S2MM channels |
| mm2s_channels | Data transfers and stalls of two MM2S channels |
| s2mm_channels_stall | Details of one S2MM channel. AI Engine-ML v2-based devices only. |
| mm2s_channels_stall | Details of one MM2S channel. AI Engine-ML v2-based devices only. |

Interface tile metrics:

| Metric Name | Description |
|---|---|
| input_ports | Data transfers of four stream inputs from the AI Engine Array |
| input_port_stalls | Data transfers and stalls of two inputs from the AI Engine Array |
| input_port_details | Details on one MM2S channel. For GMIOs only |
| output_port | Data transfers of four stream outputs to the AI Engine Array |
| output_port_stalls | Data transfers and stalls of two outputs to the AI Engine Array |
| output_port_details | Details on one S2MM channel. Includes Buffer Descriptors, tasks, starvation, back-pressure and lock stalls. For GMIOs only |
| input_output_ports | Data transfers of four inputs or outputs of the AI Engine Array |
| input_output_ports_stalls | Data transfers and stalls of two inputs or outputs of the AI Engine Array |

Memory tile metrics:

| Metric Name | Description |
|---|---|
| s2mm_channels | Buffer Descriptor and Task events for two S2MM channels |
| s2mm_channels_stalls | Details on one S2MM channel, adding lock stalls, back-pressure and stream starvation. |
| mm2s_channels | Buffer Descriptor and Task events for two MM2S channels |
| mm2s_channels_stalls | Details on one MM2S channel, adding lock stalls, back-pressure and stream starvation. |
| memory_conflicts1 | Memory conflicts for data memory banks 0-7 |
| memory_conflicts2 | Memory conflicts for data memory banks 8-15 |
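
As a rough sketch of how metric sets such as those in the tables above might be selected at runtime, the following script writes an xrt.ini file in the current directory. The section and key names ([Debug] aie_trace, [AIE_trace_settings], and the tile_based_*_metrics keys) and the all:<metric set> value syntax are assumptions recalled from the XRT xrt.ini documentation, not taken from this section, so verify them against the XRT release you use. The metric set names (all_stalls, input_ports, s2mm_channels) are taken from the tables, and the shell variable names are illustrative only.

```shell
#!/bin/sh
# Sketch only: writes an xrt.ini that requests one metric set per tile type.
# Key names and value syntax are assumptions; check the XRT documentation.
set -eu

AIE_TILE_METRIC="all_stalls"          # AI Engine tile metric set
INTERFACE_TILE_METRIC="input_ports"   # interface tile metric set
MEMORY_TILE_METRIC="s2mm_channels"    # memory tile metric set

# "all" is assumed to apply the metric set to every tile of that type.
cat > xrt.ini <<EOF
[Debug]
aie_trace = true

[AIE_trace_settings]
tile_based_aie_tile_metrics = all:${AIE_TILE_METRIC}
tile_based_interface_tile_metrics = all:${INTERFACE_TILE_METRIC}
tile_based_memory_tile_metrics = all:${MEMORY_TILE_METRIC}
EOF

# Show the generated file for review.
cat xrt.ini
```

Keeping the metric set names in variables makes it easy to switch, for example, from all_stalls to functions between runs without recompiling the graph, which is the purpose of the runtime configuration.
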
- AI Engine
- AI Engine Memory
- Interface Tile
- Memory Tile (AI Engine-ML and AI Engine-ML v2)