Stage 4: AI Engine Event Trace and Analysis - 2022.1 English

Versal ACAP AI Engine Programming Environment User Guide (UG1076)

The goal of this stage is to determine which AI Engine kernel or graph construct is causing a design performance drop, a stall, or a deadlock.

The following figure shows the tasks and techniques available in this stage.

Figure 1. AI Engine Event Trace and Analysis

The sections below list the different debug techniques available in this design stage.

Running and Analyzing Runtime Trace Data Using AI Engine Event Trace Flow

The AI Engine event trace feature offers a comprehensive view of design trace data when running the design in hardware. This is a three-part process where you:
  1. Compile the design with event trace enabled and other event trace related options.
  2. Run the design in hardware and collect event trace data.
  3. Open the trace summary file in Vitis™ analyzer, which provides a waveform view of the trace data collected above.
Event trace data lets you identify the AI Engine kernel that contributes to a stall, deadlock, or throughput drop. It also lets you view the events leading up to a stall or throughput drop, along with other detailed trace information. Details on the event trace feature can be found in Event Tracing in Hardware.
Figure 2. Event Tracing

To resolve specific issues encountered when running event trace in hardware, see Troubleshooting Event Trace in Hardware. The feature is limited by the event trace counters, streams, DDR memory, and design resources available for event trace in the device.

Profiling Intra-Kernel Performance

You can also profile code blocks inside a specific kernel using the aie::tile::cycles() API, which returns the cycle counter of the AI Engine tile.

To get this value in hardware, write it to memory or to an output stream. An example of writing to an output stream is shown below. The host application can then read back this stream to examine the profile data.

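// get the current tile
aie::tile tile=aie::tile::current();
unsigned long long time=tile.cycles(); // cycle counter of the AI Engine tile before the loop
writeincr(out,time); // assumption: out is an output stream port of the kernel that can carry the counter value
{ // loop to be profiled
  // code under measurement
}
time=tile.cycles(); // cycle counter of the AI Engine tile after the loop
writeincr(out,time);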
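
On the host side, the profile data read back from the stream still has to be turned into elapsed cycle counts. The following is a minimal sketch of that step, not part of the tool flow. It assumes the kernel wrote the 64-bit cycle counter once before and once after each profiled region, so the host buffer holds (start, end) pairs. The helper names elapsed_cycles and cycles_to_us are hypothetical, and the clock frequency is a parameter you must supply for your device.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical host-side helper. Assumption: the buffer read back from the
// output stream holds (start, end) cycle-counter pairs, one pair per
// profiled region.
std::vector<std::uint64_t> elapsed_cycles(const std::vector<std::uint64_t>& stamps) {
    std::vector<std::uint64_t> deltas;
    deltas.reserve(stamps.size() / 2);
    // Walk complete pairs only; a trailing unpaired stamp is ignored.
    for (std::size_t i = 0; i + 1 < stamps.size(); i += 2) {
        // Unsigned subtraction stays correct if the counter wrapped once
        // between the two reads.
        deltas.push_back(stamps[i + 1] - stamps[i]);
    }
    return deltas;
}

// Convert a cycle count to microseconds. clock_hz is an assumed AI Engine
// clock frequency supplied by the caller; it is not read from the device.
double cycles_to_us(std::uint64_t cycles, double clock_hz) {
    assert(clock_hz > 0.0);
    return static_cast<double>(cycles) * 1e6 / clock_hz;
}
```

For example, a buffer containing the stamps 100, 350, 1000, 1800 yields elapsed counts of 250 and 800 cycles. Keeping the subtraction on the host keeps the extra work inside the kernel to two counter reads and two stream writes per region.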

This is a very intrusive method of profiling kernel code, because the added counter reads and stream writes become part of the kernel being measured. Xilinx recommends that you use this method when simulating the graph with the AI Engine simulator. Trace and profile data from simulation can also be used for this purpose.

For details on the aie::tile::cycles() API, see AI Engine Kernel Coding Best Practices Guide (UG1079).

Vitis IDE Debugger

You can also use the Vitis IDE debugger to debug kernel source code. Details on the Vitis Debugger can be found in Debugging the AI Engine Application.

Next Stage: After you determine the cause of the throughput drop and fix the issue, proceed to Stage 1 to rerun the design.