Versal Adaptive SoC AIE-ML Architecture Manual (AM020)

1.3 English

Each AIE-ML tile has two trace output streams: one from the AIE-ML and one from the memory module. Both streams are connected to the tile stream switch. Each AIE-ML module and each memory module in an AIE-ML tile contains a trace unit, as does the AIE-ML programmable logic (PL) module in an AIE-ML PL interface tile (see types of array interface tiles). The trace units can operate in the following modes:

  • AIE-ML modes
    • Event-time
    • Event-PC
    • Execution-trace
  • AIE-ML memory module mode
    • Event-time
  • AIE-ML PL module mode
    • Event-time

The trace unit outputs the trace over the AXI4-Stream as an AIE-ML packet-switched stream packet. Each packet is 8 x 32 bits: one header word and seven data words. The array AXI4-Stream switches use the information in the packet header to route the packet to any reachable AIE-ML destination, including AIE-ML local data memory through the AIE-ML tile DMA, external DDR memory through the AIE-ML array interface DMA, and block RAM or UltraRAM through the AIE-ML to PL AXI4-Stream.
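As an illustration of the packet framing described above, the following minimal Python sketch groups a captured stream of 32-bit words into 8-word trace packets. It assumes that the capture contains only back-to-back trace packets and that the header is the first word of each packet; neither assumption is stated in this section. The names `TracePacket` and `split_trace_packets` are hypothetical, and the header fields are left undecoded because their layout is not given here.

```python
from typing import List, NamedTuple

WORDS_PER_PACKET = 8  # one header word plus seven data words, per the text


class TracePacket(NamedTuple):
    header: int       # raw 32-bit header word (fields are not decoded here)
    data: List[int]   # seven 32-bit trace data words


def split_trace_packets(words: List[int]) -> List[TracePacket]:
    """Group a captured word stream into 8-word trace packets.

    Assumption: the header is the first word of each packet and the
    capture holds only complete, back-to-back trace packets.
    """
    if len(words) % WORDS_PER_PACKET != 0:
        raise ValueError("capture length is not a multiple of 8 words")
    packets = []
    for start in range(0, len(words), WORDS_PER_PACKET):
        chunk = words[start:start + WORDS_PER_PACKET]
        for word in chunk:
            if not 0 <= word <= 0xFFFFFFFF:
                raise ValueError("word does not fit in 32 bits")
        packets.append(TracePacket(header=chunk[0], data=chunk[1:]))
    return packets
```

A host-side tool could apply such a function to a buffer read back from DDR memory before decoding the trace frames in each packet's data words.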

The event-time mode tracks up to eight independent numbered events on a per-cycle basis. A trace frame is created to record state changes in the tracked events. The frames are collected in an output buffer into an AIE-ML packet-switched stream packet. Multiple frames can be packed into one 32-bit stream word, but a frame cannot cross a 32-bit boundary; filler frames are used for 32-bit alignment.
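The behavior of event-time mode can be modeled with a short Python sketch. It is a behavioral model only and does not reproduce the hardware frame encoding, which is not specified in this section. The function names, the assumption that all events start deasserted, and the frame widths used in the packing example are hypothetical; filler is modeled simply as the unused bits remaining in a word.

```python
from typing import Iterable, List, Tuple

NUM_EVENTS = 8    # up to eight tracked events, per the text
WORD_BITS = 32    # frames may not cross a 32-bit word boundary


def event_time_changes(samples: Iterable[int]) -> List[Tuple[int, int]]:
    """Return (cycle, event_bits) for each cycle in which the tracked
    events change state. Assumes all events are deasserted initially."""
    records = []
    previous = 0
    for cycle, bits in enumerate(samples):
        bits &= (1 << NUM_EVENTS) - 1
        if bits != previous:
            records.append((cycle, bits))
            previous = bits
    return records


def pack_frames(widths: List[int]) -> List[List[Tuple[str, int]]]:
    """Lay out frames of the given bit widths into 32-bit words.

    A frame that does not fit in the remaining bits of the current word
    starts a new word, and the remainder is marked as filler.
    """
    words: List[List[Tuple[str, int]]] = []
    current: List[Tuple[str, int]] = []
    used = 0
    for width in widths:
        if not 0 < width <= WORD_BITS:
            raise ValueError("frame width must be between 1 and 32 bits")
        if used + width > WORD_BITS:
            current.append(("filler", WORD_BITS - used))
            words.append(current)
            current, used = [], 0
        current.append(("frame", width))
        used += width
        if used == WORD_BITS:
            words.append(current)
            current, used = [], 0
    if current:
        current.append(("filler", WORD_BITS - used))
        words.append(current)
    return words
```

For example, with hypothetical frame widths of 8, 16, 16, and 32 bits, the second 16-bit frame does not fit in the first word, so the first word is padded with 8 filler bits and the frame starts the next word.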

In the event-PC mode, a trace frame is created in each cycle in which one or more of the eight watched events are asserted. The trace frame records the current program counter (PC) value of the AIE-ML together with the current value of the eight watched events. The frames are collected in an output buffer into an AIE-ML packet-switched stream packet.
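The event-PC rule can be sketched in the same way. The Python model below emits one record for each cycle in which at least one watched event is asserted, pairing the PC with the 8-bit event vector. The function name `event_pc_frames` and the input format (one `(pc, event_bits)` pair per cycle) are assumptions made for illustration; the actual frame encoding is not described in this section.

```python
from typing import Iterable, List, Tuple


def event_pc_frames(samples: Iterable[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Return (pc, event_bits) for each cycle with any watched event set.

    `samples` holds one (pc, event_bits) pair per cycle; only the low
    eight bits of event_bits are treated as watched events.
    """
    frames = []
    for pc, bits in samples:
        bits &= 0xFF
        if bits:  # a frame is created only when some event is asserted
            frames.append((pc, bits))
    return frames
```

Unlike the event-time model, cycles with no asserted events produce no frame at all, so the output carries PC values rather than a record of every state change.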

In execution-trace mode, the trace unit in the AIE-ML sends, in real time over the AXI4-Stream, the minimum set of information that an offline debugger needs to reconstruct the program execution flow. This assumes that the offline debugger has access to the ELF. The information includes:

  • Conditional and unconditional direct branches
  • All indirect branches
  • Zero-overhead-loop LC

The AIE-ML generates the packet-based execution trace, which can be sent over the 32-bit wide execution trace interface. The following figure shows the logical view of trace hardware in the AIE-ML tile. The two trace streams out of the tile are connected internally to the event logic, configuration registers, broadcast events, and trace buffers.

Note: The different operating modes between the two modules are not shown.
Figure 1. Logical View of AIE-ML Trace Hardware

To control the trace stream for an event trace, the 32-bit trace_control0 and trace_control1 registers are used to start and stop the trace. The trace_event0 and trace_event1 registers are used to program the internal event numbers to be added to the trace.