Stream Switch Buffering and Latency

Versal Adaptive SoC AIE-ML Architecture Manual (AM020)

Document ID: AM020
Release Date: 2026-02-18
Revision: 1.5 English

The AXI4-Stream data path between the programmable logic and the AIE-ML array includes multiple buffering points. Each buffering stage contributes to the overall data path latency and backpressure behavior. The typical buffering stages are as follows:

  • PL to AIE-ML asynchronous FIFO: 12-deep, used for clock domain crossing
  • Stream switch for each traversed tile or interface tile:
    • 4-deep FIFO, 2-cycle latency per port (master and slave)
    • Combined total of 8-deep buffering and 4-cycle latency per stream switch
  • Optional chainable FIFO in stream switches: 16-deep, can be inserted along the path to increase buffering
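The figures in the list above can be added up for a given route. The following is a minimal sketch, not a tool or API from this manual: the function name `path_buffering` and its parameters are hypothetical. It assumes that each traversed stream switch contributes one slave port and one master port, and it counts only stream switch latency, because the list does not state a latency for the clock domain crossing FIFO or the optional chainable FIFO.

```python
# Values taken from the list of typical buffering stages.
CDC_FIFO_DEPTH = 12        # PL to AIE-ML asynchronous FIFO
PORT_FIFO_DEPTH = 4        # per stream switch port
PORT_LATENCY_CYCLES = 2    # per stream switch port
PORTS_PER_SWITCH = 2       # one slave port and one master port
CHAINABLE_FIFO_DEPTH = 16  # optional FIFO inserted along the path


def path_buffering(num_switches, num_chained_fifos=0):
    """Return (total_depth, switch_latency_cycles) for a PL to AIE-ML path.

    num_switches: stream switches traversed (tiles and interface tiles).
    num_chained_fifos: optional 16-deep FIFOs inserted along the path.
    The latency covers stream switches only (assumption, see lead-in).
    """
    if num_switches < 0 or num_chained_fifos < 0:
        raise ValueError("counts must be non-negative")
    per_switch_depth = PORTS_PER_SWITCH * PORT_FIFO_DEPTH        # 8-deep
    per_switch_latency = PORTS_PER_SWITCH * PORT_LATENCY_CYCLES  # 4 cycles
    total_depth = (CDC_FIFO_DEPTH
                   + num_switches * per_switch_depth
                   + num_chained_fifos * CHAINABLE_FIFO_DEPTH)
    switch_latency = num_switches * per_switch_latency
    return total_depth, switch_latency
```

For example, under these assumptions a route through three stream switches with one chainable FIFO inserted gives `path_buffering(3, 1)`, that is 12 + 3 × 8 + 16 = 52 entries of buffering and 12 cycles of stream switch latency.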

PL to AIE-ML interfaces can be enabled or disabled through a configuration register. All interfaces are disabled at reset. When an interface is disabled, no data flows from the PL into the AIE-ML array. When an interface is enabled, data can flow into the 12-deep clock domain crossing FIFO even if stream routing is not yet configured. No data is lost, provided the PL master is AXI4-Stream compliant and holds its data while TREADY is deasserted.
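The enable and backpressure behavior described above can be illustrated with a small behavioral model. This is a sketch only: the class `PlToAieInterface`, its attributes, and the helper `send_compliant` are hypothetical names and do not correspond to any register or driver interface. It also assumes that a disabled interface keeps TREADY deasserted, which the text implies but does not state.

```python
from collections import deque

CDC_FIFO_DEPTH = 12  # from the text: 12-deep clock domain crossing FIFO


class PlToAieInterface:
    """Toy model of one PL to AIE-ML interface (hypothetical, not a real API)."""

    def __init__(self):
        self.enabled = False             # all interfaces are disabled at reset
        self.routing_configured = False  # stream routing not yet set up
        self.fifo = deque()

    @property
    def tready(self):
        # Assumption: a disabled interface never accepts data.
        return self.enabled and len(self.fifo) < CDC_FIFO_DEPTH

    def offer(self, word):
        """One transfer attempt; returns True if the handshake completed."""
        if self.tready:
            self.fifo.append(word)
            return True
        return False

    def drain_one(self):
        """Forward one word downstream; only possible once routing exists."""
        if self.routing_configured and self.fifo:
            return self.fifo.popleft()
        return None


def send_compliant(interface, words):
    """Model a compliant master: hold each word until it is accepted.

    Returns the words still held by the master when TREADY is deasserted.
    """
    pending = deque(words)
    while pending and interface.offer(pending[0]):
        pending.popleft()
    return list(pending)
```

In this model, offering 20 words to a freshly reset interface transfers none of them. After setting `enabled = True` with routing still unconfigured, the first 12 words are accepted into the FIFO and the remaining 8 stay with the master, so nothing is dropped. Once `routing_configured` is set and a word is drained, the master can transfer one more word.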

Figure 1. Typical Buffering Stages