The AXI4-Stream data path between the programmable logic and the AI Engine array includes multiple buffering points. Each buffering stage contributes to the overall data path latency and backpressure behavior. The typical buffering stages are as follows:
- PL to AI Engine asynchronous FIFO: 12-deep, used for clock domain crossing
- Stream switch for each traversed tile or interface tile:
  - 4-deep FIFO, 2-cycle latency per port (master and slave)
  - Combined total of 8-deep buffering and 4-cycle latency per stream switch
- Optional chainable FIFO in stream switches: 16-deep, can be inserted along the path to increase buffering
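
The figures in the list above can be added up to estimate how much buffering a given route provides. The following Python snippet is a minimal sketch of that arithmetic, not a vendor tool: the function name `path_budget`, the `PathBudget` type, and the parameter names are hypothetical. It assumes one slave port and one master port per traversed stream switch, and it reports latency for the stream switch ports only, because the list does not state a latency for the clock domain crossing FIFO or the chainable FIFO.

```python
from dataclasses import dataclass

# Depths and latencies taken from the list of buffering stages.
CDC_FIFO_DEPTH = 12          # PL to AI Engine asynchronous FIFO
SWITCH_PORT_FIFO_DEPTH = 4   # FIFO depth per stream switch port
SWITCH_PORT_LATENCY = 2      # cycles per stream switch port
PORTS_PER_SWITCH = 2         # assumption: one slave and one master port per switch
CHAINABLE_FIFO_DEPTH = 16    # optional FIFO inserted along the path


@dataclass
class PathBudget:
    buffer_depth: int    # total FIFO entries along the path
    switch_latency: int  # cycles contributed by stream switch ports only


def path_budget(num_switches: int, num_chained_fifos: int = 0) -> PathBudget:
    """Estimate buffering and stream switch latency for one PL to AI Engine route."""
    if num_switches < 0 or num_chained_fifos < 0:
        raise ValueError("counts must be non-negative")
    depth = (
        CDC_FIFO_DEPTH
        + num_switches * PORTS_PER_SWITCH * SWITCH_PORT_FIFO_DEPTH
        + num_chained_fifos * CHAINABLE_FIFO_DEPTH
    )
    latency = num_switches * PORTS_PER_SWITCH * SWITCH_PORT_LATENCY
    return PathBudget(buffer_depth=depth, switch_latency=latency)
```

Under these assumptions, `path_budget(3, 1)` describes a route through three stream switches with one chainable FIFO inserted: 12 + 3 × 8 + 16 = 52 entries of buffering and 3 × 4 = 12 cycles of stream switch latency.
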
PL to AI Engine interfaces can be enabled or disabled through a configuration register. All interfaces are disabled at reset. When an interface is disabled, no data flows from the PL into the AI Engine array. When an interface is enabled, data can flow into the 12-deep clock domain crossing FIFO even if stream routing is not yet configured. No data is lost if the PL master is AXI4-Stream compliant and stops sending data when TREADY is deasserted.
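
The enable and backpressure behavior described above can be illustrated with a small behavioral model. This is a sketch under stated assumptions, not a description of the hardware implementation: the class `PlInterfaceModel` and the helper `send_compliant` are hypothetical names, a disabled interface is modeled as holding TREADY low, and the unconfigured stream routing is modeled as a FIFO that never drains.

```python
from collections import deque


class PlInterfaceModel:
    """Behavioral sketch of one PL to AI Engine interface (hypothetical model)."""

    CDC_FIFO_DEPTH = 12  # clock domain crossing FIFO depth from the text

    def __init__(self):
        self.enabled = False  # all interfaces are disabled at reset
        self.fifo = deque()   # never drained here: routing is not configured

    @property
    def tready(self) -> bool:
        # Assumption: a disabled interface is modeled as TREADY held low.
        return self.enabled and len(self.fifo) < self.CDC_FIFO_DEPTH

    def push(self, word) -> bool:
        """Offer one word with TVALID asserted; return True if it was accepted."""
        if not self.tready:
            return False
        self.fifo.append(word)
        return True


def send_compliant(iface: PlInterfaceModel, words) -> list:
    """Compliant master: transfer only while TREADY is high, keep the rest pending."""
    pending = deque(words)
    while pending and iface.tready:
        iface.push(pending.popleft())
    return list(pending)
```

In this model, offering 20 words to a freshly reset interface transfers nothing. After setting `enabled = True`, the first 12 words fill the FIFO, TREADY drops, and the remaining 8 words stay pending at the master rather than being lost.
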
Figure 1. Typical Buffering Stages