The AI Engine has multiple interfaces. The following block diagram shows the interfaces.
- Data Memory Interface
- The AI Engine can access data memory modules in all four directions. They are accessed as one contiguous memory. The AI Engine has two 256-bit wide load units and one 256-bit wide store unit. From the AI Engine's perspective, the throughput of each of the two load units and of the single store unit is 256 bits per clock cycle.
- Program Memory Interface
- This 128-bit wide interface is used by the AI Engine to access the program memory. A new instruction can be fetched every clock cycle.
- Direct AXI4-Stream Interface
- The AI Engine has two 32-bit input AXI4-Stream interfaces and two 32-bit output AXI4-Stream interfaces. Each stream is connected to a FIFO on both the input and output sides, allowing the AI Engine to perform a 4-word (128-bit) access every 4 cycles or a 1-word (32-bit) access every cycle on a stream.
- Cascade Stream Interface
- The 384-bit accumulator data from one AI Engine can be forwarded to another by using these cascade streams to form a chain. There is a small, two-deep, 384-bit wide FIFO on both the input and output streams, which allows up to four values to be stored between AI Engines.
- Debug Interface
- This interface is able to read or write all AI Engine registers over the memory-mapped AXI4 interface.
- Hardware Synchronization (Locks) Interface
- This interface allows synchronization between two AI Engines or between an AI Engine and DMA. The AI Engine can access the lock modules in all four directions.
- Stall Handling
- An AI Engine can be stalled for multiple reasons and by different sources. Examples include: an external memory-mapped AXI4 master (for example, the PS), lock modules, empty or full AXI4-Stream interfaces, data memory collisions, and event actions from the event unit.
- AI Engine Event Interface
- This 16-bit wide EVENT interface can be used to set different events.
- Tile Timer
- This input interface is used to read the 64-bit timer value inside the tile.
- Execution Trace Interface
- This 32-bit wide interface carries the packet-based execution trace generated by the AI Engine, which can be sent over the AXI4-Stream.
Figure 1. AI Engine Interfaces
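The widths quoted in the interface list can be turned into peak per-cycle figures with simple arithmetic. The following Python sketch is illustrative only: the constants are taken from the list, but the function names and the 1 GHz clock used in the usage lines are hypothetical and not part of the AI Engine documentation or tools.

```python
# Interface widths in bits, as stated in the interface list.
DATA_MEM_PORT_BITS = 256      # each load or store unit
DATA_MEM_LOAD_UNITS = 2
DATA_MEM_STORE_UNITS = 1
PROGRAM_MEM_BITS = 128        # one instruction fetch per cycle
STREAM_BITS = 32              # one word per cycle per stream
STREAM_INPUTS = 2
STREAM_OUTPUTS = 2
CASCADE_BITS = 384
CASCADE_FIFO_DEPTH = 2        # two-deep FIFO on each of input and output


def peak_bits_per_cycle():
    """Peak bits moved per clock cycle on each interface."""
    return {
        "data_memory_load": DATA_MEM_LOAD_UNITS * DATA_MEM_PORT_BITS,
        "data_memory_store": DATA_MEM_STORE_UNITS * DATA_MEM_PORT_BITS,
        "program_memory_fetch": PROGRAM_MEM_BITS,
        "stream_in": STREAM_INPUTS * STREAM_BITS,
        "stream_out": STREAM_OUTPUTS * STREAM_BITS,
    }


def cascade_values_in_flight():
    """Values that fit between two AI Engines: output FIFO plus input FIFO."""
    return 2 * CASCADE_FIFO_DEPTH


def to_gbytes_per_second(bits_per_cycle, clock_hz):
    """Convert a per-cycle bit count to GB/s for an assumed clock frequency."""
    return bits_per_cycle * clock_hz / 8 / 1e9


if __name__ == "__main__":
    assumed_clock_hz = 1_000_000_000  # hypothetical 1 GHz, for illustration only
    for name, bits in peak_bits_per_cycle().items():
        gbps = to_gbytes_per_second(bits, assumed_clock_hz)
        print(f"{name}: {bits} bits/cycle, {gbps} GB/s at the assumed clock")
    print("cascade values in flight:", cascade_values_in_flight())
    print("cascade bits buffered:", cascade_values_in_flight() * CASCADE_BITS)
```

Note that a 128-bit stream access every 4 cycles and a 32-bit access every cycle give the same average rate of 32 bits per cycle per stream, which is why the sketch models each stream as 32 bits per cycle.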