The AIE-ML tile architecture builds on the AI Engine tile architecture and carries over its functionality and performance requirements. The following list summarizes the changes in the AIE-ML tile relative to the AI Engine tile:
- AIE-ML (see AIE-ML Processor for more information)
- Data memory:
  - Data memory is increased from 32 KB to 64 KB. In hardware, it is organized as eight banks of 8 KB. From a programmer's perspective, every two banks are interleaved to form one bank, giving a total of four banks of 16 KB. The AIE-ML tile has access to the four nearest memory modules in the cardinal directions: north, south, west, and east (the local data memory in the tile itself).
  - Adds memory zero-initialization (zero-init)
- DMA:
  - Improves address generation to support 3D addressing modes and an iteration-state offset
  - Adds task queues and task-complete-tokens (see Task-Completion-Tokens for more information)
  - Supports S2MM Finish on TLAST and out-of-order packets
  - Adds decompression to two S2MM channels
  - Adds compression to two MM2S channels
- Memory-mapped AXI4 interface: improved read and write bandwidth
- Lock module: 16 semaphore locks, each with a 6-bit unsigned state (values 0 to 63), versus 16 locks with a binary value in the AI Engine.
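
The data memory organization described in the list can be checked with a little arithmetic. The following is a minimal sketch, not the hardware address map: the sizes come from the text, but the interleave granularity (`INTERLEAVE_BYTES`), the assumption that logical banks occupy contiguous address ranges, and the function name `locate` are hypothetical and chosen only for illustration.

```python
KB = 1024

TOTAL_BYTES = 64 * KB           # data memory size stated in the text
PHYSICAL_BANKS = 8              # hardware view
LOGICAL_BANKS = 4               # programmer's view

PHYSICAL_BANK_BYTES = TOTAL_BYTES // PHYSICAL_BANKS   # 8 KB
LOGICAL_BANK_BYTES = TOTAL_BYTES // LOGICAL_BANKS     # 16 KB

# ASSUMPTION: two physical banks alternate every 16 bytes inside one
# logical bank. The real granularity is not given in the text.
INTERLEAVE_BYTES = 16


def locate(addr):
    """Return (logical_bank, physical_bank) for a byte address.

    ASSUMPTION: logical banks cover contiguous 16 KB address ranges.
    """
    if not 0 <= addr < TOTAL_BYTES:
        raise ValueError("address outside the 64 KB data memory")
    logical = addr // LOGICAL_BANK_BYTES
    offset = addr % LOGICAL_BANK_BYTES
    half = (offset // INTERLEAVE_BYTES) % 2   # which of the two interleaved banks
    return logical, 2 * logical + half


if __name__ == "__main__":
    print(PHYSICAL_BANK_BYTES // KB, "KB per physical bank")
    print(LOGICAL_BANK_BYTES // KB, "KB per logical bank")
    print(locate(0), locate(16), locate(16 * KB))
```

Under these assumptions, consecutive 16-byte chunks inside one logical bank alternate between its two physical banks, which is the usual motivation for interleaving.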
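
To illustrate what 3D addressing with an iteration-state offset can look like, the sketch below models a generic three-level strided address generator. It is an illustrative model only: the parameter names (`steps`, `wraps`, `iter_step`, `iteration`) are hypothetical and do not correspond to actual DMA buffer descriptor fields, and the exact hardware semantics may differ.

```python
def addresses_3d(base, steps, wraps, iter_step=0, iteration=0):
    """Yield addresses for a 3D strided access pattern.

    steps     -- (s0, s1, s2): address increment per dimension (hypothetical)
    wraps     -- (w0, w1, w2): number of elements per dimension (hypothetical)
    iter_step -- extra offset applied once per repetition of the transfer
    iteration -- how many times the transfer has already been repeated
    """
    s0, s1, s2 = steps
    w0, w1, w2 = wraps
    # The iteration-state offset shifts the whole pattern on each repetition.
    start = base + iteration * iter_step
    for k in range(w2):            # outermost dimension
        for j in range(w1):
            for i in range(w0):    # innermost dimension
                yield start + i * s0 + j * s1 + k * s2


if __name__ == "__main__":
    # A 2 x 2 x 2 block taken out of a larger 4 x 4 x N volume.
    print(list(addresses_3d(0, (1, 4, 16), (2, 2, 2))))
    # Same pattern on its second repetition, shifted by one iteration step.
    print(list(addresses_3d(0, (1, 4, 16), (2, 2, 2), iter_step=32, iteration=1)))
```

In this model, the iteration offset lets one descriptor be reused to walk through successive tiles of a buffer without being reprogrammed between repetitions.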
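
The difference between a binary lock and a semaphore lock with a 6-bit unsigned state can be sketched with a toy software model. This is not the lock module's programming interface: the class `SemaphoreLock` and its methods are hypothetical, and only the 6-bit value range is taken from the text.

```python
class SemaphoreLock:
    """Toy counting lock whose state fits in 6 unsigned bits (0..63)."""

    MAX_VALUE = (1 << 6) - 1  # 63

    def __init__(self, value=0):
        if not 0 <= value <= self.MAX_VALUE:
            raise ValueError("lock state must fit in 6 unsigned bits")
        self.value = value

    def try_acquire(self, count=1):
        """Take `count` units if available; return True on success."""
        if self.value >= count:
            self.value -= count
            return True
        return False

    def release(self, count=1):
        """Give back `count` units; overflow past 63 is treated as an error."""
        if self.value + count > self.MAX_VALUE:
            raise OverflowError("lock state would exceed 6 bits")
        self.value += count


if __name__ == "__main__":
    lock = SemaphoreLock(2)        # e.g. two buffers currently available
    print(lock.try_acquire())      # first consumer succeeds
    print(lock.try_acquire(2))     # only one unit left, so this fails
    print(lock.value)
```

A binary lock can only express "free" or "taken", while a counting state like this one can track several outstanding buffers with a single lock.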