The AI Engine (AIE) is a two-dimensional array of computation, memory and interconnect resources connected to programmable logic and to the NoC.
Fig. 2: High level block diagram of the AI Engine ML array
Its purpose is to offload heavy computations onto its specialized processors, without requiring timing closure and without consuming programmable logic resources. The AI Engine currently comes in two versions: the original AI Engine and the AI Engine - Machine Learning (AIE-ML), a newer model introduced to sustain the heavy computation and memory loads of machine learning applications, adding several new features. Note, however, that the new memory features also benefit some DSP applications, such as FFTs.
The AI Engine ML array is composed of three main blocks:
AI Engine ML tiles, also called compute tiles, which comprise a 64 kilobyte local memory, a plethora of interconnect resources, various blocks that control program execution, and a 6-way Very Long Instruction Word (VLIW) processor equipped with a vector unit capable of performing both fixed-point and floating-point operations.
Fig. 3: Block diagram of the AI Engine ML tile
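To make the compute tile's fixed-point vector datapath concrete, the following is a minimal conceptual model, not vendor code: lane-wise 16-bit multiply-accumulates into wider per-lane accumulators, followed by a shift-round-saturate step back to 16 bits. The lane count and accumulator width are illustrative assumptions, not the exact hardware parameters.

```python
# Conceptual model of one fixed-point vector MAC step as performed by a
# SIMD vector unit of the AIE-ML kind. LANES and the accumulator width
# are assumptions for illustration only.

LANES = 32  # assumed number of 16-bit SIMD lanes


def vmac(acc, a, b):
    """Lane-wise multiply-accumulate: acc[i] += a[i] * b[i]."""
    return [acc[i] + a[i] * b[i] for i in range(LANES)]


def srs(acc, shift):
    """Shift-round-saturate each wide accumulator lane back to int16."""
    out = []
    for v in acc:
        r = (v + (1 << (shift - 1))) >> shift        # round to nearest
        r = max(-(1 << 15), min((1 << 15) - 1, r))   # saturate to int16
        out.append(r)
    return out


a = [1000] * LANES
b = [2000] * LANES
acc = [0] * LANES
for _ in range(4):        # accumulate four products per lane
    acc = vmac(acc, a, b)
res = srs(acc, 10)        # rescale the wide accumulators to int16
```

The wide accumulator avoids overflow while products pile up; the final shift-round-saturate is what lets a chain of such operations stay within the 16-bit lane format.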
Interface tiles, whose task is to route data between the AI Engine array, the programmable logic (PL), and the programmable Network-on-Chip (NoC). They are a set of interfaces that manage domain crossings, such as the Clock Domain Crossing (CDC) between the PL $(f_{\text{clk}} \sim 500\ \text{MHz})$ and AIE-ML $(f_{\text{clk}} \ge 1\ \text{GHz})$ domains, and provide an AXI4-compliant multi-channel interconnect to route the data efficiently.
Fig. 4: Block diagram of PL and NoC interface tiles
Memory tiles, which are exclusive to the ML version of the AI Engine and comprise various interconnect and control resources as well as a 512 kilobyte memory equipped with 6 read and 6 write ports, user-programmable access patterns, and support for multi-dimensional buffers.
Fig. 5: Block diagram of the memory tile
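The memory tile's programmable access patterns can be pictured as a nested-loop address generator: each dimension of a buffer is described by an iteration count (wrap) and a stride, and the port walks the addresses in nested-loop order. The sketch below assumes this wrap/stride descriptor form for illustration; it is not the vendor's buffer-descriptor format.

```python
# Conceptual sketch of a multi-dimensional access pattern: each
# dimension is a (wrap, stride) pair, outermost dimension first.
# The generator emits one address per element in nested-loop order.

def addresses(base, dims):
    """Expand a base address through nested (wrap, stride) dimensions."""
    addrs = [base]
    for wrap, stride in dims:  # outermost to innermost dimension
        addrs = [a + i * stride for a in addrs for i in range(wrap)]
    return addrs


# Example: a 2 x 4 tile of a row-major buffer, rows 16 elements apart.
seq = addresses(0, [(2, 16), (4, 1)])
```

With the example descriptor, the port visits addresses 0-3 of the first row and then 0-3 of the second, which is how a tile of a larger matrix can be streamed out without the processor computing any addresses itself.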