Versal AI Engine ML overview

Vitis Tutorials: AI Engine Development (XD100)

Document ID: XD100
Release Date: 2025-08-25
Version: 2025.1 English

The AI Engine (AIE) is a two-dimensional array of computation, memory and interconnect resources connected to programmable logic and to the NoC.


Fig. 2: High level block diagram of the AI Engine ML array

Its purpose is to run heavy computations on specialized processors without requiring timing closure, and to spare programmable logic resources. The AI Engine currently comes in two versions: the AI Engine (AIE) and the AI Engine - Machine Learning (AIE-ML). The AIE-ML is a newer variant introduced to support the heavy computation and memory loads of machine learning applications, and it adds several new features. Note, however, that the new memory features also benefit some DSP applications, such as FFTs.

The AI Engine ML array is composed of three main blocks:

  • AI Engine ML tiles, also called compute tiles, which comprise a 64 kilobyte local memory, numerous interconnect resources, various blocks that control program execution, and a 6-way very long instruction word (VLIW) microprocessor equipped with a vector unit capable of both fixed-point and floating-point operations.

    Fig. 3: Block diagram of the AI Engine ML tile

  • Interface tiles, which route data between the AI Engine array and the programmable logic (PL) and the programmable Network-on-Chip (NoC). They are a set of interfaces that manage domain crossings, such as the Clock Domain Crossing (CDC) between the PL $(f_{\text{clk}} \sim 500 \text{ MHz})$ and the AIE-ML $(f_{\text{clk}} \ge 1 \text{ GHz})$ environments, and provide an AXI4-compliant multi-channel interconnect to route the data efficiently.

    Fig. 4: Block diagram of PL and NoC interface tiles

  • Memory tiles, which are exclusive to the ML version of the AI Engine and comprise various interconnect and control resources as well as a 512 kilobyte memory equipped with 6 read and 6 write ports, user-programmable access patterns, and support for multi-dimensional buffers.

    Fig. 5: Block diagram of the memory tile
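
To get a feel for how the per-tile memory sizes listed above add up, the following is a minimal C++ sketch that computes the aggregate memory of an array. Only the 64 kilobyte and 512 kilobyte capacities come from this overview. The `ArrayShape` struct, the helper function names, and the example geometry (4 columns, 2 compute-tile rows, 1 memory-tile row) are hypothetical and chosen purely for illustration; actual array dimensions depend on the device.

```cpp
#include <cstdint>

// Per-tile capacities as stated in the overview above.
constexpr std::uint64_t kComputeTileMemBytes = 64ULL * 1024;   // 64 KB local memory per compute tile
constexpr std::uint64_t kMemoryTileMemBytes  = 512ULL * 1024;  // 512 KB per memory tile

// Hypothetical array geometry (an assumption for illustration only;
// real devices have their own column and row counts).
struct ArrayShape {
    std::uint32_t columns;       // number of tile columns
    std::uint32_t compute_rows;  // compute tiles per column
    std::uint32_t memory_rows;   // memory tiles per column
};

// Total local memory across all compute tiles, in bytes.
constexpr std::uint64_t total_compute_mem_bytes(ArrayShape s) {
    return static_cast<std::uint64_t>(s.columns) * s.compute_rows * kComputeTileMemBytes;
}

// Total memory across all memory tiles, in bytes.
constexpr std::uint64_t total_memory_tile_bytes(ArrayShape s) {
    return static_cast<std::uint64_t>(s.columns) * s.memory_rows * kMemoryTileMemBytes;
}

// Combined memory of compute tiles and memory tiles, in bytes.
constexpr std::uint64_t total_array_mem_bytes(ArrayShape s) {
    return total_compute_mem_bytes(s) + total_memory_tile_bytes(s);
}

// Illustrative geometry, not a real device configuration.
constexpr ArrayShape kExampleShape{4, 2, 1};
```

With this made-up geometry, the single row of memory tiles holds four times as much memory as the two rows of compute tiles combined, which illustrates why memory tiles matter for memory-heavy workloads.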