As in AIE, the AI Engine processor in AIE-ML consists of a 32-bit scalar data path, a SIMD vector data path, two load units, and a store unit; it is optimized for ML applications.
The following provides a list of AIE-ML processor features:
- Instruction-based VLIW SIMD processor with new instructions
- Same 16 KB program memory as in AIE
- Vector unit supports 256 (8b x 8b) and 512 (4b x 8b) MAC operations
- Vector unit supports 128 bfloat16 MAC operations with FP32 accumulation
- Vector unit supports structured sparsity and FFT processing for ML inference applications, including cint32 x cint16 multiplication (data in cint32, twiddle factors in cint16), conjugation control for complex multiplication, a new permute mode, and a shuffle mode. See Sparsity for more information.
- A new processor bus that allows the processor to access memory mapped registers in the local AIE-ML tile
- The complex circular addressing modes are dropped and replaced by a 3D addressing mode (a minimal C sketch of such an access pattern follows this list)
- On-the-fly decompression during loading of sparse weights. See Sparsity for more information.
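To make the 3D addressing mode mentioned above more concrete, the following C sketch models a three-level address generator as nested strided loops over a flat buffer. The structure, field names, and the column-major walk are illustrative assumptions only; they do not represent the AIE-ML register interface or intrinsics.

```c
#include <stdio.h>
#include <stdint.h>

/* Hypothetical model of a 3D address generator: three nested loops,
 * each with its own iteration count and stride (in elements). The
 * hardware mode walks memory without explicit loop code; this sketch
 * only shows the kind of access pattern such a mode can generate. */
typedef struct {
    int32_t count[3];   /* iterations per dimension (index 0 = innermost) */
    int32_t stride[3];  /* address increment per dimension, in elements   */
} addr3d_t;

/* Visit every element selected by the 3D pattern, starting at base. */
static void walk_3d(const int16_t *base, addr3d_t p,
                    void (*visit)(const int16_t *elem))
{
    for (int32_t k = 0; k < p.count[2]; ++k)
        for (int32_t j = 0; j < p.count[1]; ++j)
            for (int32_t i = 0; i < p.count[0]; ++i)
                visit(base + k * p.stride[2]
                           + j * p.stride[1]
                           + i * p.stride[0]);
}

static void print_elem(const int16_t *elem) { printf("%d ", *elem); }

int main(void)
{
    /* 4 x 4 matrix stored row-major; read it column by column by
     * giving the innermost loop a stride equal to the row length. */
    int16_t m[16];
    for (int i = 0; i < 16; ++i) m[i] = (int16_t)i;

    addr3d_t col_major = {
        .count  = { 4, 4, 1 },   /* 4 rows, 4 columns, 1 block           */
        .stride = { 4, 1, 0 },   /* step a whole row, then next column   */
    };
    walk_3d(m, col_major, print_elem);  /* prints 0 4 8 12 1 5 9 13 ... */
    printf("\n");
    return 0;
}
```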
The AIE-ML processor removes some advanced DSP functionality present in the AIE processor, including:
- The 32-bit floating-point vector data path is not directly supported, but it can be emulated by decomposing each operation into multiple 16-bit x 16-bit multiplications
- Scalar non-linear functions, including sin/cos, sqrt, inverse sqrt and inverse
- Scalar floating point/integer conversions
- Complex circular addressing and FFT addressing modes. However, some level of FFT and complex support is provided; see the AIE-ML processor features.
- Limited support for 128-bit load/store
- Non-aligned memory access
- Support for some complex data types; however, some level of complex support is provided (see the AIE-ML processor features)
- Native support for 32 × 32-bit multiplication; it can, however, be emulated using 16-bit integer operands (as sketched after this list)
- Non-blocking 128-bit stream interfaces and stream FIFOs
- Control streams and packet header generation
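As a rough illustration of how a 32 × 32-bit multiplication can be emulated with 16-bit integer operands, the following self-contained C sketch splits each operand into 16-bit halves and recombines the four 16 x 16-bit partial products. The function name and test values are made up for the example and do not correspond to AIE-ML intrinsics.

```c
#include <stdio.h>
#include <inttypes.h>

/* Emulate an unsigned 32 x 32 -> 64-bit multiply using only 16-bit
 * operands, the way a 16-bit MAC data path can rebuild a wider product:
 *   a = aH*2^16 + aL,  b = bH*2^16 + bL
 *   a*b = aH*bH*2^32 + (aH*bL + aL*bH)*2^16 + aL*bL              */
static uint64_t mul32_via_16(uint32_t a, uint32_t b)
{
    uint16_t aL = (uint16_t)a, aH = (uint16_t)(a >> 16);
    uint16_t bL = (uint16_t)b, bH = (uint16_t)(b >> 16);

    uint64_t ll = (uint32_t)aL * bL;   /* each partial product is a   */
    uint64_t lh = (uint32_t)aL * bH;   /* 16 x 16 -> 32-bit multiply  */
    uint64_t hl = (uint32_t)aH * bL;
    uint64_t hh = (uint32_t)aH * bH;

    return (hh << 32) + ((lh + hl) << 16) + ll;
}

int main(void)
{
    uint32_t a = 0x89ABCDEFu, b = 0x12345678u;
    uint64_t ref = (uint64_t)a * b;          /* native 64-bit product */
    uint64_t emu = mul32_via_16(a, b);       /* emulated product      */
    printf("native   = 0x%016" PRIX64 "\n", ref);
    printf("emulated = 0x%016" PRIX64 "\n", emu);
    return 0;
}
```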