The Versal AI Core Series delivers breakthrough AI inference acceleration with AI Engines. This series is designed for a breadth of applications, including cloud for dynamic workloads and network for massive bandwidth, all while delivering advanced safety and security features. AI and data scientists, as well as software and hardware developers, can all take advantage of the high compute density to accelerate the performance of any application. Given the AI Engine's advanced signal processing compute capability, it is well-suited for highly optimized wireless applications such as radio, 5G, backhaul, and other high-performance DSP applications.
AI Engines are an array of very long instruction word (VLIW) processors with single instruction, multiple data (SIMD) vector units that are highly optimized for compute-intensive applications, specifically digital signal processing (DSP), 5G wireless applications, and artificial intelligence (AI) technology such as machine learning (ML).
AI Engines are hardened blocks that provide multiple levels of parallelism, including instruction-level and data-level parallelism. Instruction-level parallelism comes from a 7-way VLIW instruction that can issue, in a single clock cycle, one scalar operation, up to two moves, two vector reads (loads), one vector write (store), and one vector instruction. Data-level parallelism is achieved through vector operations, where multiple sets of data are operated on every clock cycle. Each AI Engine contains a vector processor, a scalar processor, dedicated program memory, and local data memory, and it can also access the local memory of its neighbors in any of three directions. It also has access to DMA engines and AXI4 interconnect switches to communicate via streams with other AI Engines, the programmable logic (PL), or the DMA. Refer to the Versal Adaptive SoC AI Engine Architecture Manual (AM009) for specific details on the AI Engine array and interfaces.
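To make the data-level parallelism concrete, the following minimal kernel sketch uses the AIE API, the C++ vector library described in the AI Engine Tools and Flows User Guide (UG1076). The kernel name, buffer sizes, and lane count are illustrative assumptions, not taken from this guide; each loop iteration operates on 16 int16 lanes with single vector instructions.

```cpp
// Minimal AI Engine kernel sketch using the AIE API. Names and sizes are
// illustrative assumptions.
#include <aie_api/aie.hpp>
#include <aie_api/aie_adf.hpp>

// Element-wise multiply of two 1024-sample int16 buffers.
void vmul(adf::input_buffer<int16>& in_a,
          adf::input_buffer<int16>& in_b,
          adf::output_buffer<int16>& out) {
    auto pa = aie::begin_vector<16>(in_a);
    auto pb = aie::begin_vector<16>(in_b);
    auto po = aie::begin_vector<16>(out);
    for (unsigned i = 0; i < 1024 / 16; ++i) {
        aie::vector<int16, 16> va = *pa++;            // vector load
        aie::vector<int16, 16> vb = *pb++;            // second vector load
        aie::accum<acc32, 16> acc = aie::mul(va, vb); // one vector multiply across all 16 lanes
        *po++ = acc.to_vector<int16>(0);              // vector store
    }
}
```

The AI Engine compiler is responsible for packing the loads, the vector multiply, and the store into the VLIW slots described above; the source code only expresses the vector operations.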
The AI Engine-ML (AIE-ML) block is capable of delivering 2x the compute throughput of its predecessor AI Engine blocks. The AIE-ML block, primarily targeted at machine learning inference applications, delivers some of the industry's best performance per watt across a wide range of inference applications. Refer to the Versal Adaptive SoC AIE-ML Architecture Manual (AM020) for specific details on the AIE-ML features and architecture.
As an application developer, you can use either a white box or a black box flow to run an ML inference application on AIE-ML. With the white box flow, you integrate custom kernels and dataflow graphs in the AIE-ML programming environment, as sketched below. The black box flow uses performance-optimized Neural Processing Unit (NPU) IP from AMD to accelerate ML workloads on the AIE-ML block.
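As a hedged illustration of the white box flow, the sketch below wraps a custom kernel in an ADF dataflow graph using the adf library from the AI Engine programming environment. The graph class, kernel name, PLIO names, and file names are hypothetical placeholders.

```cpp
// White box flow sketch: a custom kernel wrapped in an ADF dataflow graph.
// Graph, kernel, and file names are hypothetical.
#include <adf.h>

// Custom kernel declaration (see the kernel sketch earlier in this section).
void vmul(adf::input_buffer<int16>& in_a,
          adf::input_buffer<int16>& in_b,
          adf::output_buffer<int16>& out);

class VmulGraph : public adf::graph {
public:
    adf::input_plio  in_a, in_b;
    adf::output_plio out;
    adf::kernel k;

    VmulGraph() {
        k    = adf::kernel::create(vmul);
        in_a = adf::input_plio::create("in_a", adf::plio_32_bits, "data/in_a.txt");
        in_b = adf::input_plio::create("in_b", adf::plio_32_bits, "data/in_b.txt");
        out  = adf::output_plio::create("out", adf::plio_32_bits, "data/out.txt");
        adf::connect(in_a.out[0], k.in[0]);
        adf::connect(in_b.out[0], k.in[1]);
        adf::connect(k.out[0], out.in[0]);
        adf::source(k) = "vmul.cc";         // kernel source file
        adf::runtime<adf::ratio>(k) = 0.9;  // fraction of a core budgeted to this kernel
    }
};
```

The AI Engine compiler maps such a graph onto the tile array; the black box flow replaces this hand-written graph with the precompiled NPU IP.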
AMD Vitis™ AI is used as a front-end tool that parses the network graph, performs optimization and quantization of the graph, and generates compiled code that can run on the AIE-ML hardware. The AIE-ML core tile architecture supports a variety of fixed-point and floating-point data types. The architecture allows for pipelined vector processing and incorporates high-density, high-speed on-chip memory that can effectively store on-chip tensors. Additionally, it features versatile data movers that are adept at handling multi-dimensional tensors in memory. With the proper selection of overlay processor architecture and the spatial and temporal distribution of the input/output tensors in on-chip and off-chip memory, it is possible to achieve high computational efficiency of the AIE-ML processing cores.
AMD provides device drivers and libraries for user applications to access the AMD Versal™ AI Engines. The Vitis IDE lets you compile, simulate, and debug the different elements of a Versal device AI Engine application; for detailed information on the Vitis IDE tool flow, refer to the Versal Adaptive SoC Design Guide (UG1273). For detailed information on the AI Engine programming environment, tools, and flows, refer to the AI Engine Tools and Flows User Guide (UG1076).
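As a brief, hedged example of the host-side libraries, the sketch below uses the XRT native C++ API to load a compiled design and run an AI Engine graph from a Linux host application. The xclbin file name and graph name are hypothetical placeholders.

```cpp
// Host application sketch using the XRT native C++ API to control an
// AI Engine graph. File and graph names are hypothetical.
#include <xrt/xrt_device.h>
#include <experimental/xrt_graph.h>

int main() {
    xrt::device device{0};                            // open the Versal device
    auto uuid = device.load_xclbin("aie_app.xclbin"); // load the compiled design
    xrt::graph graph{device, uuid, "vmul_graph"};     // look up the graph by name
    graph.run(8);                                     // run 8 graph iterations
    graph.end();                                      // wait for completion
    return 0;
}
```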
The following sections describe the software stack of the AI Engine.