Memory and DMA Programming - 2024.1 English

AI Engine-ML Kernel and Graph Programming Guide (UG1603)


The AI Engine-ML processor can access three types of memory:

AI Engine-ML memory
The data memory located in the same tile as the AI Engine-ML processor. In addition to the data memory in its own tile, the AI Engine-ML processor can also access the data memory of its north, south, and west neighbors.
Memory Tile
One or two rows of memories in the AI Engine-ML array accessible from the AXI4-Stream network.
External Memory
The High Bandwidth Memory (HBM) on the device or DDR memory external to the device.

All of these memories can be addressed linearly within their address ranges. They can also be addressed in more complex ways using multidimensional addressing. These memories are managed automatically by direct memory access (DMA) engines. The DMAs support a variety of programmable features that aid memory accesses:

Table 1. DMA Features

| DMA Feature | Description | AI Engine-ML Tile DMA [1] | Memory Tile DMA | Interface Tile DMA [2] |
|---|---|---|---|---|
| Maximum Addressing Dimension | Maximum number of dimensions the DMA can use to address data, which depends on the type of memory. Depending on the application, you can access data in one dimension (1D), two dimensions (2D, as in a grayscale image), three dimensions (3D, as in a multichannel image), or four dimensions (4D, as in multiple multichannel images). | 3D [3] | 4D | 3D |
| Zero-padding | Enables read access outside the buffer; the DMA generates zeros on the data stream for the out-of-buffer locations. | N/A | Supported | N/A |
| Packet-ID | Used by the AXI4-Stream switch to route packets to their destination based on the packet ID. See Explicit Packet Switching. | Supported | Supported | Supported |
| Number of Buffer Descriptors | Total number of buffer descriptors available in the memory. A buffer descriptor contains all the information needed for one DMA transfer and specifies the read or write access scheme that the DMA applies to the memory. | 16 | 48 | 16 |
| Number of Semaphore Locks | Total number of semaphore locks available in the memory. The DMA uses these locks to synchronize buffer descriptor usage and to organize read and write access to the memory, including complex many-to-many accesses. | 16 | 64 | 16 |

  1. Used to access local buffers.
  2. Used to access HBM or DDR memory.
  3. No API in the current version of the tools, but the hardware supports 3D addressing.
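
To illustrate how multidimensional addressing relates to the linear address range, the following minimal sketch expands a base offset and up to four nested dimensions into the sequence of linear element addresses that would be visited. It is a plain C++ software model, not the ADF API and not a description of the hardware registers. The names `DimStep` and `linear_addresses` are hypothetical and exist only for this illustration.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model of one addressing dimension (not an ADF type):
// how many steps to take and how far each step moves in linear memory.
struct DimStep {
    std::uint32_t count;   // number of iterations in this dimension
    std::uint32_t stride;  // distance between iterations, in elements
};

// Expand a base offset and four nested dimensions into linear element
// addresses. dims[0] is the innermost (fastest-changing) dimension.
// An unused dimension is expressed as {1, 0}.
std::vector<std::uint32_t> linear_addresses(std::uint32_t base,
                                            const std::array<DimStep, 4>& dims) {
    for (const DimStep& d : dims) {
        assert(d.count > 0 && "every dimension needs at least one iteration");
    }
    std::vector<std::uint32_t> out;
    for (std::uint32_t i3 = 0; i3 < dims[3].count; ++i3)
        for (std::uint32_t i2 = 0; i2 < dims[2].count; ++i2)
            for (std::uint32_t i1 = 0; i1 < dims[1].count; ++i1)
                for (std::uint32_t i0 = 0; i0 < dims[0].count; ++i0)
                    out.push_back(base
                                  + i0 * dims[0].stride
                                  + i1 * dims[1].stride
                                  + i2 * dims[2].stride
                                  + i3 * dims[3].stride);
    return out;
}
```

For example, reading a 2x2 block that starts at element 5 of a row-major image that is 4 elements wide uses an inner dimension of `{2, 1}` and a second dimension of `{2, 4}`, with the remaining two dimensions set to `{1, 0}`. A purely linear (1D) access is the special case where only the innermost dimension has a count greater than one.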

The AI Engine tiling parameters enable you to program the DMAs for the Memory Tile and the Interface Tile. They can be declared and defined in the ADF graph and connected to the appropriate kernel inputs or outputs. For more information on tiling parameters, see Tiling Parameters and Buffer Descriptors.
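
Table 1 lists zero-padding as a feature of the Memory Tile DMA: a read outside the buffer produces zeros on the data stream. The following sketch models that behavior in plain C++ for a 2D, row-major buffer. It is an assumption-laden software model for illustration only. The function name `read_tile_zero_padded` and its parameters are hypothetical, and the sketch does not show how padding is configured through tiling parameters or buffer descriptors.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical software model of a zero-padded read (not an ADF function).
// Reads a tile_w x tile_h window from a row-major width x height buffer.
// The window origin (x0, y0) may be negative, and the window may extend
// past the right or bottom edge; such locations yield 0 in the stream.
std::vector<std::int32_t> read_tile_zero_padded(
        const std::vector<std::int32_t>& buf, int width, int height,
        int x0, int y0, int tile_w, int tile_h) {
    assert(width > 0 && height > 0);
    assert(tile_w > 0 && tile_h > 0);
    assert(buf.size() == static_cast<std::size_t>(width) *
                         static_cast<std::size_t>(height));

    std::vector<std::int32_t> stream;
    stream.reserve(static_cast<std::size_t>(tile_w) *
                   static_cast<std::size_t>(tile_h));
    for (int y = y0; y < y0 + tile_h; ++y) {
        for (int x = x0; x < x0 + tile_w; ++x) {
            const bool inside = x >= 0 && x < width && y >= 0 && y < height;
            // Inside the buffer: forward the stored value.
            // Outside the buffer: generate a zero instead of reading memory.
            stream.push_back(inside ? buf[static_cast<std::size_t>(y) * width + x]
                                    : 0);
        }
    }
    return stream;
}
```

Reading a 3x3 window at origin (-1, -1) from a 2x2 buffer, for instance, surrounds the stored values with a one-element border of zeros on the top and left, which is the kind of access a convolution-style kernel typically needs at the image boundary.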