Tiling Parameters and Buffer Descriptors - 2024.2 English - UG1603

AI Engine-ML Kernel and Graph Programming Guide (UG1603)

Document ID: UG1603
Release Date: 2024-11-28
Version: 2024.2 English

DMA transfers are managed by buffer descriptors located in all three levels of memory:

  • AI Engine-ML memory with co-located buffer descriptors.
  • Memory tile with co-located buffer descriptors.
  • External memory with buffer descriptors located in the AI Engine-ML-PL interface.

These buffer descriptors handle 3D or 4D memory addressing, multiple iterations, lock ID and value changes, and buffer descriptor chaining.

Buffer descriptor address generation can be complex and hard to maintain over time. For this reason, the AI Engine compiler supports higher-level address generation when you create and configure tiling parameter objects in the graph. Data is transferred according to the configuration of the tiling parameter object associated with the memory. Transfers occur on a tile basis: tiles, which can be as small as a single 1x1 element, are regularly extracted from or written to a memory space. The structure tiling_parameters is defined as follows:
struct tiling_parameters
{
  std::vector<uint32_t> buffer_dimension;
  std::vector<uint32_t> tiling_dimension;
  std::vector<int32_t> offset;
  std::vector<traversing_parameters> tile_traversal;
  int packet_port_id = -1;
  std::vector<uint32_t> boundary_dimension;
};

The members of this structure are:

buffer_dimension
Buffer dimensions in the memory element type (e.g., AI Engine-ML memory, memory tile, external memory). buffer_dimension[0] is contiguous in memory and has the fastest access. When this member is not specified, the dimensions of the associated memory object will be used. The AI Engine-ML memory can access data in the 1st, 2nd, and 3rd dimensions. Memory tile can access data in the 1st, 2nd, 3rd, and 4th dimensions. External memory can access data in the 1st, 2nd, and 3rd dimensions.
tiling_dimension
Tiling dimensions of the data transfer in the buffer. For AI Engine-ML memory, the tiling dimensions can cover the 1st, 2nd, and 3rd dimensions. For a memory tile, they can cover the 1st, 2nd, 3rd, and 4th dimensions. For external memory, they can cover the 1st, 2nd, and 3rd dimensions. If the tiling dimensions exceed the buffer dimensions, the DMA attempts to zero-pad the transfer.
Note: Zero padding is only supported on a memory tile. Only dimensions 0, 1, and 2 support zero padding. The maximum amount of zero padding, before or after the data, is 64 words for dimension 0, 32 words for dimension 1, and 16 words for dimension 2.
offset
Multidimensional offset with respect to the starting element in the buffer, assuming the buffer dimension is specified.
tile_traversal
Vector of traversing_parameters. tile_traversal[i] represents the i-th loop of inter-tile traversal, where i=0 represents the innermost loop and i=N-1 represents the outermost loop. The traversing_parameters structure is detailed below.
packet_port_id
A single port can carry multiple connections that were merged upstream by a pktmerge block or that are split downstream by a pktsplit block. This member represents the output port ID of the connected pktsplit or the input port ID of the connected pktmerge. If this member is set to a specific ID, the data transfer happens only if the ID of the incoming or outgoing data block matches it.
boundary_dimension
Real data boundary dimension for padding.
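
As an illustration of how these members fit together, the following is a minimal sketch that fills in a tiling_parameters object for a 16x8 buffer read as 4x2 tiles. The two structures are mirrored locally (traversing_parameters from its definition later in this section) so that the sketch compiles without the AI Engine tools; in a real graph they come from the ADF headers. The helper names make_example_tiling, tile_count, and elements_per_tile are hypothetical and are not part of any AMD API.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Local mirrors of the structures shown in this section (assumption: used here
// only so the sketch is self-contained; a real graph gets them from the ADF headers).
struct traversing_parameters
{
  uint32_t dimension;
  uint32_t stride;
  uint32_t wrap;
};

struct tiling_parameters
{
  std::vector<uint32_t> buffer_dimension;
  std::vector<uint32_t> tiling_dimension;
  std::vector<int32_t> offset;
  std::vector<traversing_parameters> tile_traversal;
  int packet_port_id = -1;
  std::vector<uint32_t> boundary_dimension;
};

// Hypothetical helper: a 16x8 buffer traversed as non-overlapping 4x2 tiles.
inline tiling_parameters make_example_tiling()
{
  tiling_parameters tp;
  tp.buffer_dimension = {16, 8};  // dimension 0 is contiguous in memory
  tp.tiling_dimension = {4, 2};   // each tile is 4 elements wide, 2 rows high
  tp.offset = {0, 0};             // start at the first element of the buffer
  // tile_traversal[0] is the innermost loop, tile_traversal[1] the outermost.
  tp.tile_traversal = {
      traversing_parameters{0, 4, 4},   // dimension 0: step 4 elements, 4 tiles
      traversing_parameters{1, 2, 4}};  // dimension 1: step 2 rows, 4 tiles
  return tp;
}

// Hypothetical helper: total number of tiles is the product of all wrap values.
inline uint64_t tile_count(const tiling_parameters& tp)
{
  uint64_t count = 1;
  for (const traversing_parameters& loop : tp.tile_traversal)
  {
    assert(loop.dimension < tp.buffer_dimension.size());
    count *= loop.wrap;
  }
  return count;
}

// Hypothetical helper: number of elements in one tile.
inline uint64_t elements_per_tile(const tiling_parameters& tp)
{
  uint64_t count = 1;
  for (uint32_t length : tp.tiling_dimension)
  {
    count *= length;
  }
  return count;
}
```

With these values, the traversal visits 16 tiles of 8 elements each, which covers the 128 elements of the buffer exactly once. packet_port_id keeps its default of -1 because no packet split or merge is involved in this sketch.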

A key member of the tiling parameter is the tile_traversal vector that describes how the buffer will be accessed. The structure traversing_parameters is defined as follows:

struct traversing_parameters
{
  uint32_t dimension;
  uint32_t stride;
  uint32_t wrap;
}; 

The members of this structure are:

dimension
The buffer dimension to which this traversing loop applies. Depending on the type of memory element, it can be dimension 0, 1, 2, or 3. The stride and wrap members of this structure are applied in the specified dimension.
stride
Represents the distance, in units of the buffer element data type, between consecutive inter-tile traversals in this dimension.
Converting tiling parameters into buffer descriptor parameters can cause an overflow in the AI Engine compiler. The stride specified in the tiling parameters represents the number of samples counted in the associated dimension. However, the buffer descriptor stride is the number of samples counted in the linear addressing space, so the tiling parameter stride is multiplied by the product of the lengths of the lower dimensions. For example, consider a 4D buffer space in a memory tile: { 32, 32, 32, 16 }. If the traversing parameter contains a stride of 8 on dimension 3, the translated buffer descriptor stride is 8*32*32*32 = 256K (2^18), which cannot be encoded in the 17 bits dedicated to the stride in the Buffer register of the memory tile.
The bitwidths of the register fields for the step size and wrap for the memory module, memory tile, and interface tile are:
Table 1. Register Bitwidth
Module/Tile      Step Size (bits)   Wrap (bits)
Memory Module    13                 8
Memory Tile      17                 10
Interface Tile   20                 10
wrap
Number of tiles to access in this dimension.
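
The stride overflow described above can be checked before compilation with a few lines of arithmetic. The following is a minimal sketch, not the compiler's actual algorithm: linear_stride and fits_in_bits are hypothetical helper names, and the check makes the simplifying assumption that a field of N bits holds any value strictly below 2^N. The exact hardware encoding and units of the step size field are not modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Step size field widths taken from Table 1.
constexpr unsigned kMemoryModuleStepBits = 13;
constexpr unsigned kMemoryTileStepBits = 17;
constexpr unsigned kInterfaceTileStepBits = 20;

// Hypothetical helper: translate a tiling parameter stride (counted in the
// given dimension) into a stride in the linear addressing space by multiplying
// it by the lengths of all lower dimensions.
inline uint64_t linear_stride(const std::vector<uint32_t>& buffer_dimension,
                              uint32_t dimension,
                              uint32_t stride)
{
  assert(dimension < buffer_dimension.size());
  uint64_t result = stride;
  for (std::size_t d = 0; d < dimension; ++d)
  {
    result *= buffer_dimension[d];
  }
  return result;
}

// Hypothetical helper. Simplifying assumption: the field stores the value
// directly, so it fits when it is strictly below 2^bits.
inline bool fits_in_bits(uint64_t value, unsigned bits)
{
  assert(bits < 64);
  return value < (uint64_t{1} << bits);
}
```

For the example above, linear_stride({32, 32, 32, 16}, 3, 8) evaluates to 262144 (2^18), and fits_in_bits(262144, kMemoryTileStepBits) is false. Reducing the stride on dimension 3 to 2 gives 65536, which passes the same check.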

MM2S ports are responsible for zero padding. If the offset contains negative values, the DMA automatically adds zero padding to the stream so that the correct amount of data is sent. The same behavior occurs when the tile_traversal parameters go beyond the end of the buffer.

When the stride value is lower than the tile size in one or more dimensions, the tiles will overlap naturally in that dimension.
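
The padding and overlap behavior can be pictured with a small software model. The following sketch is illustrative only and does not represent how the DMA is implemented: extract_tiles_1d is a hypothetical helper that walks a one-dimensional buffer with a given offset, tile size, stride, and wrap, and substitutes zero for any element that falls outside the buffer. As noted earlier, zero padding in hardware is only supported on a memory tile.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical software model of a 1D tile traversal with zero padding.
// Returns `wrap` tiles of `tile_size` elements each. Tile t starts at
// offset + t * stride; indices outside the buffer read as zero.
inline std::vector<std::vector<int>> extract_tiles_1d(
    const std::vector<int>& buffer,
    int32_t offset,
    uint32_t tile_size,
    uint32_t stride,
    uint32_t wrap)
{
  assert(tile_size > 0);
  const int64_t size = static_cast<int64_t>(buffer.size());
  std::vector<std::vector<int>> tiles;
  for (uint32_t t = 0; t < wrap; ++t)
  {
    const int64_t start =
        static_cast<int64_t>(offset) + static_cast<int64_t>(t) * stride;
    std::vector<int> tile;
    for (uint32_t e = 0; e < tile_size; ++e)
    {
      const int64_t index = start + e;
      const bool inside = (index >= 0) && (index < size);
      tile.push_back(inside ? buffer[static_cast<std::size_t>(index)] : 0);
    }
    tiles.push_back(tile);
  }
  return tiles;
}
```

For a buffer holding the values 1 to 8, calling extract_tiles_1d(buffer, -1, 4, 3, 3) yields the tiles {0, 1, 2, 3}, {3, 4, 5, 6}, and {6, 7, 8, 0}. The negative offset pads the first tile at the front, the traversal past the end of the buffer pads the last tile at the back, and the stride of 3 with a tile size of 4 makes consecutive tiles overlap by one element.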
Important: All generated addresses are 32-bit aligned. The requirements for accessing data elements of 16 bits or fewer are:
  • 16-bit data are accessed in pairs.
  • 8-bit data are accessed in groups of four.
  • 4-bit data are accessed in groups of eight.
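
The grouping rule above follows directly from dividing 32 bits by the element width. The sketch below makes that arithmetic explicit. The helper names elements_per_word and is_word_aligned are hypothetical, and the second helper rests on an assumption that is not stated in this section: that an element count or offset along dimension 0 is acceptable when it is a whole number of 32-bit words.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: number of elements packed into one 32-bit word.
// Only element widths that divide 32 evenly are accepted.
inline uint32_t elements_per_word(uint32_t element_bits)
{
  assert(element_bits > 0 && element_bits <= 32 && 32 % element_bits == 0);
  return 32 / element_bits;
}

// Hypothetical helper. Assumption: a count of elements is considered aligned
// when it spans a whole number of 32-bit words.
inline bool is_word_aligned(uint32_t element_count, uint32_t element_bits)
{
  return element_count % elements_per_word(element_bits) == 0;
}
```

Under this assumption, six 16-bit elements span three full words and pass the check, whereas six 8-bit elements span one and a half words and do not.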