Tiling Parameters and Buffer Descriptors - 2024.1 English

AI Engine-ML Kernel and Graph Programming Guide (UG1603)

DMA transfers are managed by buffer descriptors located in all three levels of memory:
  • AI Engine-ML memory with co-located buffer descriptors.
  • Memory tile with co-located buffer descriptors.
  • External memory with buffer descriptors located in the AI Engine-ML-PL interface.
These buffer descriptors handle 3D or 4D memory addressing, multiple iterations, lock ID and value changes, and buffer descriptor chaining.
Buffer descriptor address generation can be complex and hard to maintain over time. For this reason, the AI Engine compiler supports higher-level address generation when you create and configure tiling parameter objects in the graph. Each data transfer follows the configuration of the tiling parameter object associated with the memory. Data is transferred tile by tile: tiles, which can be as small as a single 1x1 element, are extracted from or written to the memory space at regular intervals. The structure tiling_parameters is defined as follows:
struct tiling_parameters
{
  std::vector<uint32_t> buffer_dimension;
  std::vector<uint32_t> tiling_dimension;
  std::vector<int32_t> offset;
  std::vector<traversing_parameters> tile_traversal;
  int packet_port_id = -1;
  std::vector<uint32_t> boundary_dimension;
};

The members of this structure are:

buffer_dimension: Buffer dimensions in the memory element type (for example, AI Engine-ML memory, memory tile, or external memory). buffer_dimension[0] is contiguous in memory and has the fastest access. When this member is not specified, the dimensions of the associated memory object are used. AI Engine-ML memory can access data in the 1st, 2nd, and 3rd dimensions. A memory tile can access data in the 1st, 2nd, 3rd, and 4th dimensions. External memory can access data in the 1st, 2nd, and 3rd dimensions.
tiling_dimension: Tiling dimensions of the data transfer in the buffer. For AI Engine-ML memory, tiles can span the 1st, 2nd, and 3rd dimensions. For a memory tile, tiles can span the 1st, 2nd, 3rd, and 4th dimensions. For external memory, tiles can span the 1st, 2nd, and 3rd dimensions. If the tiling dimensions exceed the buffer dimensions, the DMA transfer attempts to apply zero padding.
Note: Zero padding is only supported on a memory tile. Only dimensions 0, 1, and 2 support zero padding. The maximum amount of zero padding, before or after the data, is 64 words for dimension 0, 32 words for dimension 1, and 16 words for dimension 2.
offset: Multidimensional offset with respect to the starting element in the buffer, assuming the buffer dimension is specified.
tile_traversal: Vector of traversing_parameters. tile_traversal[i] represents the i-th loop of inter-tile traversal, where i=0 is the innermost loop and i=N-1 is the outermost loop. The traversing_parameters structure is detailed below.
packet_port_id: Multiple connections can share a single port when they are merged beforehand by a pktmerge block or split afterward by a pktsplit block. This member represents the output port ID of the connected pktsplit or the input port ID of the connected pktmerge. If this member is set to a specific ID, the data transfer only happens when the ID of the incoming or outgoing data block matches it.
boundary_dimension: Dimensions of the real data boundary, used for padding.
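As an illustration of how these members fit together, the following is a minimal sketch that fills in a tiling parameter object for a 64x8 buffer read as 16x4 tiles. The two structures are local stand-ins that mirror the definitions in this section so that the sketch compiles without the AI Engine tools; in a real graph the types would come from the tool headers. The buffer and tile sizes, and the helper functions make_example_tiling, tiles_per_transfer, and elements_per_tile, are hypothetical and chosen only for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Stand-in definitions mirroring the structures described in this section
// (assumption: used here only so the sketch is self-contained).
struct traversing_parameters {
    uint32_t dimension;
    uint32_t stride;
    uint32_t wrap;
};

struct tiling_parameters {
    std::vector<uint32_t> buffer_dimension;
    std::vector<uint32_t> tiling_dimension;
    std::vector<int32_t> offset;
    std::vector<traversing_parameters> tile_traversal;
    int packet_port_id = -1;
    std::vector<uint32_t> boundary_dimension;
};

// Hypothetical configuration: a 64x8 buffer covered by 16x4 tiles.
tiling_parameters make_example_tiling() {
    tiling_parameters tp;
    tp.buffer_dimension = {64, 8};   // dimension 0 is contiguous in memory
    tp.tiling_dimension = {16, 4};   // each tile is 16x4 elements
    tp.offset = {0, 0};              // start at the first buffer element
    tp.tile_traversal = {
        {0, 16, 4},  // innermost loop: dimension 0, stride 16, 4 tiles
        {1, 4, 2},   // outermost loop: dimension 1, stride 4, 2 tiles
    };
    return tp;
}

// Number of tiles visited: the product of all wrap values.
uint64_t tiles_per_transfer(const tiling_parameters& tp) {
    uint64_t n = 1;
    for (const auto& t : tp.tile_traversal) n *= t.wrap;
    return n;
}

// Number of elements in one tile: the product of the tiling dimensions.
uint64_t elements_per_tile(const tiling_parameters& tp) {
    assert(!tp.tiling_dimension.empty());
    uint64_t n = 1;
    for (uint32_t d : tp.tiling_dimension) n *= d;
    return n;
}
```

With this configuration the traversal visits 8 tiles of 64 elements each, which is exactly the 512 elements of the 64x8 buffer, so no padding and no overlap are involved.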
A key member of the tiling parameter is the tile_traversal vector that describes how the buffer will be accessed. The structure traversing_parameters is defined as follows:
struct traversing_parameters
{
  uint32_t dimension;
  uint32_t stride;
  uint32_t wrap;
};

The members of this structure are:

dimension: The buffer dimension to which this traversing loop applies. Depending on the type of memory element, it can be dimension 0, 1, 2, or 3. The stride and wrap members of this structure are applied in the dimension specified.
stride: Distance, in units of the buffer element data type, between consecutive tiles of the inter-tile traversal in this dimension.
wrap: Number of tiles to access in this dimension.
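The traversal order can be sketched as a set of nested loops. The following is a simplified software model, not the code that the compiler or the DMA hardware uses: the hypothetical helper tile_origins lists the starting coordinate of every tile in the order it is visited, with tile_traversal[0] acting as the innermost loop. The structure is a local stand-in that mirrors the definition in this section.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in mirroring the structure described in this section.
struct traversing_parameters {
    uint32_t dimension;
    uint32_t stride;
    uint32_t wrap;
};

// Returns the origin (multidimensional start coordinate) of every tile,
// in visiting order. Loop 0 advances fastest, loop N-1 slowest.
std::vector<std::vector<int64_t>> tile_origins(
    const std::vector<int32_t>& offset,
    const std::vector<traversing_parameters>& tile_traversal) {
    for (const auto& t : tile_traversal) {
        assert(t.dimension < offset.size() && t.wrap > 0);
    }
    std::vector<std::vector<int64_t>> origins;
    std::vector<uint32_t> idx(tile_traversal.size(), 0);  // one counter per loop
    while (true) {
        // Origin = offset + sum over loops of (loop index * stride).
        std::vector<int64_t> origin(offset.begin(), offset.end());
        for (std::size_t i = 0; i < tile_traversal.size(); ++i) {
            origin[tile_traversal[i].dimension] +=
                static_cast<int64_t>(idx[i]) * tile_traversal[i].stride;
        }
        origins.push_back(origin);
        // Advance the counters like an odometer, loop 0 first.
        std::size_t i = 0;
        for (; i < idx.size(); ++i) {
            if (++idx[i] < tile_traversal[i].wrap) break;
            idx[i] = 0;
        }
        if (i == idx.size()) break;  // every loop has wrapped: traversal done
    }
    return origins;
}
```

For example, an offset of {0, 0} with the loops {0, 16, 2} and {1, 4, 2} visits the origins (0, 0), (16, 0), (0, 4), and (16, 4), in that order.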

MM2S ports handle zero padding. When the offset contains negative values, the DMA automatically pads the stream with zeros so that the expected amount of data is sent. The same behavior occurs when the tile_traversal parameters run past the end of the buffer.
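The effect of zero padding on the outgoing stream can be modeled in one dimension. The following is a simplified software model under stated assumptions, not a description of the hardware: the hypothetical function read_tile_zero_padded returns the elements of one tile and substitutes zero for every position that falls before the start or past the end of the buffer. The 32-bit element type is chosen only for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Reads tile_size elements starting at the (possibly negative) offset.
// Positions outside [0, buffer.size()) are filled with zero.
std::vector<int32_t> read_tile_zero_padded(const std::vector<int32_t>& buffer,
                                           int32_t offset,
                                           uint32_t tile_size) {
    assert(tile_size > 0);
    std::vector<int32_t> tile(tile_size, 0);  // start with an all-zero tile
    for (uint32_t i = 0; i < tile_size; ++i) {
        const int64_t src = static_cast<int64_t>(offset) + i;
        if (src >= 0 && src < static_cast<int64_t>(buffer.size())) {
            tile[i] = buffer[static_cast<std::size_t>(src)];
        }
    }
    return tile;
}
```

With a buffer holding 1, 2, 3, 4 and a tile size of 4, an offset of -2 yields 0, 0, 1, 2 (padding before the data), and an offset of 2 yields 3, 4, 0, 0 (padding after the data).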

If the stride value is less than the tile size in one or more dimensions, the tiles overlap in those dimensions.
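The amount of overlap follows directly from the two values. As a small sketch, the hypothetical helper overlap_elements returns how many elements two consecutive tiles share in one dimension, assuming both the tile size and the stride are expressed in buffer elements.

```cpp
#include <cassert>
#include <cstdint>

// Elements shared by two consecutive tiles in one dimension.
// Returns 0 when the stride is at least as large as the tile.
uint32_t overlap_elements(uint32_t tile_size, uint32_t stride) {
    assert(tile_size > 0);
    return tile_size > stride ? tile_size - stride : 0;
}
```

For a tile size of 8 and a stride of 6, consecutive tiles share 2 elements; with a stride of 8 or more they do not overlap.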
Important: All generated addresses are 32-bit aligned. The requirements for accessing data elements of 16 bits or fewer are:
  • 16-bit data are accessed in pairs.
  • 8-bit data are accessed in groups of four.
  • 4-bit data are accessed in groups of eight.
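These group sizes are the number of elements that fit in one 32-bit word. The following sketch shows the arithmetic; the helper names elements_per_word and is_word_aligned are hypothetical, and the check assumes that an offset or size in the contiguous dimension is acceptable when it covers a whole number of 32-bit words.

```cpp
#include <cassert>
#include <cstdint>

// Number of elements of the given width packed into one 32-bit word.
uint32_t elements_per_word(uint32_t element_bits) {
    assert(element_bits > 0 && 32 % element_bits == 0);
    return 32 / element_bits;
}

// True when element_count elements occupy a whole number of 32-bit words.
bool is_word_aligned(uint64_t element_count, uint32_t element_bits) {
    return (element_count * element_bits) % 32 == 0;
}
```

For example, six 16-bit elements fill three words and are aligned, whereas three 16-bit elements end in the middle of a word and are not.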