Multi-dimensional Addressing in AI Engine Kernels - 2024.1 English

AI Engine-ML Kernel and Graph Programming Guide (UG1603)

Document ID: UG1603
Release Date: 2024-06-06
Version: 2024.1 English

The AI Engine API supports linear addressing in AI Engine kernels. Addresses can be adjusted using pointer arithmetic or iterator arithmetic.

Multi-dimensional addressing is supported in AI Engine-ML devices. An aie::tensor_descriptor object maps a multi-dimensional tensor to a 1-D memory space. It is created from an element type, the number of elements that form a vector block within the tensor, and a list of aie::tensor_dim objects, each of which describes one dimension of the tensor as a size-step pair.

For instance, a 3-D volume can be represented with an element type of int8 and 32 elements per block, resulting in segments of the tensor being aie::vector<int8, 32>. The size of each dimension is provided as the first parameter of each aie::tensor_dim, while the increment required to take a step in each dimension is given as the second parameter. This representation allows iteration over a sub-volume of the tensor by adding an extra aie::tensor_dim with a step set to zero and the size set to the desired number of iterations.
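
The size-step mapping described above can be sketched in plain C++ that runs on a host, without the AI Engine toolchain. This is only an illustrative model and not the AI Engine API: ModelDim and linear_offset are hypothetical names, the offset unit is left abstract, and the zero-step dimension is assumed to be appended as the last (outermost) dimension.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical host-side model of one tensor dimension (not aie::tensor_dim).
struct ModelDim {
    unsigned size;  // number of positions along this dimension
    int step;       // increment applied for each position along this dimension
};

// Linear offset of a multi-dimensional index: the sum of index[k] * step[k].
// Returns -1 if the index does not fit the described dimensions.
long linear_offset(const std::vector<ModelDim>& dims,
                   const std::vector<unsigned>& index) {
    if (dims.size() != index.size()) {
        return -1;
    }
    long offset = 0;
    for (std::size_t k = 0; k < dims.size(); ++k) {
        if (index[k] >= dims[k].size) {
            return -1;  // index outside this dimension
        }
        offset += static_cast<long>(index[k]) * dims[k].step;
    }
    return offset;
}
```

In this model, a 4 x 3 x 2 volume with steps 1, 4, and 12 maps index (1, 2, 1) to offset 1 + 8 + 12 = 21. Appending a fourth dimension of size 5 and step 0 leaves that offset unchanged for every value of the fourth index, which is how a zero-step dimension repeats the same sub-volume.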

The AI Engine API introduces tensor buffer streams to support multi-dimensional addressing inside a kernel. The tensor buffer streams are created using aie::make_tensor_buffer_stream, and can be advanced by operator >> or member function pop().

aie::make_tensor_buffer_stream takes a pointer to the data and a tensor descriptor, which describes how the stream advances. Each time the stream advances, it reads and returns a vector.

Tensor descriptors and their associated buffer streams can be composed to arbitrary dimensions, although the underlying mechanisms are built on three-dimensional abstractions. To address this, the tensor buffer streams are recursively defined, decomposing an N-dimensional tensor into (N-1)/3 nested streams, with a final N%3 leaf stream. Accessing an inner stream requires reading the containing outer stream with a .pop() call, which advances the outer stream and returns the inner stream.

For example, a tensor descriptor and a tensor buffer stream are created as follows:

#include <numeric>          // std::iota
#include <aie_api/aie.hpp>  // AI Engine API

// N is the number of int16 elements in the buffer, defined elsewhere.
alignas(aie::vector_decl_align) static int16 dataA[N];
std::iota(dataA, dataA + N, 0);

// Create a tensor descriptor with a single dimension (dimension 0) of size 8 and step 2
auto desc = aie::make_tensor_descriptor<int16, 16>(aie::tensor_dim(8u, 2));

// Create a tensor buffer stream that reads "dataA" as described by the tensor descriptor
auto tbs = aie::make_tensor_buffer_stream(dataA, desc);

aie::vector<int16, 16> v;
tbs >> v;
// Alternatively, pop() can be used
v = tbs.pop();

In the example above, the base element of the stream is aie::vector<int16, 16>. Addressing is specified with one or more aie::tensor_dim objects, each of which gives the size and the step of one dimension. Sizes and steps are counted from the beginning of the tensor buffer, and each element of the tensor buffer is a vector of the type specified in the tensor descriptor.
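
To make the pop() and operator >> behavior concrete, the following is a minimal host-side sketch of a one-dimensional stream. It is not the AI Engine API: ModelStream1D, ModelVector, and kLanes are hypothetical names. The sketch assumes that the step counts whole 16-element vectors, which is one reading of the paragraph above, and that the position wraps to the start after the last element. Neither assumption is confirmed here for the real aie::make_tensor_buffer_stream.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kLanes = 16;                    // elements per vector block
using ModelVector = std::array<std::int16_t, kLanes>; // stands in for aie::vector<int16, 16>

// Hypothetical model of a 1-D tensor buffer stream over int16 data.
class ModelStream1D {
public:
    // size: number of vectors in dimension 0; step: distance between them,
    // counted in whole vectors (an assumption of this model).
    ModelStream1D(const std::int16_t* data, unsigned size, int step)
        : data_(data), size_(size), step_(step) {}

    // Return the vector at the current position, then advance the stream.
    ModelVector pop() {
        const std::ptrdiff_t offset = static_cast<std::ptrdiff_t>(pos_) * step_ *
                                      static_cast<std::ptrdiff_t>(kLanes);
        ModelVector v{};
        for (std::size_t i = 0; i < kLanes; ++i) {
            v[i] = data_[offset + static_cast<std::ptrdiff_t>(i)];
        }
        // Wrap to the start after the last position (model choice).
        pos_ = (size_ == 0) ? 0 : (pos_ + 1) % size_;
        return v;
    }

    // Stream-style read, equivalent to v = pop().
    ModelStream1D& operator>>(ModelVector& v) {
        v = pop();
        return *this;
    }

private:
    const std::int16_t* data_;
    unsigned size_;
    int step_;
    unsigned pos_ = 0;
};
```

With size 8 and step 2 over a buffer filled with 0, 1, 2, and so on, the first read in this model returns elements 0 to 15 and the second returns elements 32 to 47. The buffer must hold at least 240 elements for all eight reads to stay in bounds.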

The addressing of the tensor buffer stream is from the lower dimensions to the higher dimensions.

  • Dimension 0: Addressing starts from element 0 and advances by the step value of dimension 0. After it has advanced the number of times given by the size of dimension 0, the addressing wraps to the next dimension.
    Figure 1. Addressing for Dimension 0
  • Dimension 1: Dimension 1 works like dimension 0. The stream first selects data using the step value of dimension 0. When dimension 0 is exhausted, the addressing wraps, advances by the step value of dimension 1, and again selects data using the step value of dimension 0.

    In the example below, four int16 values are selected by applying a step value of 2: 0, 2, 4, and 6.

    The addressing then advances by the step of dimension 1 and repeats the data selection for the next row of data: 8, 10, 12, and 14.

    Figure 2. Addressing for Dimension 1
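  • Dimension 2: Dimension 2 follows the same pattern. When dimension 1 is exhausted, the addressing advances by the step value of dimension 2 and repeats the dimension 1 pattern, which in turn applies the step values of dimension 0.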
    Figure 3. Addressing for Dimension 2
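
The traversal order described in the list above can be sketched on a host as a small odometer loop in which dimension 0 advances fastest. This is an illustrative model only, not the AI Engine API: ModelDim and traversal_offsets are hypothetical names. The dimension 1 size of 2 and step of 8 are assumptions chosen to reproduce the 0, 2, 4, 6 and 8, 10, 12, 14 rows mentioned in the text.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical host-side model of one tensor dimension (not aie::tensor_dim).
struct ModelDim {
    unsigned size;  // number of positions along this dimension
    int step;       // increment applied for each position along this dimension
};

// List the offsets visited in stream order. dims[0] is dimension 0 and
// advances fastest; when a dimension wraps, the next higher one advances.
std::vector<long> traversal_offsets(const std::vector<ModelDim>& dims) {
    std::vector<long> offsets;
    for (const ModelDim& d : dims) {
        if (d.size == 0) {
            return offsets;  // an empty dimension means nothing to visit
        }
    }
    std::vector<unsigned> index(dims.size(), 0);
    while (true) {
        long offset = 0;
        for (std::size_t k = 0; k < dims.size(); ++k) {
            offset += static_cast<long>(index[k]) * dims[k].step;
        }
        offsets.push_back(offset);

        // Advance like an odometer, starting from dimension 0.
        std::size_t k = 0;
        while (k < dims.size()) {
            if (++index[k] < dims[k].size) {
                break;         // this dimension still has positions left
            }
            index[k] = 0;      // wrap and carry into the next dimension
            ++k;
        }
        if (k == dims.size()) {
            break;             // every dimension wrapped: traversal complete
        }
    }
    return offsets;
}
```

In this model, dimensions (size 4, step 2) and (size 2, step 8) visit offsets 0, 2, 4, 6, 8, 10, 12, 14. Adding a third dimension, for example (size 2, step 16), repeats that two-row pattern starting at offset 16.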

Tensor buffer streams are available in several variants. For more information on tensor descriptors and buffers, see Memory in the AI Engine API User Guide (UG1529).