The AI Engine APIs support linear addressing in AI Engine kernels. The addresses can be adjusted by using a pointer or by using arithmetic operations on iterators.
Multi-dimensional addressing is supported in AIE-ML and AIE-ML v2 devices. An aie::tensor_descriptor object
is used to map a multidimensional tensor to a 1-D memory space. It is created from a
vector base type and a list of aie::tensor_dim pairs formed from
the tensor size (the number of vectors) in each dimension, and a step parameter
indicating how the next component of the tensor is obtained. This representation
allows reiteration over a sub-volume of the tensor by adding an extra
aie::tensor_dim pair with a step of zero and the size set to
the desired number of iterations.
The AI Engine API introduces tensor buffer
streams to support multi-dimensional addressing inside a kernel. The tensor buffer
streams are created using aie::make_tensor_buffer_stream, and can
be advanced by the >> operator or the pop() member
function.
aie::make_tensor_buffer_stream accepts two parameters: a pointer to
the raw data and a tensor descriptor, which describes how data is transferred to the
stream. Each time the stream advances, it reads and returns a vector of the base
type. The tensor descriptors and associated buffer streams can be composed to
arbitrary dimensions, although the underlying mechanisms are built on
three-dimensional abstractions. The tensor buffer streams are recursively defined,
decomposing an N-dimensional tensor into (N-1)/3 nested streams,
with a final N % 3 leaf stream. Accessing an inner stream requires
reading the containing outer stream with a pop() call, which
advances the outer stream and returns the inner stream. For example, a tensor
descriptor and a tensor buffer stream are created as follows:
#include "aie_api/aie.hpp"
#include "aie_api/aie_adf.hpp"
#include "aie_api/utils.hpp"

constexpr unsigned vlen = 4; // vector length of base type
constexpr unsigned tsize = 8; // no. of tensor components for this dimension
constexpr unsigned tstep = 2; // no. of tensor components to step over
constexpr unsigned N = vlen * tsize * tstep; // size of data buffer
using dtype = int32; // data type of base type and buffer

void tbuff() {
    // declare data buffer
    alignas(aie::vector_decl_align) dtype buff[N];
    // initialize buffer contents
    for (unsigned i = 0u; i < N; i++) {
        buff[i] = i;
    }
    // tensor descriptor with base type: <dtype, vlen>
    // tensor dimensions: (tsize, tstep)
    auto desc = aie::make_tensor_descriptor<dtype, vlen>(aie::tensor_dim(tsize, tstep));
    // create tensor buffer stream associating buff with desc
    auto tbs = aie::make_tensor_buffer_stream(buff, desc);
    // show the contents of the tensor buffer stream
    aie::vector<dtype, vlen> v; // vector same as base type
    for (unsigned i = 0u; i < N/(vlen * tstep); i++) { // one pop per selected vector (tsize pops)
        v = tbs.pop(); // "tbs >> v" may also be used
        printf("i = %u:\n ", i);
        aie::print(v, true, "v = ");
    }
} // end tbuff()
The preceding example code shows that the base type of the stream is
aie::vector<int32, 4>. The addressing is specified as
aie::tensor_dim(8, 2). This implies that there are eight
vectors of the base type for this dimension, and the step value of 2 specifies that
vectors with even indices are selected. The starting point for the step is the
beginning of the associated data buffer.
The addressing of the data buffer goes from the lower to the higher dimensions of
aie::tensor_dim. The first pair denotes the lowest
dimension.
Figure: aie::make_tensor_descriptor
Dimension 0
Addressing starts from the first component in the data buffer (at index = 0) and advances by the step value (step0). After the number of vectors given by the dimension 0 size has been read, addressing moves to the next dimension, if one is defined.
For the above code sample, the data buffer contents can be visualized as follows. Each row is one base vector, and an asterisk marks the vectors selected by the step of 2:
index buffer contents
* 0 : 0, 1, 2, 3,
1 : 4, 5, 6, 7,
* 2 : 8, 9, 10, 11,
3 : 12, 13, 14, 15,
* 4 : 16, 17, 18, 19,
5 : 20, 21, 22, 23,
* 6 : 24, 25, 26, 27,
7 : 28, 29, 30, 31,
* 8 : 32, 33, 34, 35,
9 : 36, 37, 38, 39,
* 10 : 40, 41, 42, 43,
11 : 44, 45, 46, 47,
* 12 : 48, 49, 50, 51,
13 : 52, 53, 54, 55,
* 14 : 56, 57, 58, 59,
15 : 60, 61, 62, 63
The buffer contents are shown with four columns because the number of elements in the
base type is vlen = 4.
Running the previous code block shows the following result.
i = 0:
v = 0 1 2 3
i = 1:
v = 8 9 10 11
i = 2:
v = 16 17 18 19
i = 3:
v = 24 25 26 27
i = 4:
v = 32 33 34 35
i = 5:
v = 40 41 42 43
i = 6:
v = 48 49 50 51
i = 7:
v = 56 57 58 59
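The selection shown above can be checked off-target with a small host-side model. The sketch below is plain C++ and does not call the AI Engine API. The helper name popped_first_elements is hypothetical, and the model assumes only what the example states: a single dimension whose vectors are visited starting at vector index 0 and advancing by the step.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side model of a one-dimensional (size, step) walk over a buffer of
// base vectors. Returns the element offset of the first element of each
// vector that would be read. Hypothetical helper, not part of the AIE API.
std::vector<std::size_t> popped_first_elements(std::size_t vlen,
                                               std::size_t size,
                                               std::size_t step,
                                               std::size_t buffer_elems) {
    std::vector<std::size_t> offsets;
    offsets.reserve(size);
    for (std::size_t i = 0; i < size; ++i) {
        const std::size_t vector_index = i * step;            // 0, step, 2*step, ...
        const std::size_t element_offset = vector_index * vlen;
        // Every selected vector must lie entirely inside the buffer.
        assert(element_offset + vlen <= buffer_elems);
        offsets.push_back(element_offset);
    }
    return offsets;
}
```

With vlen = 4, size = 8, step = 2, and a 64-element buffer, the returned offsets are 0, 8, 16, and so on up to 56. Because the example fills the buffer with buff[i] = i, these are also the first values of the vectors in the printed result.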
Dimension 1
The step specified for dimension 1 (step1) is measured from index 0 of the data buffer. Thus, after the number of vectors specified for dimension 0 has been obtained, the first component for dimension 1 is the base vector at the index defined by step1.
Within dimension 1, successive vectors are still separated by step0.
Dimension 2
The step specified for dimension 2 (step2) is measured from index 0 of the data buffer. Thus, after the number of vectors specified for dimension 1 has been obtained, the first component for dimension 2 is the base vector at the index defined by step2.
Within dimension 2, successive vectors are still separated by step0.
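One way to read the rules for dimensions 0 to 2 is that the index of a base vector is the sum of each dimension's position multiplied by that dimension's step. The plain C++ sketch below encodes that reading. It is an interpretation of the text above rather than the API's implementation, base_vector_index is a hypothetical helper, and the sizes and steps in the usage note are illustrative values only.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Index, in base vectors counted from the start of the data buffer, of the
// component at position pos[d] within each dimension d. Assumes that every
// dimension contributes pos[d] * step[d]. Hypothetical helper.
std::size_t base_vector_index(const std::array<std::size_t, 3>& pos,
                              const std::array<std::size_t, 3>& size,
                              const std::array<std::size_t, 3>& step) {
    std::size_t index = 0;
    for (std::size_t d = 0; d < 3; ++d) {
        assert(pos[d] < size[d]); // position must lie inside the dimension
        index += pos[d] * step[d];
    }
    return index;
}
```

For illustrative sizes (4, 4, 2) and steps (1, 4, 16), position (0, 1, 0) maps to index 4, which is step1, position (3, 1, 0) maps to index 7, and position (0, 0, 1) maps to index 16, which is step2.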
Reiterating Over a Sub-Volume
If the buffer needs to be accessed as (0, 1, 2, 3) four times, then (4, 5, 6, 7) four times, and so on until (60, 61, 62, 63) four times, the following code fragment accomplishes this.
#include "aie_api/aie.hpp"
#include "aie_api/aie_adf.hpp"
#include "aie_api/utils.hpp"

constexpr unsigned vlen = 4; // vector length of base type
constexpr unsigned nvec = 16; // no. of base vectors in buffer
constexpr unsigned nrep = 4; // no. of times each base vector is read
constexpr unsigned N = vlen * nvec; // size of data buffer
using dtype = int32; // data type of base type and data buffer

void tbuff_subvol() {
    // declare data buffer
    alignas(aie::vector_decl_align) dtype buff[N];
    // initialize buffer contents
    for (unsigned i = 0u; i < N; i++) {
        buff[i] = i;
    }
    // tensor descriptor with base type: <dtype, vlen>
    // tensor dimensions: (size, step) pairs
    auto desc = aie::make_tensor_descriptor<dtype, vlen>(aie::tensor_dim(nvec, 1), // visit every base vector
                                                         aie::tensor_dim(nrep, 0)  // zero step: repeat each vector 4x
                                                         );
    // create tensor buffer stream associating buff with desc
    auto tbs = aie::make_tensor_buffer_stream(buff, desc);
    // show the contents of the tensor buffer stream at each step increment
    for (unsigned i = 0u; i < nvec * nrep; i++) {
        aie::vector<dtype, vlen> v = tbs.pop();
        printf("i = %u:\n ", i);
        aie::print(v, true, "v = ");
    }
} // end tbuff_subvol()
Running this produces the following result.
i = 0:
v = 0 1 2 3
i = 1:
v = 0 1 2 3
i = 2:
v = 0 1 2 3
i = 3:
v = 0 1 2 3
i = 4:
v = 4 5 6 7
i = 5:
v = 4 5 6 7
i = 6:
v = 4 5 6 7
i = 7:
v = 4 5 6 7
8< --- snip --- >8
i = 56:
v = 56 57 58 59
i = 57:
v = 56 57 58 59
i = 58:
v = 56 57 58 59
i = 59:
v = 56 57 58 59
i = 60:
v = 60 61 62 63
i = 61:
v = 60 61 62 63
i = 62:
v = 60 61 62 63
i = 63:
v = 60 61 62 63
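The repeated access pattern can likewise be modelled on the host. The sketch below is plain C++, not AI Engine API code. The helper repeated_walk is hypothetical, and it treats the zero-step pair as an inner loop that stays on the current vector, which is the behavior the result above shows.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side model of visiting `nvec` consecutive base vectors and reading
// each one `repeats` times. A zero step leaves the position unchanged.
// Returns the base-vector index read at each pop. Hypothetical helper.
std::vector<std::size_t> repeated_walk(std::size_t nvec, std::size_t repeats) {
    assert(repeats > 0);
    const std::size_t visit_step = 1;  // advance to the next base vector
    const std::size_t repeat_step = 0; // zero step: re-read the same vector
    std::vector<std::size_t> indices;
    indices.reserve(nvec * repeats);
    for (std::size_t v = 0; v < nvec; ++v) {
        for (std::size_t r = 0; r < repeats; ++r) {
            indices.push_back(v * visit_step + r * repeat_step);
        }
    }
    return indices;
}
```

Calling repeated_walk(16, 4) returns 64 indices. Multiplying an index by the vector length of 4 gives the first value of the vector read at that pop: pops 0 to 3 read the vector starting at 0, pops 4 to 7 read the vector starting at 4, and pops 60 to 63 read the vector starting at 60.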