The two main vector types offered by the AI Engine API are vectors (aie::vector) and accumulators (aie::accum).
Vector
A vector represents a collection of elements of the same type, which is transparently mapped to the corresponding vector registers supported on AI Engine architectures. Vectors are parameterized by the element type and the number of elements. Any combination that defines a 128-bit, 256-bit, 512-bit, or 1024-bit vector is supported, with 512 bits being the default.
Vector Types | Sizes (number of elements) |
---|---|
int8 | 16/32/64/128 |
int16 | 8/16/32/64 |
int32 | 4/8/16/32 |
uint8 | 16/32/64/128 |
uint16 | 8/16/32/64 |
uint32 | 4/8/16/32 |
float | 4/8/16/32 |
cint16 | 4/8/16/32 |
cint32 | 2/4/8/16 |
cfloat | 2/4/8/16 |
For example, aie::vector<int32,16> is a 16-element vector of 32-bit integers. Each element of the vector is referred to as a lane. Using the smallest bit width necessary can improve performance by making good use of registers.
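The relationship between element type, lane count, and total vector width can be checked with a few lines of standard C++. This is only a sketch that compiles without the AI Engine headers: `vector_bits`, `is_supported_width`, and the stand-in struct `cint16_like` are hypothetical names introduced for illustration and are not part of the AI Engine API.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of the AI Engine API): total width in bits
// of a vector holding `Lanes` elements of type T.
template <typename T, std::size_t Lanes>
constexpr std::size_t vector_bits() {
    return sizeof(T) * 8 * Lanes;
}

// True when the width is one of the sizes the text lists as supported.
constexpr bool is_supported_width(std::size_t bits) {
    return bits == 128 || bits == 256 || bits == 512 || bits == 1024;
}

// Stand-in for cint16 (assumption: two 16-bit fields, no padding).
struct cint16_like {
    std::int16_t real;
    std::int16_t imag;
};

// 16 lanes of int32 give the default 512-bit vector.
static_assert(vector_bits<std::int32_t, 16>() == 512, "int32 x 16 is 512 bits");
// The largest int8 entry in the table is 128 lanes, that is 1024 bits.
static_assert(vector_bits<std::int8_t, 128>() == 1024, "int8 x 128 is 1024 bits");
// The smallest cint16 entry is 4 lanes, that is 128 bits.
static_assert(vector_bits<cint16_like, 4>() == 128, "cint16 x 4 is 128 bits");
// Two lanes of int32 are only 64 bits, which is not in the table.
static_assert(!is_supported_width(vector_bits<std::int32_t, 2>()), "64 bits is unsupported");
```

Every row of the table follows the same rule: the four listed lane counts are the ones that produce 128, 256, 512, and 1024 bits for that element type.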
Complex element types such as cint16, cint32, and cfloat have real and imag members that hold the real and imaginary parts of the data. The real part is stored at the lower address, and the imaginary part is stored at the higher address.
For example:
cint8 ctmp={1,2};
printf("real=%d imag=%d\n",ctmp.real,ctmp.imag); // real=1 imag=2
printf("ctmp mem storage=%llx\n",*(long long*)&ctmp); // ctmp mem storage=201
cint32 ctmp2={3,4};
int32 *p_ctmp2=reinterpret_cast<int32*>(&ctmp2);
printf("real=%d imag=%d\n",p_ctmp2[0],p_ctmp2[1]); // real=3 imag=4
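The same layout check can be written in portable C++ with `std::memcpy`, which avoids reading past the end of the object or relying on pointer casts. The sketch below assumes a stand-in struct, `cint32_like`, with the real field first; it is a hypothetical type for illustration, not the cint32 definition from the AI Engine headers.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Stand-in for cint32 (assumption: two 32-bit fields, real part first).
struct cint32_like {
    std::int32_t real;
    std::int32_t imag;
};

static_assert(sizeof(cint32_like) == 2 * sizeof(std::int32_t),
              "the two fields are expected to be stored back to back");

// Copy the object's bytes into an array of its scalar parts.
// Index 0 holds whatever sits at the lower address, index 1 the higher one.
inline std::array<std::int32_t, 2> raw_parts(const cint32_like& c) {
    std::array<std::int32_t, 2> parts{};
    std::memcpy(parts.data(), &c, sizeof(c));
    return parts;
}
```

Calling `raw_parts(cint32_like{3, 4})` yields the real part at index 0 and the imaginary part at index 1, matching the ordering described above.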
aie::vector and aie::accum have member functions for type casting, data extraction and insertion, and indexing. These operations are covered in the following sections.
Accumulator
An accumulator represents a collection of elements of the same class, typically obtained as the result of a multiplication operation, which is transparently mapped to the corresponding accumulator registers supported on each architecture. Accumulators commonly provide a large number of bits, allowing users to perform long chains of operations whose intermediate results may exceed the range of regular vector types. Accumulators are parameterized by the element type and the number of elements. The native accumulation bits define the minimum number of bits, and the AI Engine API maps each requested type to the nearest native accumulator type that meets the requirement. For example, acc40 maps to acc48 on the AI Engine architecture.
Accumulator type | acc32 | cacc32 | acc40 | cacc40 | acc48 | cacc48 | acc56 | cacc56 | acc64 | cacc64 | acc72 | cacc72 | acc80 | cacc80 | accfloat | caccfloat |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Native accumulation bits | 48 | 48 | 48 | 48 | 48 | 48 | 80 | 80 | 80 | 80 | 80 | 80 | 80 | 80 | 32 | 32 |
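The mapping to native widths, and the headroom an accumulator provides, can be sketched in standard C++. The helpers `native_acc_bits` and `worst_case_products` are hypothetical names introduced here, not AI Engine API calls. The sketch assumes that integer accumulator requests up to 48 bits use the 48-bit native type and requests up to 80 bits use the 80-bit native type, as the table suggests.

```cpp
#include <cstdint>

// Hypothetical helper (not an AI Engine API call): smallest native integer
// accumulator width, in bits, that covers the requested width.
// Returns 0 when the request exceeds 80 bits.
constexpr int native_acc_bits(int requested_bits) {
    if (requested_bits <= 48) return 48;
    if (requested_bits <= 80) return 80;
    return 0;
}

// Number of worst-case int16 x int16 products (magnitude 2^30) that can be
// summed in a signed accumulator of `bits` bits without overflow.
// Valid for widths from 2 to 63 bits, so the shift stays within int64 range.
constexpr std::int64_t worst_case_products(int bits) {
    const std::int64_t max_value = (std::int64_t{1} << (bits - 1)) - 1;
    const std::int64_t max_product = std::int64_t{1} << 30; // (-32768) * (-32768)
    return max_value / max_product;
}

static_assert(native_acc_bits(32) == 48, "acc32 uses the 48-bit native type");
static_assert(native_acc_bits(40) == 48, "acc40 uses the 48-bit native type");
static_assert(native_acc_bits(56) == 80, "acc56 uses the 80-bit native type");
// A 32-bit lane overflows on the second worst-case product.
static_assert(worst_case_products(32) == 1, "one product fits in 32 bits");
// A 48-bit accumulator lane holds 2^17 - 1 of them.
static_assert(worst_case_products(48) == 131071, "131071 products fit in 48 bits");
```

The second helper illustrates why multiplication results are kept in accumulators: a 32-bit vector lane can overflow after two worst-case 16-bit products, while a 48-bit accumulator lane leaves room for long multiply-accumulate chains.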