The AI Engine API supports basic arithmetic operations on two vectors, or on a scalar and a vector (the operation is applied between the scalar and each element of the vector). It also supports adding a scalar or a vector to an accumulator, or subtracting one from it, as well as multiply-accumulate (MAC). These operations include:
- aie::mul - Returns an accumulator with the element-wise product of two vectors, or the product of a vector and a scalar value.
- aie::negmul - Returns an accumulator with the negative of the element-wise product of two vectors, or the negative of the product of a vector and a scalar value.
- aie::mac - Multiplies two vectors (or a vector and a scalar) element-wise and adds the result to an accumulator.
- aie::msc - Multiplies two vectors (or a vector and a scalar) element-wise and subtracts the result from an accumulator.
- aie::add - Returns a vector with the element-wise sum of two vectors, or adds a scalar value to each element of a vector. It can also add a scalar or a vector to an accumulator.
- aie::sub - Returns a vector with the element-wise difference of two vectors, or subtracts a scalar value from each element of a vector. It can also subtract a scalar or a vector from an accumulator.
- aie::saturating_add - Returns a vector with the element-wise sum of two vectors, or adds a scalar value to each element of a vector. The result saturates on overflow.
- aie::saturating_sub - Returns a vector with the element-wise difference of two vectors, or subtracts a scalar value from each element of a vector. The result saturates on overflow.
The vectors and accumulator must have the same size and their types must be compatible. For example:
aie::vector<int32,8> va,vb;
aie::accum<acc64,8> vm=aie::mul(va,vb);
aie::accum<acc64,8> vm2=aie::mul((int32)10,vb);
aie::vector<int32,8> vsub=aie::sub(va,vb);
aie::vector<int32,8> vadd=aie::add(va,vb);
// vsub2[i]=va[i]-10
aie::vector<int32,8> vsub2=aie::sub(va,(int32)10);
// vadd2[i]=10+va[i]
aie::vector<int32,8> vadd2=aie::add((int32)10,va);
aie::accum<acc64,8> vsub_acc=aie::sub(vm,(int32)10);
aie::accum<acc64,8> vsub_acc2=aie::sub(vm,va);
aie::accum<acc64,8> vadd_acc=aie::add(vm,(int32)10);
aie::accum<acc64,8> vadd_acc2=aie::add(vm,vb);
aie::accum<acc64,8> vmac=aie::mac(vm,va,vb);
aie::accum<acc64,8> vmsc=aie::msc(vm,va,vb);
// scalar and vector can switch placement
aie::accum<acc64,8> vmac2=aie::mac(vm,va,(int32)10);
// scalar and vector can switch placement
aie::accum<acc64,8> vmsc2=aie::msc(vm,(int32)10,vb);
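The lane-wise semantics of the calls above can be sketched in portable C++. This is a scalar reference model under stated assumptions, not the AI Engine API itself: the names `Vec`, `Acc`, `ref_broadcast`, `ref_mul`, `ref_mac`, and `ref_msc` are hypothetical, and `std::array` only stands in for `aie::vector<int32,8>` and `aie::accum<acc64,8>`.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar reference model; none of these names belong to the AI Engine API.
constexpr std::size_t kLanes = 8;
using Vec = std::array<std::int32_t, kLanes>;  // stands in for aie::vector<int32,8>
using Acc = std::array<std::int64_t, kLanes>;  // stands in for aie::accum<acc64,8>

// A scalar operand behaves like a vector holding the same value in every lane.
inline Vec ref_broadcast(std::int32_t s) {
    Vec v{};
    v.fill(s);
    return v;
}

// Models aie::mul(a, b): r[i] = a[i] * b[i], widened to 64 bits before multiplying.
inline Acc ref_mul(const Vec& a, const Vec& b) {
    Acc r{};
    for (std::size_t i = 0; i < kLanes; ++i)
        r[i] = static_cast<std::int64_t>(a[i]) * b[i];
    return r;
}

// Models aie::mac(acc, a, b): r[i] = acc[i] + a[i] * b[i].
inline Acc ref_mac(const Acc& acc, const Vec& a, const Vec& b) {
    const Acc p = ref_mul(a, b);
    Acc r{};
    for (std::size_t i = 0; i < kLanes; ++i)
        r[i] = acc[i] + p[i];
    return r;
}

// Models aie::msc(acc, a, b): r[i] = acc[i] - a[i] * b[i].
inline Acc ref_msc(const Acc& acc, const Vec& a, const Vec& b) {
    const Acc p = ref_mul(a, b);
    Acc r{};
    for (std::size_t i = 0; i < kLanes; ++i)
        r[i] = acc[i] - p[i];
    return r;
}

// Usage: mirrors vm2, vmac2 and vmsc2 from the example, with the scalar 10 broadcast.
inline void ref_usage_example() {
    const Vec va{1, 2, 3, 4, 5, 6, 7, 8};
    const Vec ten = ref_broadcast(10);
    const Acc vm = ref_mul(va, ten);        // lane 2 holds 3 * 10 = 30
    assert(ref_mac(vm, va, ten)[2] == 60);  // 30 + 3 * 10
    assert(ref_msc(vm, va, ten)[2] == 0);   // 30 - 3 * 10
}
```

A model like this can serve as a golden reference when checking kernel output in simulation; it says nothing about how the hardware schedules the operations.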
--aie.stacksize=<size (in bytes)> from AI Engine Options in the AI
Engine Tools and Flows User Guide (UG1076).
The following example shows the difference between aie::add and aie::saturating_add on vector addition when the sum overflows the element type.
aie::set_saturation(aie::saturation_mode::saturate);
aie::vector<int16, 16> v1 = aie::broadcast<int16, 16>(20000);
aie::vector<int16, 16> v2 = aie::broadcast<int16, 16>(20000);
aie::vector<int16, 16> result1 = aie::add(v1, v2);
printf("vector + vector = %d\n", result1.get(0));
//output: vector + vector = -25536
aie::vector<int16, 16> result_sat = aie::saturating_add(v1, v2);
printf("vector + vector saturate= %d\n", result_sat.get(0));
//output: vector + vector saturate= 32767
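The two printed values above can be reproduced with ordinary integer arithmetic. The following is a minimal per-lane sketch, assuming 16-bit two's-complement wraparound for the plain addition and clamping for the saturating one; `wrapping_add_i16` and `saturating_add_i16` are hypothetical helper names, not AI Engine API functions.

```cpp
#include <cstdint>

// Hypothetical per-lane model of a wrapping int16 addition.
// The exact sum is computed in 32 bits and then folded back into [-32768, 32767].
constexpr std::int16_t wrapping_add_i16(std::int16_t a, std::int16_t b) {
    const std::int32_t wide = static_cast<std::int32_t>(a) + b;
    const std::int32_t folded = ((wide + 32768) % 65536 + 65536) % 65536 - 32768;
    return static_cast<std::int16_t>(folded);
}

// Hypothetical per-lane model of a saturating int16 addition.
// The exact sum is clamped to the int16 range instead of wrapping.
constexpr std::int16_t saturating_add_i16(std::int16_t a, std::int16_t b) {
    const std::int32_t wide = static_cast<std::int32_t>(a) + b;
    return static_cast<std::int16_t>(wide > 32767 ? 32767 : (wide < -32768 ? -32768 : wide));
}

// 20000 + 20000 = 40000 does not fit in int16:
// wrapping gives 40000 - 65536 = -25536, saturating gives 32767.
static_assert(wrapping_add_i16(20000, 20000) == -25536, "matches the aie::add output");
static_assert(saturating_add_i16(20000, 20000) == 32767, "matches the aie::saturating_add output");
```

The checks run at compile time, so the file only has to compile to confirm that the model agrees with the outputs shown in the example.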
The AI Engine API supports arithmetic operations on a single vector, as well as accumulation of element-wise squares, including:
- aie::abs - Computes the absolute value of each element in the given vector.
- aie::abs_square - Computes the absolute square of each element in the given complex vector.
- aie::conj - Computes the conjugate of each element in the given vector of complex elements.
- aie::neg - For vectors with signed types, returns a vector whose elements are the same as in the given vector but with the sign flipped. If the input type is unsigned, the input vector is returned.
- aie::mul_square - Returns an accumulator of the requested type with the element-wise square of the input vector.
- aie::mac_square - Returns an accumulator with the sum of the given accumulator and the element-wise square of the input vector.
- aie::msc_square - Returns an accumulator with the given accumulator minus the element-wise square of the input vector.
The vector and the accumulator must have the same size and their types must be compatible. For example:
aie::vector<int16,16> va;
aie::vector<cint16,16> ca;
aie::vector<int16,16> va_abs=aie::abs(va);
aie::vector<int32,16> ca_abs=aie::abs_square(ca);
aie::vector<cint16,16> ca_conj=aie::conj(ca);
aie::vector<int16,16> va_neg=aie::neg(va);
aie::accum<acc32,16> va_sq=aie::mul_square(va);
aie::vector<int32,8> vc,vd;
aie::accum<acc64,8> vm=aie::mul(vc,vd);
// vmac3[i]=vm[i]+vc[i]*vc[i];
aie::accum<acc64,8> vmac3=aie::mac_square(vm,vc);
// vmsc3[i]=vm[i]-vd[i]*vd[i];
aie::accum<acc64,8> vmsc3=aie::msc_square(vm,vd);
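As a rough per-lane illustration of the square-based operations above, the sketch below models them with plain integers. All names (`CInt16`, `ref_abs_square`, `ref_mul_square`, `ref_mac_square`, `ref_msc_square`) are hypothetical and not part of the AI Engine API. Results are held in 64 bits so that the model itself cannot overflow; the actual lane widths are those of the vector and accumulator types in the example.

```cpp
#include <cstdint>

// Hypothetical stand-in for one cint16 element.
struct CInt16 {
    std::int16_t real;
    std::int16_t imag;
};

// Models one lane of aie::abs_square: real^2 + imag^2.
constexpr std::int64_t ref_abs_square(CInt16 c) {
    return static_cast<std::int64_t>(c.real) * c.real +
           static_cast<std::int64_t>(c.imag) * c.imag;
}

// Models one lane of aie::mul_square: v * v.
constexpr std::int64_t ref_mul_square(std::int32_t v) {
    return static_cast<std::int64_t>(v) * v;
}

// Models one lane of aie::mac_square: acc + v * v (vmac3[i] in the example).
constexpr std::int64_t ref_mac_square(std::int64_t acc, std::int32_t v) {
    return acc + ref_mul_square(v);
}

// Models one lane of aie::msc_square: acc - v * v (vmsc3[i] in the example).
constexpr std::int64_t ref_msc_square(std::int64_t acc, std::int32_t v) {
    return acc - ref_mul_square(v);
}

static_assert(ref_abs_square(CInt16{3, -4}) == 25, "|3 - 4i|^2 = 9 + 16");
static_assert(ref_mac_square(100, 5) == 125, "100 + 5 * 5");
static_assert(ref_msc_square(100, 5) == 75, "100 - 5 * 5");
```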
Pre-multiplication operations on operands are also supported. On some AIE-ML / AIE-ML v2 architectures, certain operations can be collapsed with the multiplication into a single instruction. For example:
aie::vector<cint16,16> ca,cb;
aie::accum<cacc48,16> acc=aie::mul(aie::op_conj(ca),aie::op_conj(cb));
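To make the arithmetic of the example above concrete, the sketch below models one lane of a complex multiplication in which both operands are conjugated first. It is a scalar illustration only: `CInt16`, `CAcc`, `ref_cmul`, and `ref_mul_conj_conj` are hypothetical names, and the 64-bit fields merely stand in for a complex accumulator lane.

```cpp
#include <cstdint>

// Hypothetical stand-ins for one cint16 element and one complex accumulator lane.
struct CInt16 {
    std::int16_t real;
    std::int16_t imag;
};
struct CAcc {
    std::int64_t real;
    std::int64_t imag;
};

// Plain complex product: (ar + ai*i) * (br + bi*i).
constexpr CAcc ref_cmul(CInt16 a, CInt16 b) {
    return CAcc{static_cast<std::int64_t>(a.real) * b.real -
                    static_cast<std::int64_t>(a.imag) * b.imag,
                static_cast<std::int64_t>(a.real) * b.imag +
                    static_cast<std::int64_t>(a.imag) * b.real};
}

// Product with both operands conjugated first, i.e. conj(a) * conj(b).
// The imaginary parts are negated after widening, so negating -32768 cannot overflow.
constexpr CAcc ref_mul_conj_conj(CInt16 a, CInt16 b) {
    const std::int64_t ar = a.real;
    const std::int64_t ai = -static_cast<std::int64_t>(a.imag);
    const std::int64_t br = b.real;
    const std::int64_t bi = -static_cast<std::int64_t>(b.imag);
    return CAcc{ar * br - ai * bi, ar * bi + ai * br};
}

// (1 + 2i)(3 + 4i) = -5 + 10i, and (1 - 2i)(3 - 4i) = -5 - 10i.
static_assert(ref_cmul(CInt16{1, 2}, CInt16{3, 4}).imag == 10, "plain product");
static_assert(ref_mul_conj_conj(CInt16{1, 2}, CInt16{3, 4}).real == -5, "real part");
static_assert(ref_mul_conj_conj(CInt16{1, 2}, CInt16{3, 4}).imag == -10, "imaginary part");
```

The result equals the conjugate of the plain product, which is the mathematical identity conj(a) * conj(b) = conj(a * b).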
The AI Engine API supports operations natively or through emulation on different data types. Those emulated operations can impact the theoretical performance. For example, the MAC operations of int32 by int32 or cint32 by cint32 are emulated. For more details about emulation on operations, see the AI Engine API User Guide (UG1529).
| Accumulator Type | Maximum Number of Lanes |
|---|---|
| acc32 | 64 |
| cacc32 | 32 |
| acc64 | 32 |
| cacc64 | 16 |
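One possible way to use the table in host-side or test code is a small compile-time lookup, sketched below. The enum `AccumType` and the functions `max_lanes` and `lanes_supported` are hypothetical helpers introduced for illustration, not part of the AI Engine API. The sketch only encodes the table's maximum values; any other constraint on valid lane counts is outside its scope.

```cpp
#include <cstddef>

// Hypothetical tags mirroring the accumulator types listed in the table.
enum class AccumType { acc32, cacc32, acc64, cacc64 };

// Maximum number of lanes per accumulator type, copied from the table.
constexpr std::size_t max_lanes(AccumType t) {
    switch (t) {
        case AccumType::acc32:  return 64;
        case AccumType::cacc32: return 32;
        case AccumType::acc64:  return 32;
        case AccumType::cacc64: return 16;
    }
    return 0;  // unreachable for valid enumerators
}

// True when the requested lane count does not exceed the table's maximum.
constexpr bool lanes_supported(AccumType t, std::size_t lanes) {
    return lanes > 0 && lanes <= max_lanes(t);
}

static_assert(lanes_supported(AccumType::acc64, 8), "acc64 with 8 lanes is within the limit");
static_assert(!lanes_supported(AccumType::cacc64, 32), "cacc64 is limited to 16 lanes");
```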