The AI Engine API supports a form of multi-lane multiplication called sliding multiplication. Multiple lanes perform multiply-accumulate (MAC) operations simultaneously, and each lane's products are summed into an accumulator.
These special multiplication classes are named aie::sliding_mul*. They accept coefficient and data inputs. These classes include:
- aie::sliding_mul_ops
- aie::sliding_mul_x_ops
- aie::sliding_mul_y_ops
- aie::sliding_mul_xy_ops
For more information about these APIs and supported parameters, see the AI Engine API User Guide (UG1529).
For example, the aie::sliding_mul_ops class provides a parameterized multiplication that implements the following compute pattern.
DSX = DataStepX
DSY = DataStepY
CS = CoeffStep
P = Points
L = Lanes
c_s = coeff_start
d_s = data_start
out[0] = coeff[c_s] * data[d_s + 0] + coeff[c_s + CS] * data[d_s + DSX] + ... + coeff[c_s + (P-1) * CS] * data[d_s + (P-1) * DSX]
out[1] = coeff[c_s] * data[d_s + DSY] + coeff[c_s + CS] * data[d_s + DSY + DSX] + ... + coeff[c_s + (P-1) * CS] * data[d_s + DSY + (P-1) * DSX]
...
out[L-1] = coeff[c_s] * data[d_s + (L-1) * DSY] + coeff[c_s + CS] * data[d_s + (L-1) * DSY + DSX] + ... + coeff[c_s + (P-1) * CS] * data[d_s + (L-1) * DSY + (P-1) * DSX]
Parameter | Description |
---|---|
Lanes | Number of output elements. |
Points | Number of data elements used to compute each lane. |
CoeffStep | Step used to select elements from the coeff register. This step is applied to element selection within a lane. |
DataStepX | Step used to select elements from the data register. This step is applied to element selection within a lane. |
DataStepY | Step used to select elements from the data register. This step is applied to element selection across lanes. |
CoeffType | Coefficient element type. |
DataType | Data element type. |
AccumTag | Accumulator tag that specifies the required accumulation bits. The accumulator class (real or complex) must be compatible with the result of multiplying the coefficient and data types. |
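To make the compute pattern and the parameter table concrete, the following is a minimal host-side sketch of the same arithmetic in plain C++. It is not the AI Engine API: `sliding_mul_ref` is a hypothetical helper name, it only handles real `int16` inputs, and it assumes all indices stay inside the input arrays (it does not model any index wrap-around or accumulator saturation the hardware might apply).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Scalar reference model of the compute pattern (hypothetical helper, not part of the AIE API).
// out[lane] = sum over p of coeff[coeff_start + p*CoeffStep]
//                         * data[data_start + lane*DataStepY + p*DataStepX]
template <unsigned Lanes, unsigned Points, int CoeffStep, int DataStepX, int DataStepY,
          std::size_t NCoeff, std::size_t NData>
std::array<std::int64_t, Lanes> sliding_mul_ref(const std::array<std::int16_t, NCoeff>& coeff,
                                                unsigned coeff_start,
                                                const std::array<std::int16_t, NData>& data,
                                                unsigned data_start)
{
    std::array<std::int64_t, Lanes> out{};  // one accumulator per lane, zero-initialized
    for (unsigned lane = 0; lane < Lanes; ++lane) {
        for (unsigned p = 0; p < Points; ++p) {
            // CoeffStep and DataStepX advance within a lane; DataStepY advances across lanes.
            const long c_idx = long(coeff_start) + long(p) * CoeffStep;
            const long d_idx = long(data_start) + long(lane) * DataStepY + long(p) * DataStepX;
            // Assumption of this sketch: indices must stay in range.
            assert(c_idx >= 0 && c_idx < long(NCoeff));
            assert(d_idx >= 0 && d_idx < long(NData));
            out[lane] += std::int64_t(coeff[static_cast<std::size_t>(c_idx)]) *
                         data[static_cast<std::size_t>(d_idx)];
        }
    }
    return out;
}
```

For instance, with coeff = {1, 2, 3, 4}, data = {1, 2, ..., 8}, Lanes = 4, Points = 2, and all steps equal to 1, lane i evaluates to data[i] + 2 * data[i + 1], so the model returns {5, 8, 11, 14}.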
The following figure shows how to use the aie::sliding_mul_ops class and its member function, mul, to perform the sliding multiplication. It also shows how each parameter corresponds to the multiplication.
Besides the aie::sliding_mul* classes, the AI Engine API provides aie::sliding_mul* functions to do sliding multiplication and aie::sliding_mac* functions to do sliding multiplication and accumulation. These functions are helpers that use the aie::sliding_mul*_ops classes internally and are provided for convenience. These include:
- aie::sliding_mul
- aie::sliding_mac
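// Coefficients (16 x int16) and two data vectors (64 x int16 = 1024 bits each).
aie::vector<int16,16> va;
aie::vector<int16,64> vb0,vb1;
// 8 lanes x 8 points: multiply coefficients va[0..7] with data from vb0.
aie::accum<acc48,8> acc = aie::sliding_mul_ops<8, 8, 1, 1, 1, int16, int16, acc48>::mul(va, 0, vb0, 0);
// Accumulate coefficients va[8..15] with data from vb1 into the same accumulator.
acc = aie::sliding_mul_ops<8, 8, 1, 1, 1, int16, int16, acc48>::mac(acc, va, 8, vb1, 0);
// Shift right by 15 bits, convert to int32 lanes, and write the result out.
*outIter++=acc.to_vector<int32>(15);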
/*template<unsigned Lanes, unsigned Points, int CoeffStep = 1, int DataStepX = 1, int DataStepY = DataStepX, AccumElemBaseType AccumTag = accauto, VectorOrOp VecCoeff = void, VectorOrOp VecData = void>
auto sliding_mul (const VecCoeff &coeff, unsigned coeff_start, const VecData &data, unsigned data_start)
*/
aie::vector<cint16,32> data_buff;
aie::vector<cint16,8> coeff_buff;
aie::accum<cacc48,8> acc_buff = aie::sliding_mul<8, 8>(coeff_buff, 0, data_buff, 0);
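The aie::sliding_mul_ops snippet above chains mul and mac so that one accumulator collects two 8-point passes. The sketch below mirrors that arithmetic in plain C++ for host-side checking. The names `sliding_mac_ref` and `mul_then_mac_ref` are hypothetical, and the final step assumes a plain arithmetic right shift with no rounding or saturation, which may differ from the rounding and saturation modes configured on the device.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar stand-in for one 8-lane, 8-point sliding step with unit steps:
// acc[lane] += coeff[coeff_start + p] * data[data_start + lane + p]
template <std::size_t NCoeff, std::size_t NData>
void sliding_mac_ref(std::array<std::int64_t, 8>& acc,
                     const std::array<std::int16_t, NCoeff>& coeff, std::size_t coeff_start,
                     const std::array<std::int16_t, NData>& data, std::size_t data_start)
{
    assert(coeff_start + 8 <= NCoeff);  // 8 points are read from coeff
    assert(data_start + 15 <= NData);   // highest data index is data_start + 7 + 7
    for (std::size_t lane = 0; lane < 8; ++lane)
        for (std::size_t p = 0; p < 8; ++p)
            acc[lane] += std::int64_t(coeff[coeff_start + p]) * data[data_start + lane + p];
}

// Mirrors the mul-then-mac sequence: coefficients 0..7 against vb0, 8..15 against vb1.
std::array<std::int32_t, 8> mul_then_mac_ref(const std::array<std::int16_t, 16>& va,
                                             const std::array<std::int16_t, 64>& vb0,
                                             const std::array<std::int16_t, 64>& vb1,
                                             int shift)
{
    std::array<std::int64_t, 8> acc{};    // mul starts from a zeroed accumulator
    sliding_mac_ref(acc, va, 0, vb0, 0);  // corresponds to ::mul(va, 0, vb0, 0)
    sliding_mac_ref(acc, va, 8, vb1, 0);  // corresponds to ::mac(acc, va, 8, vb1, 0)

    std::array<std::int32_t, 8> out{};
    for (std::size_t i = 0; i < 8; ++i)
        out[i] = static_cast<std::int32_t>(acc[i] >> shift);  // assumption: plain shift
    return out;
}
```

Comparing a kernel's output against a model like this on a few input patterns is one way to confirm that coeff_start and data_start select the intended elements.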
Considerations When Using sliding_mul
The current restriction is:
- Data width <= 1024 bits and coefficient width <= 512 bits
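As a quick sanity check of this restriction, the vector widths used in the examples on this page can be computed at compile time. The sketch below is plain C++ rather than AI Engine code. `vector_bits`, `fits_sliding_mul`, and `cint16_model` are hypothetical names introduced for illustration, and `cint16_model` assumes a complex element made of two 16-bit parts.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: total width in bits of a vector holding `elems` elements of type T.
template <typename T>
constexpr std::size_t vector_bits(std::size_t elems)
{
    return elems * sizeof(T) * 8;
}

// Hypothetical stand-in for a complex 16-bit element (assumed: 16-bit real + 16-bit imaginary).
struct cint16_model {
    std::int16_t real;
    std::int16_t imag;
};

// Encodes the stated restriction: data <= 1024 bits, coefficients <= 512 bits.
constexpr bool fits_sliding_mul(std::size_t data_bits, std::size_t coeff_bits)
{
    return data_bits <= 1024 && coeff_bits <= 512;
}

// 64 x int16 data (1024 bits) with 16 x int16 coefficients (256 bits).
static_assert(fits_sliding_mul(vector_bits<std::int16_t>(64), vector_bits<std::int16_t>(16)),
              "int16 example exceeds the sliding_mul width restriction");
// 32 x complex data (1024 bits) with 8 x complex coefficients (256 bits).
static_assert(fits_sliding_mul(vector_bits<cint16_model>(32), vector_bits<cint16_model>(8)),
              "complex example exceeds the sliding_mul width restriction");
```

Both example configurations sit exactly at the 1024-bit data limit, so doubling either data vector would violate the restriction.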