Multi-Channel Sliding Multiplication - 2024.2 English - UG1603

AI Engine-ML Kernel and Graph Programming Guide (UG1603)

Document ID: UG1603
Release Date: 2024-11-28
Version: 2024.2 English

The AI Engine API supports a multi-channel, multi-lane form of multiplication called multi-channel sliding multiplication. Several channels perform sliding multiplication simultaneously, and the results are added to an accumulator.

These operations are provided by classes named aie::sliding_mul_ch_*, which take coefficient and data inputs. The classes are:

  • aie::sliding_mul_ch_ops
  • aie::sliding_mul_ch_x_ops
  • aie::sliding_mul_ch_y_ops
  • aie::sliding_mul_ch_xy_ops

For more information about these APIs and supported parameters, see the AI Engine API User Guide (UG1529).

The following figure shows how to use the aie::sliding_mul_ch_ops class and its member function, mul, to perform multi-channel sliding multiplication. It also shows how each channel performs sliding multiplication on its data and coefficients.

Note: The data and coefficients are stored in the registers in channel-first order.
Figure 1. sliding_mul_ch_ops Usage Example
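
To make the per-channel behavior concrete, the following plain C++ sketch models multi-channel sliding multiplication with scalar loops. It is not AI Engine code and does not use the AI Engine API. The function names sliding_mul_ch_ref and example_from_text are hypothetical, and the index mapping is an assumption made for illustration: element (position, channel) is taken to sit at index position * Channels + channel, start offsets are counted in positions, and the outputs are laid out the same way. See the AI Engine API User Guide (UG1529) for the authoritative compute pattern.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Scalar reference model of multi-channel sliding multiplication.
// ASSUMPTIONS (illustrative, not taken from the AI Engine API):
//  - channel-first layout: element (pos, ch) is at index pos * channels + ch
//  - coeff_start and data_start are counted in positions, not elements
//  - output (o, ch) is stored at index o * channels + ch
//  - positions wrap around, because the registers are circular
std::vector<std::int32_t> sliding_mul_ch_ref(
    unsigned outputs, unsigned channels, unsigned points,
    int coeff_step, int data_step_x, int data_step_y,
    const std::vector<std::int8_t>& coeff, unsigned coeff_start,
    const std::vector<std::int8_t>& data, unsigned data_start)
{
    const std::int64_t coeff_positions = static_cast<std::int64_t>(coeff.size() / channels);
    const std::int64_t data_positions  = static_cast<std::int64_t>(data.size() / channels);
    // Wrap an index into [0, n), also for negative steps.
    auto wrap = [](std::int64_t i, std::int64_t n) { return ((i % n) + n) % n; };

    std::vector<std::int32_t> acc(static_cast<std::size_t>(outputs) * channels, 0);
    for (unsigned o = 0; o < outputs; ++o) {
        for (unsigned ch = 0; ch < channels; ++ch) {
            for (unsigned p = 0; p < points; ++p) {
                // Coefficient position advances with the point index.
                const std::int64_t c = wrap(
                    static_cast<std::int64_t>(coeff_start) +
                    static_cast<std::int64_t>(p) * coeff_step, coeff_positions);
                // Data position slides with both the output and the point index.
                const std::int64_t d = wrap(
                    static_cast<std::int64_t>(data_start) +
                    static_cast<std::int64_t>(o) * data_step_y +
                    static_cast<std::int64_t>(p) * data_step_x, data_positions);
                acc[o * channels + ch] +=
                    static_cast<std::int32_t>(coeff[static_cast<std::size_t>(c) * channels + ch]) *
                    static_cast<std::int32_t>(data[static_cast<std::size_t>(d) * channels + ch]);
            }
        }
    }
    return acc;
}

// Builds 64 data and 32 coefficient values following the pattern
// value = position + channel (4 channels), then runs the model with
// Outputs=2, Channels=4, Points=8 and all steps equal to 1.
std::vector<std::int32_t> example_from_text()
{
    std::vector<std::int8_t> data(64), coeff(32);
    for (int i = 0; i < 64; ++i) data[i]  = static_cast<std::int8_t>(i / 4 + i % 4);
    for (int i = 0; i < 32; ++i) coeff[i] = static_cast<std::int8_t>(i / 4 + i % 4);
    return sliding_mul_ch_ref(2, 4, 8, 1, 1, 1, coeff, 0, data, 0);
}
```

Under these assumptions, each channel computes its own sliding dot product and never mixes values with another channel. For example, channel 0 of output 0 sums p * p for p = 0..7, which is 140 in this model. Results from the actual hardware operation should be checked against UG1529 rather than against this sketch.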

Besides the aie::sliding_mul_ch*_ops classes, the AI Engine API provides the aie::sliding_mul_ch functions for multi-channel sliding multiplication and the aie::sliding_mac_ch functions for multi-channel sliding multiplication with accumulation. These functions are convenience helpers that use the aie::sliding_mul_ch*_ops classes internally. They are:

  • aie::sliding_mul_ch
  • aie::sliding_mac_ch

The following examples perform multi-channel sliding multiplications (template prototypes are in comments for quick reference).

/*
template<unsigned Outputs, unsigned Channels, unsigned Points,
         int CoeffStep, int DataStepX, int DataStepY,
         ElemBaseType CoeffType, ElemBaseType DataType,
         AccumElemBaseType AccumTag = detail::default_accum_tag_t<CoeffType, DataType>>
struct aie::sliding_mul_ch_ops<Outputs, Channels, Points, CoeffStep, DataStepX, DataStepY, CoeffType, DataType, AccumTag>

template<unsigned Outputs, unsigned Channels, unsigned Points,
         int CoeffStep, int DataStepX, int DataStepY,
         ElemBaseType CoeffType, ElemBaseType DataType,
         AccumElemBaseType AccumTag = detail::default_accum_tag_t<CoeffType, DataType>>
template<VectorOrOp VecCoeff, VectorOrOp VecData>
static constexpr accum_type
aie::sliding_mul_ch_ops<Outputs, Channels, Points, CoeffStep, DataStepX, DataStepY, CoeffType, DataType, AccumTag>::mul(
    const VecCoeff &coeff,
    unsigned coeff_start,
    const VecData &data,
    unsigned data_start)
*/

alignas(aie::vector_decl_align) int8 data[64]={0,1,2,3,1,2,3,4,2,3,4,5,3,4,5,6,4,5,6,7,5,6,7,8,6,7,8,9,7,8,9,10,8,9,10,11,9,10,11,12,10,11,12,13,11,12,13,14,12,13,14,15,13,14,15,16,14,15,16,17,15,16,17,18};
alignas(aie::vector_decl_align) int8 coeff[32]={0,1,2,3,1,2,3,4,2,3,4,5,3,4,5,6,4,5,6,7,5,6,7,8,6,7,8,9,7,8,9,10};
	
aie::vector<int8,32> vcoeff=aie::load_v<32>(coeff);
aie::vector<int8,64> vdata0=aie::load_v<64>(data);

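// Template arguments: Outputs=2, Channels=4, Points=8, CoeffStep=1, DataStepX=1, DataStepY=1
auto acc = aie::sliding_mul_ch_ops<2, 4, 8, 1, 1, 1, int8, int8, acc32>::mul(vcoeff, 0, vdata0, 0);
// mac() adds the same products to the existing accumulator
acc = aie::sliding_mul_ch_ops<2, 4, 8, 1, 1, 1, int8, int8, acc32>::mac(acc, vcoeff, 0, vdata0, 0);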

/*
template<unsigned Outputs, unsigned Channels, unsigned Points,
         int CoeffStep = 1, int DataStepX = 1, int DataStepY = DataStepX,
         AccumElemBaseType AccumTag = accauto,
         VectorOrOp VecCoeff, VectorOrOp VecData>
auto aie::sliding_mul_ch(
    const VecCoeff &coeff,
    unsigned coeff_start,
    const VecData &data,
    unsigned data_start)
*/
auto acc2 = aie::sliding_mul_ch<2, 4, 8, 1, 1, 1>(vcoeff, 0, vdata0, 0);

Note: All registers used in sliding multiplication are treated as circular. An access that runs past the end of a register wraps around to its start.
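
The wrap-around can be illustrated with a small plain C++ sketch. It is not AI Engine code. The helper name circular_positions is hypothetical, and modeling the register as a fixed number of positions indexed modulo its size is an assumption made for illustration.

```cpp
#include <vector>

// Lists the positions visited when `count` consecutive positions are read
// from a circular register that holds `size` positions, starting at `start`.
// Assumption: wrap-around behaves like indexing modulo the register size.
std::vector<unsigned> circular_positions(unsigned start, unsigned count, unsigned size)
{
    std::vector<unsigned> positions;
    positions.reserve(count);
    for (unsigned i = 0; i < count; ++i) {
        positions.push_back((start + i) % size);  // wraps back to 0 after size - 1
    }
    return positions;
}
```

For example, reading 8 positions from a 16-position register starting at position 12 visits 12, 13, 14, 15, 0, 1, 2, 3 in this model. A start offset near the end of a register therefore picks up values from its beginning, so the start offsets and register sizes need to be chosen with this in mind.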