MAC on 32x32 bits - 2022.1 English

AI Engine Kernel Coding Best Practices Guide (UG1079)

The following figure shows how start, offsets, and step work on the cint16 data type.

Figure 1. MAC4 on cint16 x cint16 Type

mac4 has four output lanes. For each lane, the data index of the first column is xstart plus that lane's 4-bit field of xoffsets. Each subsequent column is selected by adding xstep to the index of the previous column. Table 1 shows that the cint16 * cint16 operation performs eight MACs per cycle. Spread across four output lanes, this gives two MACs per lane, so mac4 has two columns of multiplication.

The coefficients of mac4 are selected in the same way, using zstart, zoffsets, and zstep.
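
The selection rule described above can be sketched as a scalar model in plain C++. This is an illustration only, not the AI Engine intrinsic: the names Cplx, mul, select_index, and mac4_model are hypothetical. It assumes that lane 0 uses the lowest 4 bits of the offsets word, that indices count whole cint16 elements, and it ignores any hardware restrictions on buffer size or index wraparound.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a complex integer sample (not the cint16 type).
struct Cplx {
    std::int64_t re;
    std::int64_t im;
};

// Complex multiply: (a.re + j*a.im) * (b.re + j*b.im).
inline Cplx mul(Cplx a, Cplx b) {
    return {a.re * b.re - a.im * b.im, a.re * b.im + a.im * b.re};
}

// Index read by output lane `lane` in multiplication column `col`.
// Assumption: lane 0 owns the lowest 4-bit field of `offsets`.
constexpr int select_index(int start, std::uint32_t offsets, int step,
                           int lane, int col) {
    const int lane_offset = static_cast<int>((offsets >> (4 * lane)) & 0xFu);
    return start + lane_offset + col * step;
}

// Scalar model of a four-lane, two-column multiply-accumulate.
// Data and coefficients use the same start/offsets/step selection rule.
inline std::array<Cplx, 4> mac4_model(
    std::array<Cplx, 4> acc,
    const std::vector<Cplx>& x, int xstart, std::uint32_t xoffsets, int xstep,
    const std::vector<Cplx>& z, int zstart, std::uint32_t zoffsets, int zstep) {
    constexpr int kLanes = 4;    // four output lanes
    constexpr int kColumns = 2;  // 8 MACs per cycle / 4 lanes
    for (int lane = 0; lane < kLanes; ++lane) {
        for (int col = 0; col < kColumns; ++col) {
            const int xi = select_index(xstart, xoffsets, xstep, lane, col);
            const int zi = select_index(zstart, zoffsets, zstep, lane, col);
            // .at() throws if a selected index falls outside the buffer.
            const Cplx p = mul(x.at(static_cast<std::size_t>(xi)),
                               z.at(static_cast<std::size_t>(zi)));
            acc[lane].re += p.re;
            acc[lane].im += p.im;
        }
    }
    return acc;
}
```

Under these assumptions, xstart = 0, xoffsets = 0x3210, and xstep = 1 make lane n read x[n] in the first column and x[n+1] in the second. With zstart = 0, zoffsets = 0x0000, and zstep = 1, every lane multiplies those two samples by z[0] and z[1], which is the access pattern of a two-tap sliding filter.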