An example of a MAC with an int16 X buffer and an int16 Z buffer is as follows. Note
that the permute granularity for the X buffer is 32 bits. The start and step parameters
are always expressed in units of the data type, so a value of 2 for 16-bit data selects
the element 2 * 16 bits away. The xoffsets parameter is specified as pairs of hex values.
The first hex value of a pair is an absolute 32-bit offset and picks up two 16-bit values
(index, index+1) for the even row. The second hex value is an offset relative to the
first value plus 1 (also in 32-bit units) and picks up two 16-bit values for the odd row.
For example, the hex value 0x24 in xoffsets selects indices 8 and 9 for the even row
(32-bit offset 4) and indices 14 and 15 for the odd row (32-bit offset 4 + 2 + 1 = 7)
from xbuff, and the hex value 0x00 in xoffsets selects indices 0 and 1 for the even row
and indices 2 and 3 for the odd row from xbuff.
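
The xoffsets decoding rule described above can be modeled in plain C++. This is only an
illustrative sketch, not the intrinsic itself: the names PairSelection and
decode_xoffsets_pair are hypothetical, the start parameter is assumed to be 0, and the
"first" hex value is taken to be the low digit of the pair, as implied by the 0x24 example.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// 16-bit element indices picked from xbuff by one hex pair of xoffsets.
// Hypothetical helper type for illustration only.
struct PairSelection {
    std::array<int, 2> even_row;
    std::array<int, 2> odd_row;
};

// Decode one pair (two hex digits) of xoffsets, assuming start == 0.
inline PairSelection decode_xoffsets_pair(std::uint8_t pair) {
    const int first = pair & 0xF;          // low digit: absolute 32-bit offset
    const int second = (pair >> 4) & 0xF;  // high digit: relative to first + 1
    const int even32 = first;
    const int odd32 = first + second + 1;
    // Each 32-bit offset covers two consecutive 16-bit elements.
    return PairSelection{{2 * even32, 2 * even32 + 1},
                         {2 * odd32, 2 * odd32 + 1}};
}

// Reproduces the two worked values from the text.
inline void check_documented_examples() {
    const PairSelection a = decode_xoffsets_pair(0x24);
    assert(a.even_row[0] == 8 && a.even_row[1] == 9);
    assert(a.odd_row[0] == 14 && a.odd_row[1] == 15);

    const PairSelection b = decode_xoffsets_pair(0x00);
    assert(b.even_row[0] == 0 && b.even_row[1] == 1);
    assert(b.odd_row[0] == 2 && b.odd_row[1] == 3);
}
```

Calling check_documented_examples() on a host compiler confirms that the model agrees
with the 0x24 and 0x00 examples given in the text.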
There is also an xsquare parameter that performs 16-bit granularity twiddling after the
main permute. For example, the xsquare value 0x2103 (read from the lowest hex digit to
the highest) puts indices 3 and 0 in the even row and indices 1 and 2 in the odd row.
The center of the following figure shows how the xsquare parameter works.
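
The xsquare twiddle can be sketched the same way. The model below assumes that each hex
digit of xsquare, read from lowest to highest, names which of the four 16-bit inputs of a
2x2 square lands in output positions 0 to 3, with positions 0 and 1 forming the even row
and positions 2 and 3 the odd row. The function name apply_xsquare is hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Rearrange four 16-bit lane indices according to xsquare.
// Output position k takes the input named by hex digit k (lowest digit first).
inline std::array<int, 4> apply_xsquare(std::uint16_t xsquare,
                                        const std::array<int, 4>& square) {
    std::array<int, 4> out{};
    for (int pos = 0; pos < 4; ++pos) {
        const int src = (xsquare >> (4 * pos)) & 0xF;
        assert(src < 4 && "each xsquare digit must name one of the four inputs");
        out[pos] = square[src];
    }
    return out;
}

// Reproduces the 0x2103 example from the text.
inline void check_xsquare_example() {
    const std::array<int, 4> out = apply_xsquare(0x2103, {0, 1, 2, 3});
    assert(out[0] == 3 && out[1] == 0);  // even row: indices 3, 0
    assert(out[2] == 1 && out[3] == 2);  // odd row: indices 1, 2
}
```

Under this model, an xsquare value of 0x3210 leaves the four values in their original
order.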
The following figure shows an example of the mac16 intrinsic with int16 and int16
operands. It is used in the matrix vector multiplication and matrix multiplication
example designs in Single Kernel Coding Examples.