An example of MAC with an int16 X buffer and an int16 Z buffer is as follows. Note that the permute granularity for the X buffer is 32 bits.
The start and step parameters are always in terms of data type granularity. To get to a 16-bit index, you need to multiply them by 2.
The xoffsets parameter comes as a pair of hex values. The first hex value is an absolute 32-bit offset and picks up two 16-bit values (index, index+1) in the even row. The second hex value is an offset relative to the first hex value plus 1 (in 32-bit units) and picks up two 16-bit values in the odd row. So the hex value 0x24 in xoffsets, where 4 is the first hex value and 2 is the second, selects indices 8, 9 for the even row and indices 14, 15 for the odd row from xbuff:
even: 2 * 4 -> get indices [8, 9]
odd: 2 * ( 2 + 4 + 1 ) -> get indices [14, 15]
Similarly, the hex value 0x00 in xoffsets selects indices 0, 1 for the even row and indices 2, 3 for the odd row from xbuff.
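The arithmetic for one xoffsets pair can be sketched in plain C++. This is a minimal illustration, not part of the intrinsic API: the helper name decode_xoffsets_pair and the RowIndices struct are hypothetical. It assumes that the first hex value is the low digit of the pair (as the 0x24 calculation implies) and it ignores any contribution from the start parameter.

```cpp
#include <cassert>

// Hypothetical container for the 16-bit indices selected by one xoffsets pair.
struct RowIndices {
    int even[2];  // (index, index+1) picked for the even row
    int odd[2];   // (index, index+1) picked for the odd row
};

// Hypothetical helper: decodes one two-digit hex pair of xoffsets.
// Assumption: the low digit is the "first hex value", the high digit the second.
RowIndices decode_xoffsets_pair(int pair) {
    const int first = pair & 0xF;          // absolute 32-bit offset
    const int second = (pair >> 4) & 0xF;  // offset relative to first, plus 1

    // Multiply 32-bit offsets by 2 to obtain 16-bit indices.
    const int even_base = 2 * first;
    const int odd_base = 2 * (second + first + 1);

    RowIndices r;
    r.even[0] = even_base;
    r.even[1] = even_base + 1;
    r.odd[0] = odd_base;
    r.odd[1] = odd_base + 1;
    return r;
}

// Self-check against the two pairs worked through in the text.
void check_documented_pairs() {
    const RowIndices a = decode_xoffsets_pair(0x24);
    assert(a.even[0] == 8 && a.even[1] == 9);
    assert(a.odd[0] == 14 && a.odd[1] == 15);

    const RowIndices b = decode_xoffsets_pair(0x00);
    assert(b.even[0] == 0 && b.even[1] == 1);
    assert(b.odd[0] == 2 && b.odd[1] == 3);
}
```

Calling check_documented_pairs() from a test program confirms that the sketch reproduces the indices given for 0x24 and 0x00.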
There is another parameter, xsquare, that performs 16-bit granularity twiddling after the main permute. It gives an additional contribution to the index within a 2 by 2 matrix that recurs across the 8x4 matrix compute given by MUL8 in int16 x int16 mode.
For example, the xsquare value 0x2103 (read from the lowest hex digit to the highest) puts indices 3, 0 in the even row and indices 1, 2 in the odd row. How the xsquare parameter works can be seen in the center of the following figure.
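The digit order described for xsquare can also be sketched in plain C++. The helper decode_xsquare and the Square2x2 struct are hypothetical names used only for illustration. The sketch shows only how the four hex digits map onto the 2 by 2 layout, read from the lowest digit to the highest, and does not model how the hardware applies them to the permuted data.

```cpp
#include <cassert>

// Hypothetical 2 by 2 layout holding the four hex digits of xsquare.
struct Square2x2 {
    int even[2];  // the two entries placed in the even row
    int odd[2];   // the two entries placed in the odd row
};

// Hypothetical helper: splits xsquare into hex digits, lowest digit first.
// The two lowest digits fill the even row, the two highest fill the odd row.
Square2x2 decode_xsquare(int xsquare) {
    Square2x2 s;
    s.even[0] = xsquare & 0xF;
    s.even[1] = (xsquare >> 4) & 0xF;
    s.odd[0] = (xsquare >> 8) & 0xF;
    s.odd[1] = (xsquare >> 12) & 0xF;
    return s;
}

// Self-check against the 0x2103 example from the text.
void check_documented_xsquare() {
    const Square2x2 s = decode_xsquare(0x2103);
    assert(s.even[0] == 3 && s.even[1] == 0);
    assert(s.odd[0] == 1 && s.odd[1] == 2);
}
```

For 0x2103 the sketch yields 3, 0 for the even row and 1, 2 for the odd row, matching the example above.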
The following figure is an example of the mac16 intrinsic with int16 and int16 inputs.