AI Engine-ML Intrinsics User Guide (v2024.2)
Overview
Intrinsics for moving values from accumulator data-types to vector data-types.
Moving data from accumulator data-types back to standard vector data-types requires a reduction in precision. For fixed-point arithmetic, an appropriate transformation involving shifting out lower-order bits, rounding and/or saturation can be applied using the SRS family of intrinsics. The shift amount is specified as a parameter (in the range -4 to 59), while rounding and saturation are applied according to the global mode registers of the processor (see Mode Settings).
- Note
- The shift values -4..-2 are unsafe: they only produce a correct result if truncation is selected or saturation against 0 is required.
There are three main variants of the SRS intrinsics, based on the width of the input and output data-types:
- ssrs converts integer
  - 32-bit accumulator data into a corresponding 8-bit vector
  - 64-bit accumulator data into a corresponding 16-bit vector
- lsrs converts integer
  - 32-bit accumulator data into a corresponding 16-bit vector
  - 64-bit accumulator data into a corresponding 32-bit vector
- srs converts floating-point accumulators into a corresponding bfloat16 vector
Both ssrs and lsrs can be prefixed with 'u', in which case the resulting data-type is unsigned.
Example
Using the ssrs intrinsic, the 32 accumulator lanes of a v32acc32 are shifted directly to the 32 output lanes of a v32int8. Each lane is shifted, rounded and saturated separately (depending on the parameters).
As indicated in the name, each SRS intrinsic performs three operations: shifting (down, that is, to the right), saturation and rounding. The first step is to compute saturation.
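As a rough illustration of the saturation step, the sketch below computes the range an output lane would be clamped to for a given output width and signedness. It is a portable scalar C++ model, not the intrinsic itself: `SatRange` and `sat_range` are hypothetical names, and the sketch assumes that plain two's-complement saturation is enabled in the mode registers.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of the intrinsics API): the closed range
// [lo, hi] that an output lane is clamped to.
struct SatRange {
    std::int64_t lo;
    std::int64_t hi;
};

// out_bits:  width of one output lane (8, 16 or 32 for ssrs/lsrs).
// is_signed: false models the 'u'-prefixed variants.
inline SatRange sat_range(int out_bits, bool is_signed) {
    assert(out_bits >= 1 && out_bits <= 32);
    if (is_signed) {
        const std::int64_t hi = (std::int64_t{1} << (out_bits - 1)) - 1;
        return {-hi - 1, hi};  // e.g. -128..127 for 8 bits
    }
    return {0, (std::int64_t{1} << out_bits) - 1};  // e.g. 0..255 for 8 bits
}
```

Under these assumptions, `sat_range(8, true)` gives -128..127 for an ssrs conversion into a v32int8, and `sat_range(8, false)` gives 0..255 for the 'u'-prefixed form.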
The rounding factor is then checked according to the selected rounding mode (see Rounding modes). Finally, the shift is performed and the rounding factor is applied.
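The rounding and shifting step can be sketched for a single lane as follows. This is a scalar model under stated assumptions: only two rounding behaviors are modeled (truncation toward negative infinity, and round-half-up), the enum and function names are hypothetical rather than the hardware mode encodings, and only non-negative shifts are covered.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative subset of rounding behaviors; names are hypothetical.
enum class Rounding { Floor, HalfUp };

// Shift one accumulator lane right by `shift` bits and apply rounding.
// Only 0 <= shift <= 59 is modeled; negative (left) shifts are not.
inline std::int64_t round_shift(std::int64_t acc, int shift, Rounding mode) {
    assert(shift >= 0 && shift <= 59);
    if (shift == 0) return acc;  // no bits are shifted out, nothing to round

    // Arithmetic right shift floors toward negative infinity (guaranteed
    // from C++20, and the behavior of mainstream compilers before that).
    const std::int64_t floored = acc >> shift;
    if (mode == Rounding::Floor) return floored;

    // Inspect the bits that were shifted out: round up when they amount
    // to at least half of one output LSB.
    const std::uint64_t mask = (std::uint64_t{1} << shift) - 1;
    const std::uint64_t dropped = static_cast<std::uint64_t>(acc) & mask;
    const std::uint64_t half = std::uint64_t{1} << (shift - 1);
    return floored + (dropped >= half ? 1 : 0);
}
```

With a shift of 1, an accumulator value of 5 becomes 2 under `Floor` and 3 under `HalfUp`, while -5 becomes -3 and -2 respectively. Checking the dropped bits, rather than adding a constant before shifting, keeps the model free of overflow for 64-bit accumulator values.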
The full srs call then applies the above algorithm to all lanes of a vector and sets the saturation status bit if saturation is triggered.
- Note
- Saturation status is not cleared automatically. If set, it will remain set until the user clears the status bit.
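The whole-vector behavior and the sticky status bit can be pictured with the scalar model below. It is a sketch only, assuming truncation as the rounding mode, signed 8-bit saturation, and non-negative shifts. `Status` and `ssrs_model` are hypothetical stand-ins, not the real status register or the ssrs intrinsic, and `std::clamp` requires C++17.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the processor's saturation status bit.
struct Status {
    bool saturated = false;
};

// Models a 32-lane conversion from 32-bit accumulators to signed 8-bit
// values, with truncation (floor) rounding and 0 <= shift <= 59.
inline std::array<std::int8_t, 32> ssrs_model(
    const std::array<std::int32_t, 32>& acc, int shift, Status& status) {
    assert(shift >= 0 && shift <= 59);
    std::array<std::int8_t, 32> out{};
    for (std::size_t i = 0; i < acc.size(); ++i) {
        // Widen first so that shifts of 32 or more are well defined.
        const std::int64_t shifted = std::int64_t{acc[i]} >> shift;
        const std::int64_t clamped =
            std::clamp<std::int64_t>(shifted, -128, 127);
        if (clamped != shifted) {
            status.saturated = true;  // sticky: set here, never cleared here
        }
        out[i] = static_cast<std::int8_t>(clamped);
    }
    return out;
}
```

In this model, a call that saturates any lane leaves `status.saturated` set, and later calls that do not saturate leave it unchanged. Only an explicit `status.saturated = false;` by the caller clears it, mirroring the note above.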
Modules
- AIE interface
- Floating-point interface
- Size interface