The following figure shows the sub-components of the scalar unit. The scalar unit handles program control (branch, comparison), scalar math operations, non-linear functions, and data type conversions. As on a general-purpose processor, generic C/C++ code can be used.
The register files store inputs and outputs. There are dedicated registers for pointer arithmetic, as well as registers for general-purpose use and configuration. Special registers include stack pointers, circular buffers, and zero-overhead loops. The AI Engine supports two types of scalar elementary non-linear functions: fixed-point and floating-point.
Fixed-point non-linear functions include:
- Sine and cosine
- Absolute value (ABS)
- Count leading zeros (CLZ)
- Comparison to find minimum or maximum (less than (LT)/greater than (GT))
- Square root
- Inverse square root and inverse
Floating-point non-linear functions include:
- Square root
- Inverse square root
- Inverse
- Absolute value (ABS)
- Comparison to find minimum or maximum (less than (LT)/greater than (GT))
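
To make the semantics of a few of the listed functions concrete, the following is a minimal portable C++ sketch, not AI Engine code. The function names (`clz32_ref`, `inv_sqrt_ref`, `min_lt`) are hypothetical, and the choice to report 32 for an all-zero CLZ input is an assumption that the text above does not confirm.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Reference model of count leading zeros (CLZ) on a 32-bit word.
// Assumption: an all-zero input reports 32; the hardware convention
// is not stated in the text.
inline int clz32_ref(std::uint32_t x) {
    int n = 0;
    for (std::uint32_t mask = 0x80000000u; mask != 0 && (x & mask) == 0; mask >>= 1) {
        ++n;
    }
    return n;
}

// Reference model of a floating-point inverse square root.
inline float inv_sqrt_ref(float x) {
    assert(x > 0.0f);  // this sketch does not define zero or negative inputs
    return 1.0f / std::sqrt(x);
}

// Minimum selected with a less-than (LT) comparison.
inline std::int32_t min_lt(std::int32_t a, std::int32_t b) {
    return (a < b) ? a : b;
}
```

A reference model like this can be useful for checking results produced on the target against a host-side computation.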
The arithmetic logic unit (ALU) in the AI Engine manages the following operations with an issue rate of one instruction per cycle.
- 32-bit integer addition and subtraction. The operation has a one-cycle latency.
- Bitwise logical operations on 32-bit integers (BAND, BOR, and BXOR). The operation has a one-cycle latency.
- Integer multiplication: 32 x 32-bit with an output result of 32 bits stored in the R register file. The operation has a three-cycle latency.
- Shift operation: Both left and right shift are supported. The operation has a one-cycle latency.
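
The ALU operations listed above can be modeled in portable C++ as follows. This is a sketch under stated assumptions, not the hardware definition: the helper names are hypothetical, wraparound on overflow is assumed, the 32-bit multiply result is assumed to be the low 32 bits of the full product, and the right shift shown is a logical shift.

```cpp
#include <cassert>
#include <cstdint>

// 32-bit add and subtract, modeled on unsigned arithmetic so that
// overflow wraps instead of being undefined behavior in C++.
inline std::uint32_t add32(std::uint32_t a, std::uint32_t b) {
    return static_cast<std::uint32_t>(a + b);
}
inline std::uint32_t sub32(std::uint32_t a, std::uint32_t b) {
    return static_cast<std::uint32_t>(a - b);
}

// Bitwise logical operations (BAND, BOR, BXOR in the list above).
inline std::uint32_t band32(std::uint32_t a, std::uint32_t b) { return a & b; }
inline std::uint32_t bor32(std::uint32_t a, std::uint32_t b) { return a | b; }
inline std::uint32_t bxor32(std::uint32_t a, std::uint32_t b) { return a ^ b; }

// 32 x 32-bit multiply with a 32-bit result.
// Assumption: the retained bits are the low 32 bits of the 64-bit product.
inline std::uint32_t mul32(std::uint32_t a, std::uint32_t b) {
    return static_cast<std::uint32_t>(static_cast<std::uint64_t>(a) * b);
}

// Left shift and logical right shift. Shift counts of 32 or more are
// rejected because they are undefined in C++.
inline std::uint32_t shl32(std::uint32_t a, unsigned n) {
    assert(n < 32);
    return a << n;
}
inline std::uint32_t shr32(std::uint32_t a, unsigned n) {
    assert(n < 32);
    return a >> n;
}
```

For example, `mul32(0x10000u, 0x10000u)` returns 0 in this model because the product 2^32 does not fit in 32 bits.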
Data type conversion can be done using float2fix and fix2float. This conversion can also support sqrt, inv, and inv_sqrt fixed-point operations.
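
As an illustration of what a float-to-fixed conversion computes, the sketch below models it in portable C++. The helper names `float_to_fix_model` and `fix_to_float_model` are hypothetical and are not the float2fix and fix2float operations themselves. The `frac_bits` parameter, the round-half-away-from-zero behavior of `std::lround`, and the absence of saturation are all assumptions of this sketch.

```cpp
#include <cmath>
#include <cstdint>

// Converts a float to a signed fixed-point integer with `frac_bits`
// fractional bits: q = round(x * 2^frac_bits).
// Assumption: no saturation; the caller keeps the result within 32 bits.
inline std::int32_t float_to_fix_model(float x, int frac_bits) {
    return static_cast<std::int32_t>(std::lround(std::ldexp(x, frac_bits)));
}

// Converts a fixed-point integer back to float: x = q * 2^(-frac_bits).
inline float fix_to_float_model(std::int32_t q, int frac_bits) {
    return std::ldexp(static_cast<float>(q), -frac_bits);
}
```

With 8 fractional bits, 1.5 maps to 384 in this model, and 384 maps back to 1.5.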
Scalar Programming
The compiler and scalar unit let the programmer use standard C data types. The following table shows the standard C data types with their precisions. All types except float and double support signed and unsigned prefixes.
Data Type | Precision | Comment |
---|---|---|
char | 8-bit signed | |
short | 16-bit signed | |
int | 32-bit signed | Native support |
long | 64-bit signed | |
float | 32-bit | |
double | 64-bit | Emulated using the softfloat library. The scalar processor does not contain an FPU. |
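
When the same scalar code is also compiled on a host machine, the host's `long` may not be 64 bits wide (it is 32 bits on 64-bit Windows, for example). One way to keep the widths from the table explicit is to use fixed-width types and check them at compile time. The following is a minimal sketch; the alias names and the `bits_of` helper are hypothetical.

```cpp
#include <climits>
#include <cstdint>

// Hypothetical alias names; the widths follow the table above.
using i8  = std::int8_t;   // char:  8-bit signed
using i16 = std::int16_t;  // short: 16-bit signed
using i32 = std::int32_t;  // int:   32-bit signed
using i64 = std::int64_t;  // long:  64-bit signed

// Number of bits occupied by a type on the compiling platform.
template <typename T>
constexpr int bits_of() {
    return static_cast<int>(sizeof(T) * CHAR_BIT);
}

// Compile-time checks that the platform matches the table.
static_assert(bits_of<i8>() == 8, "char-sized type must be 8-bit");
static_assert(bits_of<i16>() == 16, "short-sized type must be 16-bit");
static_assert(bits_of<i32>() == 32, "int-sized type must be 32-bit");
static_assert(bits_of<i64>() == 64, "long-sized type must be 64-bit");
static_assert(bits_of<float>() == 32, "float is expected to be 32-bit");
static_assert(bits_of<double>() == 64, "double is expected to be 64-bit");
```

If any width differs on the compiling platform, the build fails instead of silently changing arithmetic behavior.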
It is important to remember that control flow statements such as branching are still handled by the scalar unit even in the presence of vector instructions. This concept is critical to maximizing the performance of the AI Engine.
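
The split between scalar control flow and vector work can be pictured with the portable C++ sketch below. It is not AI Engine code: `kLanes`, `add_lanes`, `add_blocks`, and `demo_add_blocks` are hypothetical names, and the fixed-width inner helper only stands in for a single vector instruction. The loop counter update, comparison, and branch in `add_blocks` represent the work that remains on the scalar unit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kLanes = 8;  // hypothetical vector width, for illustration only

// Stand-in for one vector instruction: adds kLanes elements at once.
inline void add_lanes(const std::int32_t* a, const std::int32_t* b, std::int32_t* out) {
    for (std::size_t i = 0; i < kLanes; ++i) {
        out[i] = a[i] + b[i];
    }
}

// The loop counter, compare, and branch here model scalar-unit work;
// only the body models vector work.
inline void add_blocks(const std::int32_t* a, const std::int32_t* b,
                       std::int32_t* out, std::size_t n) {
    assert(n % kLanes == 0);  // this sketch does not handle a partial tail block
    for (std::size_t i = 0; i < n; i += kLanes) {
        add_lanes(a + i, b + i, out + i);
    }
}

// Small usage example: returns the last element of the summed arrays.
inline std::int32_t demo_add_blocks() {
    std::int32_t a[16], b[16], out[16];
    for (int i = 0; i < 16; ++i) {
        a[i] = i;
        b[i] = 10 * i;
        out[i] = 0;
    }
    add_blocks(a, b, out, 16);
    return out[15];  // 15 + 150
}
```

In this model, each pass through the outer loop costs scalar control work regardless of how wide the vector body is, so fewer, larger iterations leave less control overhead per element.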