Stream Data Types - 2024.1 English

AI Engine Kernel and Graph Programming Guide (UG1079)

Document ID: UG1079
Release Date: 2024-06-05
Version: 2024.1 English

Each of the stream data types in Table 1 can be read or written by the AI Engine either as scalars or in vector groups. However, the valid groupings are restricted by the bus data width supported on the AI Engine to programmable logic interface ports and through the stream-switch network. The valid combinations for AI Engine kernels are vector bundles totaling up to 32 bits or 128 bits. The accumulator data types are used only to specify cascade-stream connections between adjacent AI Engines. Their valid groupings are based on the 384-bit wide cascade channel between two processors.
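The bundle sizes above can be turned into element counts with simple division. The following is a minimal sketch in standard C++ (it does not use the AI Engine headers); the helper name elements_per_bundle and the constants are hypothetical, and the element widths are the nominal sizes of the types (for example, cint16 is two 16-bit parts, so 32 bits). It only illustrates the arithmetic and does not state which stream accesses the tools accept.

```cpp
// Bus widths quoted in the text, in bits (names are illustrative only).
constexpr int kNarrowBundleBits = 32;
constexpr int kWideBundleBits = 128;

// Number of whole elements of the given width that fit in one bundle.
// A result of 0 means a single element is wider than the bundle.
constexpr int elements_per_bundle(int element_bits, int bundle_bits) {
    return bundle_bits / element_bits;
}

// Nominal element widths in bits (assumed from the type names).
constexpr int kInt8Bits = 8;
constexpr int kInt16Bits = 16;
constexpr int kInt32Bits = 32;
constexpr int kCint16Bits = 32;  // 16-bit real + 16-bit imaginary
constexpr int kCint32Bits = 64;  // 32-bit real + 32-bit imaginary
constexpr int kFloatBits = 32;

// Compile-time checks of the arithmetic.
static_assert(elements_per_bundle(kInt8Bits, kNarrowBundleBits) == 4,
              "four int8 values fill a 32-bit bundle");
static_assert(elements_per_bundle(kInt16Bits, kWideBundleBits) == 8,
              "eight int16 values fill a 128-bit bundle");
static_assert(elements_per_bundle(kCint16Bits, kWideBundleBits) == 4,
              "four cint16 values fill a 128-bit bundle");
static_assert(elements_per_bundle(kCint32Bits, kNarrowBundleBits) == 0,
              "one cint32 is wider than a 32-bit bundle");
```

For example, elements_per_bundle(kFloatBits, kWideBundleBits) evaluates to 4, so a 128-bit bundle holds four float values.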

Input and output cascade data types can be used with either AI Engine APIs or ADF APIs. AMD recommends AI Engine APIs because they support a larger range of lane counts. To use AI Engine APIs, add #include <aie_adf.hpp> to the kernel source code.

ADF APIs support a limited number of lanes. To use ADF APIs, add #include <adf.h> to the kernel source code. ADF APIs are used for advanced kernel programming with intrinsic calls.

Table 1. Stream Data Types

| Input Stream Types | Output Stream Types |
|---|---|
| input_stream<int8> | output_stream<int8> |
| input_stream<int16> | output_stream<int16> |
| input_stream<int32> | output_stream<int32> |
| input_stream<int64> | output_stream<int64> |
| input_stream<uint8> | output_stream<uint8> |
| input_stream<uint16> | output_stream<uint16> |
| input_stream<uint32> | output_stream<uint32> |
| input_stream<uint64> | output_stream<uint64> |
| input_stream<cint16> | output_stream<cint16> |
| input_stream<cint32> | output_stream<cint32> |
| input_stream<float> | output_stream<float> |
| input_stream<cfloat> | output_stream<cfloat> |

Table 2. Cascade Accumulator Data Types

| Input Cascade Types | Output Cascade Types | Lanes in ADF API (adf.h) | Lanes in AIE API (aie_adf.hpp) |
|---|---|---|---|
| input_cascade<acc48> | output_cascade<acc48> | 8 | 8/16/32/64/128 |
| input_cascade<cacc48> | output_cascade<cacc48> | 4 | 4/8/16/32/64 |
| input_cascade<acc80> | output_cascade<acc80> | 4 | 4/8/16/32/64 |
| input_cascade<cacc80> | output_cascade<cacc80> | 2 | 2/4/8/16/32 |
| input_cascade<accfloat> | output_cascade<accfloat> | 8 | 4/8/16/32 |
| input_cascade<caccfloat> | output_cascade<caccfloat> | 4 | 2/4/8/16 |
| input_cascade<int8> | output_cascade<int8> | 32 | 16/32/64/128 |
| input_cascade<uint8> | output_cascade<uint8> | 32 | 16/32/64/128 |
| input_cascade<int16> | output_cascade<int16> | 16 | 8/16/32/64 |
| input_cascade<int32> | output_cascade<int32> | 8 | 4/8/16/32 |
| input_cascade<cint16> | output_cascade<cint16> | 8 | 4/8/16/32 |
| input_cascade<cint32> | output_cascade<cint32> | 4 | 2/4/8/16 |
| input_cascade<cfloat> | output_cascade<cfloat> | 4 | 2/4/8/16 |
| input_cascade<float> | output_cascade<float> | 8 | 4/8/16/32 |
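
The lane columns of Table 2 can be captured as data and queried when choosing a lane count for a cascade port. The following is a minimal sketch in standard C++ that does not depend on adf.h or aie_adf.hpp. All names in it (CascadeRow, find_row, aie_api_supports, adf_api_supports, adf_groupings_fit_cascade) are hypothetical, and the lane_bits values are assumed nominal widths inferred from the type names, not values stated in this guide. The lane counts themselves are copied from Table 2.

```cpp
#include <string_view>

// Cascade channel width quoted in the text, in bits.
constexpr int kCascadeBits = 384;

struct CascadeRow {
    std::string_view type;
    int lane_bits;      // ASSUMED nominal width of one lane, in bits
    int adf_lanes;      // "Lanes in ADF API (adf.h)" column
    int aie_min_lanes;  // smallest entry of "Lanes in AIE API" column
    int aie_max_lanes;  // largest entry of "Lanes in AIE API" column
};

// Every AIE API entry in Table 2 is a run of consecutive powers of two,
// so only the first and last values are stored.
constexpr CascadeRow kCascadeRows[] = {
    {"acc48", 48, 8, 8, 128},   {"cacc48", 96, 4, 4, 64},
    {"acc80", 80, 4, 4, 64},    {"cacc80", 160, 2, 2, 32},
    {"accfloat", 32, 8, 4, 32}, {"caccfloat", 64, 4, 2, 16},
    {"int8", 8, 32, 16, 128},   {"uint8", 8, 32, 16, 128},
    {"int16", 16, 16, 8, 64},   {"int32", 32, 8, 4, 32},
    {"cint16", 32, 8, 4, 32},   {"cint32", 64, 4, 2, 16},
    {"cfloat", 64, 4, 2, 16},   {"float", 32, 8, 4, 32},
};

constexpr bool is_power_of_two(int n) { return n > 0 && (n & (n - 1)) == 0; }

// Returns the table row for a type name, or nullptr if it is not listed.
constexpr const CascadeRow* find_row(std::string_view type) {
    for (const auto& row : kCascadeRows) {
        if (row.type == type) return &row;
    }
    return nullptr;
}

// True if the lane count appears in the AIE API column for this type.
constexpr bool aie_api_supports(std::string_view type, int lanes) {
    const CascadeRow* row = find_row(type);
    return row != nullptr && is_power_of_two(lanes) &&
           lanes >= row->aie_min_lanes && lanes <= row->aie_max_lanes;
}

// True if the lane count is the single ADF API value for this type.
constexpr bool adf_api_supports(std::string_view type, int lanes) {
    const CascadeRow* row = find_row(type);
    return row != nullptr && lanes == row->adf_lanes;
}

// Checks that every ADF grouping fits in the 384-bit cascade channel,
// under the assumed lane widths above.
constexpr bool adf_groupings_fit_cascade() {
    for (const auto& row : kCascadeRows) {
        if (row.lane_bits * row.adf_lanes > kCascadeBits) return false;
    }
    return true;
}

static_assert(adf_groupings_fit_cascade(), "ADF groupings exceed 384 bits");
static_assert(aie_api_supports("acc48", 16), "16 lanes listed for acc48");
static_assert(!adf_api_supports("acc48", 16), "ADF API lists only 8 lanes");
```

As a usage example, aie_api_supports("cacc80", 32) is true while aie_api_supports("cacc80", 64) is false, which matches the 2/4/8/16/32 entry for cacc80 in Table 2.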