Load and Store Using Buffer Streams - 2025.2 English - UG1603

AI Engine-ML Kernel and Graph Programming Guide (UG1603)

Document ID: UG1603
Release Date: 2025-11-26
Version: 2025.2 English

If iterators are not appropriate because of semantic or performance concerns, you can use buffer streams instead. Unlike iterators, buffer streams follow a stream interface: each read or write operation advances the stream and changes its current state.

Buffer streams are available for several data types:

  • Standard types: (u)int8, (u)int16, (u)int32, bfloat16, and so on.
  • Sparse vectors of standard types.
  • Block floating-point data.
  • Tensors of native types, to handle multidimensional addressing.

Input and Output Buffer Stream

The kernel function signature declares standard buffers using input_buffer and output_buffer. To access such a buffer as a stream, construct a buffer stream on the buffer's data pointer. For example:

void TestFunction(adf::input_buffer<int32,adf::extents<1024>> & bin,
adf::output_buffer<int32,adf::extents<1024>> & bout)
{
  aie::input_buffer_stream<int32,32> in_bufs((int32 *)bin.data());
  aie::output_buffer_stream<int32,32> out_bufs((int32 *)bout.data());
  aie::vector<int32,32> v;

  for(int i=0;i<1024/32;i++)
  {
    in_bufs >> v; // Get the next vector from the buffer bin
    out_bufs << v; // Stores variable v into the next available address in buffer bout
  }
}

Block Floating-Point Buffer Stream

Block floating-point data types have different memory footprints, resulting in different memory alignment.

Table 1. Block Floating-Point Data Types: Memory Footprint for Various Vector Sizes

MX Type | Block (16 values), size in bytes | 64 values, size in 128-bit words | 128 values, size in 128-bit words | 256 values, size in 128-bit words
mx9     | 18                               | 4.5                              | 9                                 | 18
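
The figures in Table 1 follow from simple arithmetic on the 18-byte block size, and the sketch below reproduces them on the host. It is a minimal illustration, not AI Engine code: the helper names (mx9_bytes, mx9_words, mx9_word_aligned) are hypothetical, and the only inputs taken from the table are 16 values per block and 18 bytes per mx9 block.

```cpp
#include <cassert>
#include <cstddef>

// Constants taken from Table 1: one mx9 block encodes 16 values in 18 bytes.
const std::size_t kValuesPerBlock   = 16;
const std::size_t kMx9BytesPerBlock = 18;
const std::size_t kBytesPerWord     = 16;  // one 128-bit word

// Bytes occupied by `values` mx9-encoded values (hypothetical helper).
inline std::size_t mx9_bytes(std::size_t values) {
    assert(values % kValuesPerBlock == 0);  // whole blocks only
    return (values / kValuesPerBlock) * kMx9BytesPerBlock;
}

// Footprint in 128-bit words; fractional for some sizes (for example, 64 values).
inline double mx9_words(std::size_t values) {
    return static_cast<double>(mx9_bytes(values)) / kBytesPerWord;
}

// True when a vector of `values` elements ends on a 128-bit boundary.
inline bool mx9_word_aligned(std::size_t values) {
    return mx9_bytes(values) % kBytesPerWord == 0;
}
```

With these helpers, a 64-value mx9 vector occupies 72 bytes (4.5 words), so consecutive 64-value vectors do not all start on a 128-bit boundary. This is one way to see why plain pointer arithmetic is a poor fit for this data.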

When working with block-floating-point data, it is necessary to meet specific data alignment requirements. The only way to access a buffer and load or store this data is by using a buffer stream.

void MxPassthrough(adf::input_buffer<mx9,adf::extents<32>> & bin,
adf::output_buffer<mx9,adf::extents<32>> & bout)
{
  aie::block_vector_input_buffer_stream<mx9,64> in_bufs((mx9 *)bin.data());
  aie::block_vector_output_buffer_stream<mx9,64> out_bufs((mx9 *)bout.data());
  aie::block_vector<mx9,64> v;

  for(int i=0;i<(32*16)/64;i++)
  {
    in_bufs >> v; // Get the next vector from the buffer bin
    out_bufs << v; // Stores variable v into the next available address in buffer bout
  }
}

Note: In 2025.1, the size specified in the extents of the input_buffer or output_buffer is the number of blocks, each encoding 16 floating-point values. In a future release, the size will instead be given as the number of encoded floating-point values, which must be a multiple of 16. That change will make the extents consistent with the size used by block_vector and the block_vector_[input|output]_buffer_stream API.
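
The loop bound (32*16)/64 in MxPassthrough follows from this convention: the extent counts blocks, while the stream moves whole vectors. A minimal host-side sketch of that calculation follows. The helper name mx_vector_count is hypothetical and is not part of the AIE API.

```cpp
#include <cassert>
#include <cstddef>

// Number of >> or << operations needed to traverse a buffer whose extent is
// given in blocks of 16 values, using vectors of `vector_values` values.
// Hypothetical helper for illustration only.
inline std::size_t mx_vector_count(std::size_t extent_blocks,
                                   std::size_t vector_values) {
    const std::size_t total_values = extent_blocks * 16;
    // The buffer must hold a whole number of vectors.
    assert(vector_values != 0 && total_values % vector_values == 0);
    return total_values / vector_values;
}
```

For the example above, an extent of 32 blocks read with 64-value vectors gives 8 iterations. If the extent were expressed in values instead, as the note anticipates, the multiplication by 16 would no longer apply.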

Load from a Buffer Stream

To load a variable from the buffer, use the >> operator (similar to stream access in C++).

Store to a Buffer Stream

To store a variable to the buffer, use the << operator (similar to stream access in C++).

Both the >> operator and the pop() member function retrieve the next value from the stream and advance the stream's state.
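
To make the advancing behavior concrete, the following host-side model mimics the semantics described above in plain C++. It is only a sketch: ModelInputStream and ModelOutputStream are hypothetical stand-ins, not the AIE API classes, and std::array takes the place of aie::vector.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Host-side model only; NOT the AIE API buffer stream classes.
template <typename T, std::size_t N>
class ModelInputStream {
public:
    explicit ModelInputStream(const T* p) : ptr_(p) {}

    // pop(): return the next N elements and advance the read position.
    std::array<T, N> pop() {
        std::array<T, N> v;
        for (std::size_t i = 0; i < N; ++i) v[i] = ptr_[i];
        ptr_ += N;
        return v;
    }

    // operator>>: same effect as pop(), written in stream style.
    ModelInputStream& operator>>(std::array<T, N>& v) {
        v = pop();
        return *this;
    }

private:
    const T* ptr_;
};

template <typename T, std::size_t N>
class ModelOutputStream {
public:
    explicit ModelOutputStream(T* p) : ptr_(p) {}

    // operator<<: store N elements at the write position and advance it.
    ModelOutputStream& operator<<(const std::array<T, N>& v) {
        for (std::size_t i = 0; i < N; ++i) ptr_[i] = v[i];
        ptr_ += N;
        return *this;
    }

private:
    T* ptr_;
};
```

In this model, two consecutive reads never return the same data, because no separate increment step exists. That is the main difference from an iterator, where dereferencing and advancing are distinct operations.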

Vectors are not converted to sparse vectors internally; therefore, there is no sparse_vector_output_buffer_stream class.

Tensor Buffer Streams

A tensor buffer stream presents a one-dimensional stream over a multidimensional address map. The multidimensional access scheme is described using aie::make_tensor_descriptor<DataType,VectorLength>. For example:

auto desc = aie::make_tensor_descriptor<int16, 32>(
                     aie::tensor_dim(2u, 4),
                     aie::tensor_dim(2u, 2),
                     aie::tensor_dim(2u, 1));

The three aie::tensor_dim(NbElem, Step) parameters describe the multidimensional volume along dimensions 2, 1, and 0, in that order.

NbElem
The number of elements to consider sequentially on the corresponding dimension.
Step
The address increment to apply on the corresponding dimension, relative to the first element.

More detailed information is available in Multi-dimensional Addressing in AI Engine Kernels.
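
As a rough illustration of how a list of (NbElem, Step) pairs maps to a traversal order, the host-side sketch below enumerates the offsets a descriptor would visit. It rests on two assumptions that the text above does not confirm: the first listed dimension is the outermost loop, so dimension 0 varies fastest, and offsets are expressed in whatever unit Step uses. ModelDim and model_offsets are hypothetical names, not AIE API types.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for aie::tensor_dim(NbElem, Step); not the AIE API type.
struct ModelDim {
    std::size_t nb_elem;  // elements to visit along this dimension
    std::size_t step;     // address increment along this dimension
};

// Enumerate visited offsets. dims.front() is treated as the outermost
// dimension and dims.back() as the innermost (assumption of this sketch).
inline std::vector<std::size_t> model_offsets(const std::vector<ModelDim>& dims) {
    std::vector<std::size_t> offsets{0};
    for (const ModelDim& d : dims) {
        std::vector<std::size_t> next;
        next.reserve(offsets.size() * d.nb_elem);
        // Expand every offset found so far by the positions of this dimension.
        for (std::size_t base : offsets) {
            for (std::size_t i = 0; i < d.nb_elem; ++i) {
                next.push_back(base + i * d.step);
            }
        }
        offsets = std::move(next);
    }
    return offsets;
}
```

Under these assumptions, the descriptor shown earlier, with dimensions (2, 4), (2, 2), and (2, 1), visits offsets 0 through 7 in order, which is a contiguous walk. Changing the outer step, for example to (2, 8) followed by (2, 1), yields the strided sequence 0, 1, 8, 9.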

Note: Tensor buffer streams are not available for block floating-point data types (MX9).

Sparse-Vector Buffer Stream

Sparse vectors do not have a fixed length in memory; therefore, they cannot be accessed through standard iterators. Instead, a variant of the buffer stream class can be used to load them:

aie::sparse_vector_input_buffer_stream<Type,Size> StreamName(Type * DataPointer);

Note: Sparse-vector buffer streams are not available for block floating-point data types (MX9).
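
The toy model below illustrates why variable-length records call for a stream rather than an iterator. The encoding is invented for this sketch and is not the AI Engine-ML sparse format: each record is one mask byte covering 8 lanes, followed by one payload byte per set bit. ToySparseReader is a hypothetical class, not part of the AIE API.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Toy encoding (assumption of this sketch, NOT the hardware format):
//   [mask byte][one int8 payload byte per set bit in mask] ...
class ToySparseReader {
public:
    explicit ToySparseReader(const std::uint8_t* p) : ptr_(p) {}

    // Decode the next record into a dense 8-lane vector and advance by
    // 1 + popcount(mask) bytes, so the stride depends on the data itself.
    ToySparseReader& operator>>(std::array<std::int8_t, 8>& dense) {
        const std::uint8_t mask = *ptr_++;
        for (int lane = 0; lane < 8; ++lane) {
            if (mask & (1u << lane)) {
                dense[lane] = static_cast<std::int8_t>(*ptr_++);
            } else {
                dense[lane] = 0;
            }
        }
        return *this;
    }

    // Bytes consumed so far, measured from the start of the buffer.
    std::size_t bytes_consumed(const std::uint8_t* start) const {
        return static_cast<std::size_t>(ptr_ - start);
    }

private:
    const std::uint8_t* ptr_;
};
```

Because the position of record k depends on the masks of all earlier records, it cannot be computed as a fixed multiple of k. A reader that carries its own position, as a stream does, avoids that problem.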