Buffer Port Multicasting - 2023.2 English

AI Engine Kernel and Graph Programming Guide (UG1079)

Document ID: UG1079
Release Date: 2023-12-04
Version: 2023.2 English

The aiecompiler is not limited to one-to-one connections between buffer ports. In some designs, the same output buffer is consumed by several kernels that perform different tasks. You can connect a producer to as many consumers as needed. The aiecompiler automatically infers one MM2S DMA to read the output buffer and as many S2MM DMAs as there are consumers, each writing to its consumer's input buffer.

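private:
adf::kernel mk;               // producer (the "maker")
adf::kernel tk0,tk1,tk2,tk3;  // consumers (the "takers")
...
// The single output buffer port mk.out[0] drives four input buffer ports.
connect net0 ( mk.out[0] , tk0.in[0] );
connect net1 ( mk.out[0] , tk1.in[0] );
connect net2 ( mk.out[0] , tk2.in[0] );
connect net3 ( mk.out[0] , tk3.in[0] );
...
// Each consumer input buffer is sized to 128 samples.
dimensions(tk0.in[0]) = {128};
dimensions(tk1.in[0]) = {128};
dimensions(tk2.in[0]) = {128};
dimensions(tk3.in[0]) = {128};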

Kernel function prototypes:

void tk0(input_buffer<int32, adf::extents<adf::inherited_extent>> & in0,
         output_buffer<int32, adf::extents<OUTPUT_SAMPLE_SIZE>> & out0);
void tk1(input_buffer<int32, adf::extents<adf::inherited_extent>,
                      adf::margin<32>> & in0,
         output_buffer<int32, adf::extents<OUTPUT_SAMPLE_SIZE>> & out0);
void tk2(input_buffer<int32, adf::extents<adf::inherited_extent>,
                      adf::margin<64>> & in0,
         output_buffer<int32, adf::extents<OUTPUT_SAMPLE_SIZE>> & out0);
void tk3(input_buffer<int32, adf::extents<adf::inherited_extent>> & in0,
         output_buffer<int32, adf::extents<OUTPUT_SAMPLE_SIZE>> & out0);
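
The prototypes for tk1 and tk2 declare margins of 32 and 64 samples, so their input buffers are larger than those of tk0 and tk3 even though all four receive the same 128 new samples per iteration. The following is a minimal host-side sketch in plain C++, not AI Engine or ADF code. The class MarginedInput and its members are hypothetical names introduced for illustration. It assumes that a margined buffer holds the margin samples carried over from the end of the previous block, followed by the new samples, and that the margin is zero-filled on the first iteration.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical model of a margined input buffer (plain C++, not the ADF API).
// Assumption: the kernel sees `margin` samples from the end of the previous
// block, followed by `extent` new samples; the first margin is zero-filled.
class MarginedInput {
public:
    MarginedInput(std::size_t extent, std::size_t margin)
        : extent_(extent), margin_(margin), data_(extent + margin, 0) {}

    // Load the next block of `extent` new samples, keeping the tail of the
    // current contents as the new margin.
    void push_block(const std::vector<std::int32_t>& block) {
        if (block.size() != extent_) {
            throw std::invalid_argument("block size must equal the extent");
        }
        // Slide the last `margin_` samples to the front of the buffer.
        for (std::size_t i = 0; i < margin_; ++i) {
            data_[i] = data_[extent_ + i];
        }
        // Place the new samples after the margin.
        for (std::size_t i = 0; i < extent_; ++i) {
            data_[margin_ + i] = block[i];
        }
    }

    // Total number of samples visible to the kernel: extent + margin.
    std::size_t size() const { return data_.size(); }

    std::int32_t operator[](std::size_t i) const { return data_[i]; }

private:
    std::size_t extent_;
    std::size_t margin_;
    std::vector<std::int32_t> data_;
};
```

Under these assumptions, with 128 samples per block, the model gives tk0 and tk3 a buffer of 128 samples, tk1 a buffer of 160 samples, and tk2 a buffer of 192 samples.
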

The input buffers of kernels tk0, tk1, tk2, and tk3 are served at the same time because the output buffer of kernel mk is read only once. Any slight variation in delay comes only from the different AXI4-Stream paths routed from the maker (mk) to the various takers (tk0 through tk3) in the AI Engine array.

Figure 1. One Kernel Serving Four Kernels

In the code, the same kernel output is connected to four different kernel inputs. The aiecompiler automatically adds DMAs between the kernels so that the content of the buf5(d) buffer can be copied to all the other buffers over the AXI4-Stream interconnect network.
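
The copy described above can be pictured with a small host-side model in plain C++. This is only a sketch of the data movement, not AI Engine code and not what the aiecompiler generates. The names multicast, Buffer, kSamples, and kConsumers are hypothetical. The model reads each sample of the producer buffer once, standing in for the single MM2S DMA, and writes it to every consumer buffer, standing in for the S2MM DMAs.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical host-side model, not AI Engine code: one MM2S-style read of
// the producer buffer feeding several S2MM-style writes.
constexpr std::size_t kSamples = 128;  // matches dimensions(...) = {128}
constexpr std::size_t kConsumers = 4;  // tk0, tk1, tk2, tk3

using Buffer = std::array<std::int32_t, kSamples>;

// Read each sample of `src` exactly once and write it to every consumer.
// Returns the number of reads performed on the producer buffer.
std::size_t multicast(const Buffer& src, std::vector<Buffer>& consumers) {
    std::size_t reads = 0;
    for (std::size_t i = 0; i < kSamples; ++i) {
        const std::int32_t word = src[i];  // single read (MM2S side)
        ++reads;
        for (Buffer& dst : consumers) {
            dst[i] = word;                 // one write per consumer (S2MM side)
        }
    }
    return reads;
}
```

In this model the number of reads of the producer buffer stays at 128 regardless of how many consumers are attached, which mirrors the statement that the output buffer of mk is read only once.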