The AI Engine compiler does not limit buffer ports to one-to-one connections. In certain circumstances, the same output buffer must feed several kernels that each perform a different task, so you can connect one producer to as many consumers as needed. The AI Engine compiler automatically infers one MM2S DMA to read the output buffer and one S2MM DMA per consumer to write to that consumer's input buffer.
private:
adf::kernel mk;
adf::kernel tk0,tk1,tk2,tk3;
...
connect net0 ( mk.out[0] , tk0.in[0] );
connect net1 ( mk.out[0] , tk1.in[0] );
connect net2 ( mk.out[0] , tk2.in[0] );
connect net3 ( mk.out[0] , tk3.in[0] );
...
dimensions(tk0.in[0]) = {128};
dimensions(tk1.in[0]) = {128};
dimensions(tk2.in[0]) = {128};
dimensions(tk3.in[0]) = {128};
Kernel function prototypes:
void tk0(input_buffer<int32, adf::extents<adf::inherited_extent>> & in0,
         output_buffer<int32, adf::extents<OUTPUT_SAMPLE_SIZE>> & out0);
void tk1(input_buffer<int32, adf::extents<adf::inherited_extent>,
                      adf::margin<32>> & in0,
         output_buffer<int32, adf::extents<OUTPUT_SAMPLE_SIZE>> & out0);
void tk2(input_buffer<int32, adf::extents<adf::inherited_extent>,
                      adf::margin<64>> & in0,
         output_buffer<int32, adf::extents<OUTPUT_SAMPLE_SIZE>> & out0);
void tk3(input_buffer<int32, adf::extents<adf::inherited_extent>> & in0,
         output_buffer<int32, adf::extents<OUTPUT_SAMPLE_SIZE>> & out0);
The input buffers of kernels tk0, tk1, tk2, and tk3 are served at the same time because the output buffer of kernel mk is read only once. The slight variation in delay is due only to the different AXI4-Stream paths routed from the maker (mk) to the various takers (tk0 to tk3) in the AI Engine array.
Figure 1. One Kernel Serving Four Kernels
In the code, the same kernel output is connected to four different kernel inputs. The AI Engine compiler adds DMAs between the kernels so that the content of the buf5(d) buffer can be copied into the other buffers over the AXI4-Stream interconnect network.
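The copy pattern described above can be illustrated with a small host-side model. This is a sketch in plain C++, not ADF code, and it does not represent what the compiler generates. The names Buffer, CopyCounts, and broadcast_copy are hypothetical and exist only for this illustration. The model assumes one read pass over the source buffer (the MM2S role) whose samples are written to every destination buffer (the S2MM role), and it ignores the margins declared on tk1 and tk2.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model types, not part of the ADF API.
using Buffer = std::vector<std::int32_t>;

struct CopyCounts {
    std::size_t source_reads = 0;  // samples read from the producer buffer
    std::size_t dest_writes = 0;   // samples written across all consumer buffers
};

// Reads each source sample once and writes it to every destination buffer.
// Assumption: all destination buffers have the same size as the source.
inline CopyCounts broadcast_copy(const Buffer& src, std::vector<Buffer>& dsts) {
    for (const Buffer& d : dsts) {
        assert(d.size() == src.size());
    }
    CopyCounts counts;
    for (std::size_t i = 0; i < src.size(); ++i) {
        const std::int32_t sample = src[i];  // single read of the source sample
        ++counts.source_reads;
        for (Buffer& d : dsts) {
            d[i] = sample;                   // one write per consumer
            ++counts.dest_writes;
        }
    }
    return counts;
}
```

With a 128-sample source and four destinations, as in the graph above, this model performs 128 source reads and 512 destination writes, which mirrors the point that the output buffer of mk is read only once regardless of the number of consumers.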