The streamNToOne API
collects data from multiple processing units into a single stream.
Three algorithms are implemented: RoundRobinT,
LoadBalanceT, and TagSelectT.
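
As an illustration of how two of these collection policies differ, the following is a minimal software sketch in plain C++. It is not the library's HLS implementation: the function names collectRoundRobin and collectTagSelect are hypothetical, std::deque stands in for a hardware stream, and the exact ordering rules are assumptions made for this sketch. A load-balancing policy, which would take data from whichever input is ready, is not modeled here because readiness depends on run-time timing.

```cpp
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical software model, not the library API.
// Round-robin: visit the inputs in a fixed order and take one element from
// each non-empty input per pass, until every input is drained.
std::vector<int> collectRoundRobin(std::vector<std::deque<int>> inputs) {
    std::vector<int> out;
    bool anyLeft = true;
    while (anyLeft) {
        anyLeft = false;
        for (std::size_t i = 0; i < inputs.size(); ++i) {
            if (!inputs[i].empty()) {
                out.push_back(inputs[i].front());
                inputs[i].pop_front();
                anyLeft = true;
            }
        }
    }
    return out;
}

// Hypothetical software model, not the library API.
// Tag-select: a separate sequence of tags names the input to read next.
// Tags that are out of range or point at an empty input are skipped
// (an assumption of this sketch).
std::vector<int> collectTagSelect(std::vector<std::deque<int>> inputs,
                                  const std::vector<std::size_t>& tags) {
    std::vector<int> out;
    for (std::size_t t : tags) {
        if (t < inputs.size() && !inputs[t].empty()) {
            out.push_back(inputs[t].front());
            inputs[t].pop_front();
        }
    }
    return out;
}
```

In this model, round-robin fixes the output order by input index, while tag-select lets the producer of the tags decide the order.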
To sustain throughput, FPGA data paths commonly pass a vector of elements
per transfer, so streamNToOne supports element-vector output when the
data elements are passed in the form of ap_uint.
It also offers an overload on a generic template type for non-vector output.
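
The idea of carrying several elements in one wide word can be sketched in plain C++ as follows. This is a hedged illustration, not the library's code: packElements is a hypothetical helper, std::uint64_t stands in for a wide ap_uint, and the 16-bit element width, the four-elements-per-word ratio, and the least-significant-first placement are all assumptions of this sketch.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model: pack up to four 16-bit elements into one 64-bit word.
// The first element of each group lands in the least significant 16 bits;
// a final partial group is zero-padded in the upper bits.
std::vector<std::uint64_t> packElements(const std::vector<std::uint16_t>& elems) {
    constexpr std::size_t kPerWord = 4;   // elements per output word (assumed)
    constexpr unsigned kElemBits = 16;    // bits per element (assumed)
    std::vector<std::uint64_t> words;
    for (std::size_t i = 0; i < elems.size(); i += kPerWord) {
        std::uint64_t w = 0;
        for (std::size_t j = 0; j < kPerWord && i + j < elems.size(); ++j) {
            w |= static_cast<std::uint64_t>(elems[i + j]) << (kElemBits * j);
        }
        words.push_back(w);
    }
    return words;
}
```

Each output word moves four elements in a single transfer, which is the throughput benefit that vector output is meant to provide.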