Stream output allows computed data samples to be sent directly over the stream, without requiring a ping-pong window buffer. As a result, memory use and latency are reduced. Furthermore, streaming output allows data samples to be broadcast to multiple destinations.
To maximize throughput, FIRs can be configured with 2 output stream ports. However, this may not improve performance if throughput is limited by other factors, such as the input stream bandwidth or the vector processor.
Set the TP_NUM_OUTPUTS template parameter to 2 to create a FIR kernel with 2 output stream ports.
In this scenario, the output data is split across the two streams in chunks of 128 bits. For example, for the cint16 data type (32 bits per sample, so 4 samples per chunk):
- samples 0-3 are sent over output stream 0,
- samples 4-7 are sent over output stream 1.
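
The chunking described above can be illustrated with a minimal host-side sketch in plain C++. It is not library code: the helper names `samplesPerChunk` and `streamForSample` are hypothetical, and the sketch assumes that 128-bit chunks keep alternating between the two streams beyond the first eight samples, which the example above does not state explicitly.

```cpp
#include <cassert>
#include <cstddef>

// Width of one chunk sent to a single output stream, as described in the text.
constexpr std::size_t kChunkBits = 128;

// cint16 holds a 16-bit real part and a 16-bit imaginary part: 32 bits per sample.
constexpr std::size_t kCint16Bits = 32;

// Hypothetical helper: number of samples that fit into one 128-bit chunk.
inline std::size_t samplesPerChunk(std::size_t sampleBits) {
    assert(sampleBits > 0 && kChunkBits % sampleBits == 0);
    return kChunkBits / sampleBits;
}

// Hypothetical helper: index (0 or 1) of the output stream carrying a sample.
// Assumption: chunks alternate stream 0, stream 1, stream 0, ... indefinitely.
inline unsigned streamForSample(std::size_t sampleIndex, std::size_t sampleBits) {
    const std::size_t chunkIndex = sampleIndex / samplesPerChunk(sampleBits);
    return static_cast<unsigned>(chunkIndex % 2);
}
```

With `kCint16Bits`, `samplesPerChunk` yields 4, so `streamForSample` maps samples 0-3 to stream 0 and samples 4-7 to stream 1, matching the example above. A consumer reading both streams would need to apply the same mapping to restore the original sample order.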