Stream output sends computed data samples directly over the stream, without requiring a ping-pong window buffer. As a result, memory use and latency are reduced. Stream output also allows data samples to be broadcast to multiple destinations.
To maximize throughput, FIRs can be configured with two output stream ports. However, this might not improve performance if throughput is limited by other factors, such as the input stream bandwidth or the vector processor.
To create a FIR kernel with two output stream ports, set the TP_NUM_OUTPUTS template parameter to 2.
In this scenario, the output data is split between the two streams in 128-bit chunks. For example, for the cint16 data type:
- samples 0-3 are sent over output stream 0,
- samples 4-7 are sent over output stream 1.
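
The chunking can be illustrated with a minimal host-side sketch. It is not library code: `cint16_sketch`, `samplesPerChunk`, `streamForSample`, and `splitAcrossStreams` are hypothetical names introduced here for illustration. The sketch also assumes that chunks keep alternating between the two streams beyond sample 7 (that is, samples 8-11 return to stream 0), which the example above suggests but does not state.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for the cint16 data type:
// 16-bit real part + 16-bit imaginary part = 32 bits per sample.
struct cint16_sketch {
    std::int16_t real;
    std::int16_t imag;
};
static_assert(sizeof(cint16_sketch) == 4, "cint16 stand-in must be 32 bits wide");

// Number of samples of type T that fit into one 128-bit chunk.
template <typename T>
constexpr std::size_t samplesPerChunk() {
    return 128 / (8 * sizeof(T));
}

// Index (0 or 1) of the output stream that carries a given sample.
// Assumption: chunks alternate between the streams, starting with stream 0.
template <typename T>
constexpr unsigned streamForSample(std::size_t sampleIndex) {
    return static_cast<unsigned>((sampleIndex / samplesPerChunk<T>()) % 2);
}

// Splits a sample sequence into the two per-stream sequences
// (first = output stream 0, second = output stream 1).
template <typename T>
std::pair<std::vector<T>, std::vector<T>> splitAcrossStreams(const std::vector<T>& samples) {
    std::pair<std::vector<T>, std::vector<T>> streams;
    for (std::size_t i = 0; i < samples.size(); ++i) {
        if (streamForSample<T>(i) == 0) {
            streams.first.push_back(samples[i]);
        } else {
            streams.second.push_back(samples[i]);
        }
    }
    return streams;
}
```

For cint16, `samplesPerChunk` evaluates to 4, so `streamForSample` maps samples 0-3 to stream 0 and samples 4-7 to stream 1, matching the list above. A consumer that reads both streams would need to reassemble the samples in the same chunk order.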