With a stream-based API, the architecture stores data samples in internal vector registers instead of window buffers, which removes the limitations of the equivalent window-based architecture.
However, the internal vector register is only 1024 bits wide, which greatly limits the number of data samples each FIR kernel can operate on.
In addition, when a Decimation FIR is used, the decimation factor affects the storage capacity of the data register.
As a result, the number of taps each AIE kernel can process is limited by the capacity of the input vector register and depends on several factors, such as data type, coefficient type, and decimation factor.
To help find the number of FIR kernels required (or desired) to implement the requested FIR length, refer to the helper functions described below: Minimum Cascade Length and Optimum Cascade Length.
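
To give a feel for how the 1024-bit register constrains the tap count, the following is a minimal back-of-the-envelope sketch in C++14, not the library's helper functions. The names `samplesInRegister`, `maxTapsPerKernel`, and `estimateCascadeLength` are hypothetical. The capacity model is an assumption made for illustration: producing `lanes` outputs in parallel at decimation factor `D` is taken to need `(lanes - 1) * D + taps` samples in the register at once. The library's actual limits may differ and also depend on the coefficient type, which this sketch ignores.

```cpp
#include <cstddef>

// Width of the internal vector register, as stated in the text above.
constexpr std::size_t kRegisterBits = 1024;

// Number of data samples of the given width that fit in the register,
// e.g. a 32-bit sample (such as cint16) gives 1024 / 32 = 32 samples.
constexpr std::size_t samplesInRegister(std::size_t sampleBits) {
    return kRegisterBits / sampleBits;
}

// ASSUMPTION (illustrative model only, not the library's formula):
// computing `lanes` outputs in parallel at decimation factor D requires
// (lanes - 1) * D + taps samples to be held in the register together.
// Returns 0 when the configuration leaves no room for any taps.
constexpr std::size_t maxTapsPerKernel(std::size_t sampleBits,
                                       std::size_t lanes,
                                       std::size_t decimateFactor) {
    if (lanes == 0) {
        return 0;
    }
    const std::size_t capacity = samplesInRegister(sampleBits);
    const std::size_t span = (lanes - 1) * decimateFactor;
    return capacity > span ? capacity - span : 0;
}

// Hypothetical estimate of the number of cascaded kernels needed to cover
// firLength taps: the ceiling of firLength / taps-per-kernel.
// Returns 0 when no kernel can hold any taps under the model above.
constexpr std::size_t estimateCascadeLength(std::size_t firLength,
                                            std::size_t sampleBits,
                                            std::size_t lanes,
                                            std::size_t decimateFactor) {
    const std::size_t perKernel =
        maxTapsPerKernel(sampleBits, lanes, decimateFactor);
    return perKernel == 0 ? 0 : (firLength + perKernel - 1) / perKernel;
}
```

Under this model, raising the decimation factor from 2 to 3 for 32-bit samples and 8 lanes reduces the taps available per kernel, so the same FIR length needs a longer cascade. For real designs, use the library's own helper functions rather than this estimate.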