The number of input and output ports created by a FIR is given by the following formulas:
- Number of input ports:
NUM_INPUT_PORTS = TP_PARA_DECI_POLY x TP_SSR x (TP_DUAL_IP + 1)
- Number of output ports:
NUM_OUTPUT_PORTS = TP_PARA_INTERP_POLY x TP_SSR x TP_NUM_OUTPUTS
Therefore, the maximum throughput achievable for a given data type and clock rate, e.g., cint16 and a 1 GHz AIE clock, can be estimated with:
- Maximum theoretical sample rate at input:
THROUGHPUT_IN = NUM_INPUT_PORTS x 1 GSa/s
- Maximum theoretical sample rate at output:
THROUGHPUT_OUT = NUM_OUTPUT_PORTS x 1 GSa/s
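The port and throughput formulas above can be evaluated at compile time. The following is a minimal sketch, not library code: the `FirPortConfig` struct and the helper function names are hypothetical, and the per-port rate of 1 GSa/s is the cint16 at 1 GHz figure assumed in the text.

```cpp
// Hypothetical helper struct; the fields mirror the template parameters
// named in the formulas above. It is not part of the FIR library API.
struct FirPortConfig {
    unsigned para_deci_poly;    // TP_PARA_DECI_POLY
    unsigned para_interp_poly;  // TP_PARA_INTERP_POLY
    unsigned ssr;               // TP_SSR
    unsigned dual_ip;           // TP_DUAL_IP
    unsigned num_outputs;       // TP_NUM_OUTPUTS
};

// NUM_INPUT_PORTS = TP_PARA_DECI_POLY x TP_SSR x (TP_DUAL_IP + 1)
constexpr unsigned numInputPorts(const FirPortConfig& c) {
    return c.para_deci_poly * c.ssr * (c.dual_ip + 1);
}

// NUM_OUTPUT_PORTS = TP_PARA_INTERP_POLY x TP_SSR x TP_NUM_OUTPUTS
constexpr unsigned numOutputPorts(const FirPortConfig& c) {
    return c.para_interp_poly * c.ssr * c.num_outputs;
}

// Theoretical throughput in GSa/s. The default per-port rate (1 GSa/s)
// is the assumption stated in the text for cint16 at a 1 GHz AIE clock.
constexpr double throughputGsps(unsigned ports, double per_port_gsps = 1.0) {
    return ports * per_port_gsps;
}

// Example: TP_SSR = 2, single input and output per path, no polyphase split.
static_assert(numInputPorts(FirPortConfig{1, 1, 2, 0, 1}) == 2,
              "two input ports expected");
static_assert(numOutputPorts(FirPortConfig{1, 1, 2, 0, 1}) == 2,
              "two output ports expected");
```

With this configuration, both `throughputGsps(2)` calls give an upper bound of 2 GSa/s at input and output. These are theoretical limits only; measured throughput depends on the kernel implementation.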
AIE Tile Utilization Ratio
A Super Sample Rate operation creates multiple computation paths that are used to produce the output samples. Having multiple computation paths reduces the amount of computation required by each kernel.
The total number of FIR computation paths can be described with the following formula:
NUMBER_OF_COMPUTATION_PATHS = TP_CASC_LEN * TP_SSR * TP_PARA_INTERP_POLY * TP_PARA_DECI_POLY
The FIR graph will try to split the requested FIR workload among the FIR kernels equally, which can mean that each kernel is tasked with a comparatively low computational effort.
In such a scenario, the bandwidth will be limited by the number of ports, but the AIE tile utilization ratio (often defined as the ratio of VMAC operations to cycles without VMAC operation) might be reduced.
For example, a 32-tap Single Rate FIR operating on a cint16 data type and int16 coefficients, with TP_SSR set to 2 and a cascade length TP_CASC_LEN set to 2, will perform at a bandwidth close to 2 GSa/s (2 output stream paths). Each of the kernels will be tasked with computing only eight coefficients. The design will use eight FIR kernels mapped to eight AIE tiles to achieve that.
However, a similarly configured FIR, a 32-tap Single Rate FIR operating on a cint16 data type and int16 coefficients with TP_SSR set to 2, but without further cascade configuration (TP_CASC_LEN set to 1), would also perform at a bandwidth close to 2 GSa/s but consume only four kernels to achieve that.
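The per-kernel workload in the two examples above can be checked with a short calculation. This is a rough sketch under one assumption: the taps are split evenly across the TP_SSR phases and the TP_CASC_LEN cascade stages, rounding up when the division is not exact. The function name `tapsPerKernel` is hypothetical and does not come from the library.

```cpp
// Hypothetical helper: coefficients handled by each kernel, assuming the
// graph divides the taps evenly across SSR phases and cascade stages.
// Ceiling division is an assumption for lengths that do not divide evenly.
constexpr unsigned tapsPerKernel(unsigned fir_len, unsigned ssr, unsigned casc_len) {
    return (fir_len + ssr * casc_len - 1) / (ssr * casc_len);
}

// 32 taps, TP_SSR = 2, TP_CASC_LEN = 2: eight coefficients per kernel,
// matching the first example in the text.
static_assert(tapsPerKernel(32, 2, 2) == 8, "first example");

// 32 taps, TP_SSR = 2, TP_CASC_LEN = 1: sixteen coefficients per kernel
// under the same even-split assumption.
static_assert(tapsPerKernel(32, 2, 1) == 16, "second example");
```

Under this assumption, the second configuration gives each kernel twice as many coefficients to compute while reaching a similar bandwidth, which is why it uses its tiles more fully.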
Rate-changing FIR Throughput