The AI Engine processes data block by block and uses a data structure called a window to describe one block of input or output data. The window size, which is the number of samples in each block of input or output data, represents a tradeoff between efficiency and processing latency. Long windows lead to high efficiency, but the latency increases proportionally to the window size. Sometimes a short latency is preferable at a loss of 5–10% AI Engine processing efficiency.
For example, in this application note, the input window size is set to 512 samples to limit the latency to within 2.1 μs. The window sizes and sample rates of DDC filters are listed in the following table.
Filter   Input Sample Rate (MSPS)   Output Sample Rate (MSPS)   Input Window   Output Window
HB47     245.76                     122.88                      512            256
HB11     122.88                     61.44                       256            128
FIR199   122.88                     122.88                      256            256
HB23     61.44                      30.72                       128            64
FIR89    30.72                      30.72                       64             64
Mixer    122.88                     122.88                      256            1280 (5 carriers × 256)
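The 2.1 μs latency figure above follows directly from the window size and the input sample rate. A minimal sketch, assuming the block latency is simply the time to accumulate one full input window (the function name is illustrative, not from the application note):

```python
def block_latency_us(window_size, sample_rate_msps):
    """Time to accumulate one input window, in microseconds."""
    # samples / (megasamples per second) = microseconds
    return window_size / sample_rate_msps

# 512 samples at 245.76 MSPS, i.e., roughly the 2.1 us cited in the text.
print(round(block_latency_us(512, 245.76), 2))  # 2.08
```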
A cycle budget is the number of instruction cycles a function can take to compute a block of output data, given by:

    cycle budget = (window size / sample rate) × AI Engine clock frequency

At a 1 GHz AI Engine clock in the lowest speed grade device, the processing of 512 samples at 245.76 MSPS has a cycle budget of 512 / 245.76 MSPS × 1 GHz ≈ 2083 cycles.
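The cycle-budget arithmetic can be sketched as follows, assuming the budget is simply the block period multiplied by the clock frequency (the function name is an illustrative choice, not from the application note):

```python
def cycle_budget(window_size, sample_rate_hz, clock_hz=1.0e9):
    """Instruction cycles available to process one block of data."""
    # cycles available = time to receive one block * clock frequency
    return window_size / sample_rate_hz * clock_hz

# 512 samples at 245.76 MSPS with a 1 GHz AI Engine clock.
print(int(cycle_budget(512, 245.76e6)))  # 2083
```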
Suppose every output needs P 16-bit-real by 16-bit-real multiplications. The AI Engine can compute 32 such real-by-real multiplications every cycle. For an ideal implementation, the utilization lower bound is given by:

    utilization lower bound = (P × output window size) / (32 × cycle budget)
Take FIR199 as an example. FIR199 has 199 real symmetric filter taps and it takes 100 16-bit-complex by 16-bit-real multiplications to compute each output. Therefore, every output of FIR199 needs 200 16-bit-real by 16-bit-real multiplications at 122.88 MSPS, and the utilization lower bound is given by 200 multiplications × 256 samples / (32 × 2083 cycles) = 76.8%. Similarly, the utilization lower bounds of the other DDC filters are calculated and listed in the following table.
Filter   Input Window Size   Output Window Size   Number of Taps   Number of MACs/Output   Utilization / Instance   Number of Instances   Utilization Lower Bound
FIR199   256                 256                  199              200                     76.8%                    1                     76.8%
FIR89    64                  64                   89               96                      9.3%                     5                     46.5%
HB47     512                 256                  47               32                      12.3%                    1                     12.3%
HB11     256                 128                  11               8                       1.6%                     5                     8%
HB23     128                 64                   23               16                      1.6%                     5                     8%
Mixer    256                 1280                 —                8                       23%                      1                     23%
Total                                                                                                                                    174.6%
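The per-instance figures in the table can be checked against the lower-bound formula. A minimal sketch, assuming the 2083-cycle budget derived earlier and 32 real-by-real multiplications per cycle (small rounding differences from the table's figures are possible; the function name is illustrative):

```python
MULTS_PER_CYCLE = 32   # 16-bit-real x 16-bit-real multiplications per cycle
CYCLE_BUDGET = 2083    # cycles per block: 512 samples at 245.76 MSPS, 1 GHz clock

def utilization_lower_bound(mults_per_output, output_window):
    """Fraction of one AI Engine consumed by an ideal implementation."""
    return mults_per_output * output_window / (MULTS_PER_CYCLE * CYCLE_BUDGET)

# FIR199: 200 real multiplications per output over a 256-sample output window.
print(f"FIR199: {utilization_lower_bound(200, 256):.1%}")  # FIR199: 76.8%
# HB47: 32 real multiplications per output over a 256-sample output window.
print(f"HB47: {utilization_lower_bound(32, 256):.1%}")     # HB47: 12.3%
```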

Although in theory this DDC can be implemented on two AI Engines with 87.3% utilization each, such high utilization requires very long windows and undesirable latency. One method to reduce the utilization is to take advantage of the fact that 5G NR and 4G LTE carriers do not coexist in this case, so the filters for unused carriers can be disabled at run time, depending on the carrier configuration. Detailed analysis and explanation are provided in the following sections.