Design Analysis - 2025.2 English - UG1079

AI Engine Kernel and Graph Programming Guide (UG1079)

Document ID: UG1079
Release Date: 2025-11-26
Version: 2025.2 English

The following equation describes the finite impulse response (FIR) filter:

$$y[n] = \sum_{k=0}^{N-1} C[k]\, x[n-k]$$

Here x denotes the input, C denotes the coefficients, y denotes the output, and N denotes the length of the filter.
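As a point of comparison for kernel output, the FIR sum can be written as a scalar reference model. The following is a minimal sketch in plain C++, not AI Engine kernel code. It assumes the usual convolution form, y[n] = sum over k of C[k] * x[n - k], with samples before the start of the input treated as zero. The names `CInt16`, `CAcc`, and `fir_reference` are hypothetical, and the 64-bit accumulator width is chosen for convenience rather than to match the hardware.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Plain C++ reference model for checking results; not AI Engine kernel code.
struct CInt16 { std::int16_t re, im; };   // hypothetical stand-in for cint16
struct CAcc   { std::int64_t re, im; };   // wide accumulator (illustrative width)

// y[n] = sum_{k=0}^{N-1} C[k] * x[n-k], with x[m] taken as 0 for m < 0.
inline std::vector<CAcc> fir_reference(const std::vector<CInt16>& x,
                                       const std::vector<CInt16>& c) {
    assert(!c.empty());                    // N must be at least 1
    std::vector<CAcc> y(x.size(), CAcc{0, 0});
    for (std::size_t n = 0; n < x.size(); ++n) {
        for (std::size_t k = 0; k < c.size() && k <= n; ++k) {
            const std::int64_t cr = c[k].re, ci = c[k].im;
            const std::int64_t xr = x[n - k].re, xi = x[n - k].im;
            y[n].re += cr * xr - ci * xi;  // real part of C[k] * x[n-k]
            y[n].im += cr * xi + ci * xr;  // imaginary part of C[k] * x[n-k]
        }
    }
    return y;
}
```

Feeding a unit impulse through `fir_reference` returns the coefficients in order, which is a quick sanity check before comparing against a vectorized kernel.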

Consider a 32-tap filter (N = 32) as an example.

Each output takes 32 multiplications. With cint16 for both the data and coefficient types, each AI Engine performs eight MAC operations per cycle, so the kernel needs four cycles (32 / 8) to compute one output sample. On the input side, a cint16 sample is 32 bits wide, so a single 32-bit stream port delivers one sample per cycle, and in steady state (in the middle of processing) each new input sample produces one output.
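The cycle budget in this paragraph can be checked with a few lines of compile-time arithmetic. This is a sketch in plain C++ that uses only the figures quoted in the text. All constant and function names are hypothetical, and the model ignores loop overhead and pipeline fill.

```cpp
// Figures quoted in the text for cint16 data and cint16 coefficients.
constexpr int kTaps = 32;                // multiplications per output sample
constexpr int kMacsPerCycle = 8;         // MAC operations per AI Engine cycle
constexpr int kStreamBitsPerCycle = 32;  // one 32-bit stream port
constexpr int kSampleBits = 32;          // cint16: 16-bit real + 16-bit imaginary

constexpr int ceil_div(int a, int b) { return (a + b - 1) / b; }

// Cycles one kernel spends computing a single output sample.
constexpr int compute_cycles_per_sample() {
    return ceil_div(kTaps, kMacsPerCycle);
}

// Cycles the stream needs to deliver a single input sample.
constexpr int input_cycles_per_sample() {
    return ceil_div(kSampleBits, kStreamBitsPerCycle);
}

// How many times slower the compute side is than the input side.
constexpr int compute_to_input_ratio() {
    return ceil_div(compute_cycles_per_sample(), input_cycles_per_sample());
}

static_assert(compute_cycles_per_sample() == 4, "32 taps / 8 MACs per cycle");
static_assert(input_cycles_per_sample() == 1, "one cint16 per 32-bit stream word");
static_assert(compute_to_input_ratio() == 4, "compute is 4x slower than input");
```

Changing `kTaps` or `kMacsPerCycle` shows how the balance shifts for other filter lengths or data types, provided the MAC rate for that type is known.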

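So, the design is compute bound: the stream can supply one sample per cycle, but a single kernel needs four cycles per output. You can split the kernel into four cascaded kernels, each computing 8 of the 32 taps, to process one sample per cycle.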
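The arithmetic behind the split can be sketched as follows, assuming each of the four kernels handles 8 consecutive taps and passes its partial sum to the next one. This is a plain C++ model of the math only, not AI Engine kernel or graph code, and it does not model the stream or cascade interfaces. The names `cascade_stage` and `cascaded_output`, the window layout, and the 64-bit accumulator are assumptions made for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kTaps = 32;
constexpr std::size_t kKernels = 4;                       // cascaded kernels
constexpr std::size_t kTapsPerKernel = kTaps / kKernels;  // 8 taps each

struct CInt16 { std::int16_t re, im; };   // hypothetical stand-in for cint16
struct CAcc   { std::int64_t re, im; };   // partial sum passed down the cascade

using Taps = std::array<CInt16, kTaps>;

// Hypothetical stage: adds taps [first, first + 8) to the incoming partial sum.
// window[k] holds x[n - k], so window[0] is the newest sample.
inline CAcc cascade_stage(CAcc acc, const Taps& coeff, const Taps& window,
                          std::size_t first) {
    assert(first + kTapsPerKernel <= kTaps);
    for (std::size_t k = first; k < first + kTapsPerKernel; ++k) {
        const std::int64_t cr = coeff[k].re, ci = coeff[k].im;
        const std::int64_t xr = window[k].re, xi = window[k].im;
        acc.re += cr * xr - ci * xi;
        acc.im += cr * xi + ci * xr;
    }
    return acc;
}

// One output sample: the partial sum flows through the four stages in order.
inline CAcc cascaded_output(const Taps& coeff, const Taps& window) {
    CAcc acc{0, 0};
    for (std::size_t s = 0; s < kKernels; ++s) {
        acc = cascade_stage(acc, coeff, window, s * kTapsPerKernel);
    }
    return acc;
}
```

Because each stage covers 8 taps, it matches the eight MAC operations per cycle quoted above. Once the cascade is full, the four stages work on different samples at the same time, which is where the one-sample-per-cycle rate comes from.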