Design Analysis - 2024.2 English

AI Engine Kernel and Graph Programming Guide (UG1079)

Document ID: UG1079
Release Date: 2024-11-28
Version: 2024.2 English

A finite impulse response (FIR) filter is described by the following equation, where x denotes the input, C denotes the coefficients, y denotes the output, and N denotes the length of the filter.
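In the usual direct-form convention, and using the symbols defined above, the equation can be written as shown here. The indexing (coefficients numbered from 0, with the most recent input paired with C_0) is an assumption and may differ from the original figure.

```latex
y(n) = \sum_{k=0}^{N-1} C_k \, x(n-k)
```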

Following is an example of a 32-tap filter.
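Assuming the same direct-form indexing as the general definition, setting N = 32 gives a sum of 32 products per output sample:

```latex
y(n) = \sum_{k=0}^{31} C_k \, x(n-k)
     = C_0\,x(n) + C_1\,x(n-1) + \cdots + C_{31}\,x(n-31)
```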

Each output takes 32 multiplications. With cint16 as both the data type and the coefficient type, each AI Engine can perform 8 MAC operations per cycle, so a single kernel needs 32 / 8 = 4 cycles to compute one output sample. A cint16 sample is 32 bits wide, so one 32-bit stream port carries one sample per transfer, and in steady state (in the middle of processing) each new input sample can produce one output.
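The arithmetic behind these figures can be checked with a small plain C++ sketch. It is only a back-of-the-envelope model, not AI Engine code: the helper names `cycles_per_output` and `samples_per_stream_word` are hypothetical, and the inputs (32 taps, 8 MACs per cycle, a 32-bit stream, a 32-bit cint16 sample) are the figures quoted in the text.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers for illustration; they are not part of any AI Engine API.

// Cycles a single kernel needs per output sample: ceil(taps / macs_per_cycle).
inline std::uint32_t cycles_per_output(std::uint32_t taps,
                                       std::uint32_t macs_per_cycle) {
    assert(macs_per_cycle > 0);
    return (taps + macs_per_cycle - 1) / macs_per_cycle;
}

// Whole samples carried by one stream word of the given width.
inline std::uint32_t samples_per_stream_word(std::uint32_t stream_bits,
                                             std::uint32_t sample_bits) {
    assert(sample_bits > 0);
    return stream_bits / sample_bits;
}

// A cint16 value holds a 16-bit real part and a 16-bit imaginary part.
constexpr std::uint32_t kCint16Bits = 16 + 16;
```

With the values from the text, `cycles_per_output(32, 8)` evaluates to 4 and `samples_per_stream_word(32, kCint16Bits)` evaluates to 1, which is the mismatch between compute and input rate described above.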

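The input stream can supply samples faster than a single kernel can process them (4 cycles per output), so the design is compute bound. You will see how to split the kernel into 4 cascaded kernels to process one sample per cycle.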
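The idea behind the split can be sketched in plain C++ before looking at the real kernels. This is a numerical model only, under stated assumptions: each of the 4 stages is assumed to handle 8 consecutive taps and to pass its partial sum to the next stage, samples before the start of the input are treated as zero, and the direct-form indexing y(n) = sum of C_k x(n-k) is assumed. The type `CInt` and the functions `mac`, `fir_single`, `cascade_stage`, and `fir_cascaded` are hypothetical names, not AI Engine API calls.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a complex integer value with a wide accumulator.
struct CInt {
    std::int64_t re;
    std::int64_t im;
};

// Complex multiply-accumulate: acc + c * x.
inline CInt mac(CInt acc, CInt c, CInt x) {
    acc.re += c.re * x.re - c.im * x.im;
    acc.im += c.re * x.im + c.im * x.re;
    return acc;
}

constexpr std::size_t kTaps = 32;
constexpr std::size_t kKernels = 4;
constexpr std::size_t kTapsPerKernel = kTaps / kKernels;  // 8 taps per stage

// Reference: one kernel computes all 32 taps of output n.
inline CInt fir_single(const std::vector<CInt>& x,
                       const std::array<CInt, kTaps>& c, std::size_t n) {
    assert(n < x.size());
    CInt acc{0, 0};
    for (std::size_t k = 0; k < kTaps && k <= n; ++k) {
        acc = mac(acc, c[k], x[n - k]);
    }
    return acc;
}

// One cascade stage: adds its own 8 taps to the incoming partial sum.
inline CInt cascade_stage(CInt acc_in, const std::vector<CInt>& x,
                          const std::array<CInt, kTaps>& c, std::size_t n,
                          std::size_t stage) {
    assert(n < x.size() && stage < kKernels);
    CInt acc = acc_in;
    const std::size_t first = stage * kTapsPerKernel;
    for (std::size_t k = first; k < first + kTapsPerKernel && k <= n; ++k) {
        acc = mac(acc, c[k], x[n - k]);
    }
    return acc;
}

// Four stages chained together; the partial sum flows from stage to stage.
inline CInt fir_cascaded(const std::vector<CInt>& x,
                         const std::array<CInt, kTaps>& c, std::size_t n) {
    CInt acc{0, 0};
    for (std::size_t stage = 0; stage < kKernels; ++stage) {
        acc = cascade_stage(acc, x, c, n, stage);
    }
    return acc;
}
```

Because each stage performs only 8 multiply-accumulates per output, it fits the 8 MACs per cycle budget, and `fir_cascaded` returns the same value as `fir_single` for every output index. On hardware the stages would run concurrently on separate AI Engines rather than as sequential function calls.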