Fast Fourier Transform - 2025.2 English - UG1603

AI Engine-ML Kernel and Graph Programming Guide (UG1603)

Document ID
UG1603
Release Date
2025-11-26
Version
2025.2 English

The AI Engine API contains a full set of stage-based interfaces to build decimation-in-time Fast Fourier Transform (FFT).

For more information about these APIs and supported parameters, see AI Engine API User Guide (UG1529).

The stage depends on the radix which is implemented (2, 3, 4, or 5) and the various data types for input and output data, and twiddle factors. The implemented API are as follows:

Table 1. Supported FFT/IFFT Modes
Input Type Output Type Twiddle Type AIE-ML Supported Radices AIE-ML V2 Supported Radices
c16b c16b c16b 2,3,4,5 2, 4
c16b c32b c16b 2,3,4,5 2, 4
c32b c16b c16b 2,4 2, 4
c32b c32b c16b 2,3,4,5 2, 4
cbfloat16 cbfloat16b cbfloat16b 2,4 NA

The function name depends only on the radix:

void fft_dit_r2_stage(...);  // Radix 2
void fft_dit_r3_stage(...);  // Radix 3
void fft_dit_r4_stage(...);  // Radix 4
void fft_dit_r5_stage(...);  // Radix 5

The parameters are similar from one function to the other. For a radix-N stage function:

void aie::fft_dit_rN_stage (const Input *__restrict 	x,
const Twiddle *__restrict 	tw0,
[ const Twiddle *__restrict 	tw1, ... , ]unsigned 	n,
[ unsigned 	shift_tw, unsigned    shift, ]

bool 	inv,
Output *__restrict 	out 
)

Parameters

x
Input data pointer.
tw0
First twiddle group pointer.
tw1, ...
(Depending on Radix) A Radix N needs N-1 twiddle group pointers.
n
Number of samples.
shift_tw, shift
(Optional) shift_tw is the decimal point applied to the twiddle factors, and shift is the shift applied to the dit output.
inv
0 for direct FFT, 1 for inverse FFT.
out
Output data pointer.

Template Parameters

Input
Type of the Input parameter
Output
Type of the Output parameter
Twiddle
Type of the Twiddle parameter

An example of a 1024-point FFT is as follows:

void fft_1024pt (const cint16 * __restrict x,  // Input pointer
                unsigned shift_tw,            // Indicates the decimal point of the twiddles
                                              // e.g. The twiddle 1.0+0.0i can be represented with cint16(32767, 0) and a shift_tw of 15
                unsigned shift,               // Shift applied to apply to dit outputs
                bool inv,                     // Run inverse FFT
                cint16 * __restrict tmp,      // Scratch space for intermediate results
                cint16 * __restrict y         // Output pointer
               )
{
    aie::fft_dit_r2_stage<512>(x,   tw1,   1024, shift_tw, shift, inv, tmp);
    aie::fft_dit_r2_stage<256>(tmp, tw2,   1024, shift_tw, shift, inv, y);
    aie::fft_dit_r2_stage<128>(y,   tw4,   1024, shift_tw, shift, inv, tmp);
    aie::fft_dit_r2_stage<64> (tmp, tw8,   1024, shift_tw, shift, inv, y);
    aie::fft_dit_r2_stage<32> (y,   tw16,  1024, shift_tw, shift, inv, tmp);
    aie::fft_dit_r2_stage<16> (tmp, tw32,  1024, shift_tw, shift, inv, y);
    aie::fft_dit_r2_stage<8>  (y,   tw64,  1024, shift_tw, shift, inv, tmp);
    aie::fft_dit_r2_stage<4>  (tmp, tw128, 1024, shift_tw, shift, inv, y);
    aie::fft_dit_r2_stage<2>  (y,   tw256, 1024, shift_tw, shift, inv, tmp);
    aie::fft_dit_r2_stage<1>  (tmp, tw512, 1024, shift_tw, shift, inv, y);
}