Overview - 2024.2 English

Vitis Libraries

Document ID
XD160
Release Date
2024-11-29
Version
2024.2 English

fft_dit_1ch is a single-channel, decimation-in-time FFT.

These are the template parameters used to configure the single-channel decimation-in-time class.

Parameters:

TT_DATA

describes the type of individual data samples input to the transform function.

This is a typename and must be one of the following:

cint16, cint32, cfloat. For real-only operation, consider use of the widget_real2complex library element.

TT_TWIDDLE

describes the type of twiddle factors of the transform.

It must be one of the following: cint16, cint32, cfloat and must also satisfy the following rules:

  • TT_TWIDDLE must be an integer type if TT_DATA is an integer type
  • TT_TWIDDLE must be cfloat type if TT_DATA is a float type.
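
The two pairing rules above can be expressed as a compile-time check. The following is a minimal sketch, not part of the library: the `cint16`, `cint32` and `cfloat` structs are hypothetical stand-ins for the AIE toolchain types, and `is_complex_int` and `twiddle_is_legal` are names invented for illustration.

```cpp
#include <type_traits>

// Hypothetical stand-ins for the AIE complex sample types; the real
// definitions come from the AIE toolchain headers.
struct cint16 { short real, imag; };
struct cint32 { int real, imag; };
struct cfloat { float real, imag; };

// True when T is one of the integer complex types listed above.
template <typename T>
constexpr bool is_complex_int =
    std::is_same<T, cint16>::value || std::is_same<T, cint32>::value;

// Encodes the two rules: integer data requires an integer twiddle type,
// float data requires cfloat twiddles.
template <typename TT_DATA, typename TT_TWIDDLE>
constexpr bool twiddle_is_legal =
    is_complex_int<TT_DATA>
        ? is_complex_int<TT_TWIDDLE>
        : (std::is_same<TT_DATA, cfloat>::value &&
           std::is_same<TT_TWIDDLE, cfloat>::value);
```

Under these assumptions, `twiddle_is_legal<cint32, cint16>` is true while `twiddle_is_legal<cint16, cfloat>` is false.
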
TP_POINT_SIZE

is an unsigned integer which describes the number of samples in the transform.

This must be 2^N where N is an integer in the range 4 to 16 inclusive.

When TP_DYN_PT_SIZE is set, TP_POINT_SIZE describes the maximum point size possible.
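
As an illustration of the constraint above, the check below tests whether a candidate value is 2^N with N between 4 and 16. It is a sketch only; `is_legal_point_size` is a hypothetical helper and not a library function.

```cpp
// Returns true when pointSize equals 2^N for an integer N in [4, 16].
// Hypothetical helper for illustration; the library performs its own checks.
constexpr bool is_legal_point_size(unsigned int pointSize) {
    return pointSize != 0 &&
           (pointSize & (pointSize - 1)) == 0 &&  // exactly one bit set
           pointSize >= (1u << 4) &&
           pointSize <= (1u << 16);
}
```
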

TP_FFT_NIFFT selects whether the transform to perform is an FFT (1) or IFFT (0).
TP_SHIFT selects the power of 2 to scale the result by prior to output.
TP_CASC_LEN selects the number of kernels the FFT will be divided over in series to improve throughput.
TP_DYN_PT_SIZE

selects whether (1) or not (0) to use run-time point size determination.

When set, each window of data must be preceded, in the window, by a 256-bit header.

This header is 8 samples when TT_DATA is cint16 and 4 samples otherwise.

The real part of the first sample indicates the forward (1) or inverse (0) transform.

The real part of the second sample indicates the Radix2 power of the point size.

For example, for a point size of 512 this field would hold 9, since 2^9 = 512. Any value below 4 or greater than log2(TP_POINT_SIZE) is considered illegal.

The output window will also be preceded by a 256-bit vector which is a copy of the input vector, except for the real part of the top sample, which is 0 to indicate a legal frame or 1 to indicate an illegal frame.
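
The header layout described above can be sketched as follows. This is an illustration under stated assumptions rather than library code: `cint16` and `cint32` are hypothetical stand-ins for the AIE types, the remaining header samples are assumed to be zero, and `header_samples`, `make_header` and `is_legal_log2` are invented names.

```cpp
#include <array>
#include <cstdint>

// Hypothetical stand-ins for the AIE types (32 and 64 bits per sample).
struct cint16 { std::int16_t real; std::int16_t imag; };
struct cint32 { std::int32_t real; std::int32_t imag; };

// A 256-bit header holds 256 / (bits per sample) samples:
// 8 for cint16 and 4 for the 64-bit types.
template <typename TT_DATA>
constexpr unsigned int header_samples() {
    return 256 / (8 * sizeof(TT_DATA));
}

// Builds a cint16 header: sample 0 carries the direction (1 = forward,
// 0 = inverse), sample 1 carries log2 of the run-time point size.
// The remaining samples are zero-initialised (an assumption).
constexpr std::array<cint16, 8> make_header(bool forward,
                                            std::int16_t log2PointSize) {
    return {{ {static_cast<std::int16_t>(forward ? 1 : 0), 0},
              {log2PointSize, 0} }};
}

// Legal when 4 <= log2PointSize <= log2(TP_POINT_SIZE).
constexpr bool is_legal_log2(int log2PointSize, int log2MaxPointSize) {
    return log2PointSize >= 4 && log2PointSize <= log2MaxPointSize;
}
```

For a graph built with TP_POINT_SIZE = 1024, `make_header(true, 9)` would request a forward 512-point transform, and `is_legal_log2(9, 10)` confirms the request is within range.
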

When TP_PARALLEL_POWER is greater than 0, the header must be applied before each window of data for every port of the design and will appear before each window of data on the output ports.

Note that the minimum point size of 16 applies to each lane when in parallel mode, so a configuration of point size 256 with TP_PARALLEL_POWER = 2 will have 4 lanes, each with a minimum of 16, so the minimum legal point size here is 64.
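
The arithmetic behind that example is small enough to write down. The helpers below are hypothetical and simply restate the rule that each of the 2^TP_PARALLEL_POWER lanes needs at least 16 points.

```cpp
// Number of lanes (subframe processors) for a given TP_PARALLEL_POWER.
constexpr unsigned int num_lanes(unsigned int parallelPower) {
    return 1u << parallelPower;
}

// Smallest legal run-time point size: 16 points per lane.
constexpr unsigned int min_legal_point_size(unsigned int parallelPower) {
    return 16u * num_lanes(parallelPower);
}
```
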

TP_WINDOW_VSIZE

is an unsigned integer which describes the number of samples to be processed in each call to the function. When TP_DYN_PT_SIZE is set to 1, the actual window size will be larger than TP_WINDOW_VSIZE because the header is not included in TP_WINDOW_VSIZE.

By default, TP_WINDOW_VSIZE is set to match TP_POINT_SIZE.

TP_WINDOW_VSIZE may be set to an integer multiple of TP_POINT_SIZE, in which case multiple FFT iterations will be performed on a given input window, resulting in multiple iterations of output samples and reducing the number of times the kernel needs to be triggered to process a given number of input data samples.

As a result, the overheads incurred during kernel triggering are reduced and overall performance is increased.
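
To make the sizing concrete, the sketch below computes how many FFT iterations one kernel call performs and how many samples the window actually holds once the dynamic header is counted. Both helper names are hypothetical, and the sketch assumes one header per window, as the TP_DYN_PT_SIZE description suggests.

```cpp
// FFT iterations performed per kernel call; windowVsize is expected to be
// an integer multiple of pointSize.
constexpr unsigned int ffts_per_window(unsigned int windowVsize,
                                       unsigned int pointSize) {
    return windowVsize / pointSize;
}

// Samples physically present in the window. With dynamic point size the
// header (8 samples for cint16, 4 otherwise) is added on top of
// TP_WINDOW_VSIZE.
constexpr unsigned int actual_window_samples(unsigned int windowVsize,
                                             bool dynPtSize,
                                             unsigned int headerSamples) {
    return windowVsize + (dynPtSize ? headerSamples : 0u);
}
```

For example, TP_POINT_SIZE = 1024 with TP_WINDOW_VSIZE = 4096 gives four transforms per kernel trigger.
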

TP_API

is an unsigned integer to select window (0) or stream (1) interfaces.

When stream I/O is selected, one sample is taken from, or output to, a stream and the next sample from or to the next stream. Two streams minimum are used. With two streams, even samples are read from input stream[0] and odd samples from input stream[1].
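
The even/odd split described above can be sketched on the host side as follows. This is an illustration only: plain `int` values stand in for complex samples, and `Frame`, `StreamPair` and `split_to_streams` are hypothetical names, not library types.

```cpp
#include <cstddef>

// A frame of N samples; int stands in for a complex sample type.
template <std::size_t N>
struct Frame { int sample[N]; };

// The two half-length sequences fed to stream[0] and stream[1].
template <std::size_t N>
struct StreamPair { int stream0[N / 2]; int stream1[N / 2]; };

// Even-indexed samples go to stream0, odd-indexed samples to stream1.
template <std::size_t N>
constexpr StreamPair<N> split_to_streams(const Frame<N>& in) {
    static_assert(N % 2 == 0, "frame length must be even");
    StreamPair<N> out{};
    for (std::size_t i = 0; i < N / 2; ++i) {
        out.stream0[i] = in.sample[2 * i];
        out.stream1[i] = in.sample[2 * i + 1];
    }
    return out;
}
```
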

TP_PARALLEL_POWER

is an unsigned integer to describe N where 2^N is the number of subframe processors to use, so as to achieve higher throughput.

The default is 0. With TP_PARALLEL_POWER set to 2, 4 subframe processors will be used, each of which takes 2 streams or one iobuffer in, for a total of 8 streams or 4 iobuffers input and output. Sample[p] must be written to stream or iobuffer number [p modulo q], where q is the number of ports. A reciprocal operation must be performed on output ports.
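
The port count and sample-to-port mapping quoted above reduce to two one-line formulas. The helpers below are hypothetical and assume the figures given in this section (2 streams or 1 iobuffer per subframe processor).

```cpp
// Number of input (or output) ports: 2 streams or 1 iobuffer per
// subframe processor, with 2^parallelPower subframe processors.
constexpr unsigned int num_ports(unsigned int parallelPower, bool streamApi) {
    return (1u << parallelPower) * (streamApi ? 2u : 1u);
}

// Sample p is written to port (p modulo q), where q is the port count.
constexpr unsigned int port_for_sample(unsigned int p, unsigned int q) {
    return p % q;
}
```

With TP_PARALLEL_POWER = 2 and stream I/O there are 8 ports, so sample 10 goes to port 2; on the output side the same mapping is applied in reverse to restore sample order.
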

TP_USE_WIDGETS

is an unsigned integer to control the use of widgets for configurations which either use TP_API=1 or TP_PARALLEL_POWER>0.

Designs with streaming IO (TP_API=1) and/or multiple subframe processors (TP_PARALLEL_POWER>0) will use streams internally, even if not externally.

The default is not to use widgets but to have the stream to window conversion performed as part of the FFT kernel or R2combiner kernel. Using widget kernels allows this conversion to be placed in a separate tile and so boost performance at the expense of more tiles being used.

TP_RND

describes the selection of rounding to be applied during the shift down stage of processing.

Although TP_RND accepts unsigned integer values, descriptive macros are recommended, where:

  • rnd_floor = Truncate LSB, always round down (towards negative infinity).

  • rnd_ceil = Always round up (towards positive infinity).

  • rnd_sym_floor = Truncate LSB, always round towards 0.

  • rnd_sym_ceil = Always round away from zero (towards infinity).

  • rnd_pos_inf = Round halfway towards positive infinity.

  • rnd_neg_inf = Round halfway towards negative infinity.

  • rnd_sym_inf = Round halfway towards infinity (away from zero).

  • rnd_sym_zero = Round halfway towards zero (away from infinity).

  • rnd_conv_even = Round halfway towards nearest even number.

  • rnd_conv_odd = Round halfway towards nearest odd number.

Note that the FFT does not support the floor and ceiling forms of rounding, as these result in very poor accuracy due to the form of internal calculations. The remaining rounding modes differ only in how they round values that lie exactly halfway between two integers (a fractional part of 0.5).
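
To show how the halfway modes differ, the sketch below divides a value by 2^shift and applies each tie-breaking rule. It is plain integer arithmetic written for illustration, not the AIE hardware implementation; the `Tie` enum and `shift_round` are hypothetical, and the enumerator values are deliberately not the numeric values of the library macros.

```cpp
#include <cstdint>

// Tie-breaking rules for values exactly halfway between two integers.
// Hypothetical enum; not the library's rnd_* macro values.
enum class Tie { PosInf, NegInf, SymInf, SymZero, ConvEven, ConvOdd };

// Computes x / 2^shift, rounded to nearest, with ties resolved by 'tie'.
constexpr std::int64_t shift_round(std::int64_t x, unsigned int shift, Tie tie) {
    if (shift == 0) return x;
    const std::int64_t d = std::int64_t{1} << shift;
    std::int64_t q = x / d;
    std::int64_t r = x % d;
    if (r < 0) { q -= 1; r += d; }  // floor form: x == q*d + r, 0 <= r < d
    const std::int64_t half = d / 2;
    if (r < half) return q;         // below halfway: round down
    if (r > half) return q + 1;     // above halfway: round up
    switch (tie) {                  // exactly halfway
        case Tie::PosInf:   return q + 1;
        case Tie::NegInf:   return q;
        case Tie::SymInf:   return x >= 0 ? q + 1 : q;  // away from zero
        case Tie::SymZero:  return x >= 0 ? q : q + 1;  // towards zero
        case Tie::ConvEven: return (q % 2 == 0) ? q : q + 1;
        case Tie::ConvOdd:  return (q % 2 != 0) ? q : q + 1;
    }
    return q;  // not reached
}
```

For instance, shifting -3 down by one bit (that is, -1.5) gives -1 under `PosInf` and `SymZero` but -2 under `NegInf`, `SymInf` and `ConvEven`, while a non-tie such as 7 shifted by two bits (1.75) gives 2 in every mode.
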

TP_SAT

describes the selection of saturation to be applied during the shift down stage of processing.

TP_SAT accepts unsigned integer values, where:

  • 0: none = No saturation is performed and the value is simply cropped (aliased).
  • 1: saturate = Default. Saturation clamps an n-bit signed value to the range [-(2^(n-1)) : +2^(n-1) - 1].
  • 3: symmetric = Symmetric saturation clamps an n-bit signed value to the range [-(2^(n-1) - 1) : +2^(n-1) - 1].
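
The three modes listed above can be illustrated with ordinary integer arithmetic. The sketch below is not the hardware implementation; `apply_sat` is a hypothetical helper, and mode 0 is modelled as two's-complement wrap-around to n bits, which is one reading of "cropped (aliased)".

```cpp
#include <cstdint>

// Applies the TP_SAT mode (0, 1 or 3) to value v for an nBits-wide signed
// result. Assumes 1 <= nBits < 62 so that the shifts stay in range.
constexpr std::int64_t apply_sat(std::int64_t v, unsigned int nBits,
                                 unsigned int satMode) {
    const std::int64_t hi = (std::int64_t{1} << (nBits - 1)) - 1;
    if (satMode == 0) {
        // No saturation: keep the low nBits and reinterpret as signed.
        const std::int64_t m = std::int64_t{1} << nBits;
        const std::int64_t u = ((v % m) + m) % m;  // 0 <= u < 2^nBits
        return u > hi ? u - m : u;
    }
    // Mode 1 clamps to [-2^(n-1), 2^(n-1)-1]; mode 3 drops the most
    // negative value so the range is symmetric about zero.
    const std::int64_t lo = (satMode == 3) ? -hi : -hi - 1;
    return v > hi ? hi : (v < lo ? lo : v);
}
```

For a 16-bit result, -40000 becomes -32768 in mode 1 and -32767 in mode 3, while 40000 wraps to -25536 in mode 0.
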
TP_TWIDDLE_MODE

describes the magnitude of integer twiddles. It has no effect for cfloat.

  • 0: Max amplitude. Values at 2^15 (for TT_TWIDDLE=cint16) and 2^31 (TT_TWIDDLE=cint32) will saturate and so introduce errors.
  • 1: 0.5 amplitude. Twiddle values are 1/2 that of mode 0 so as to avoid twiddle saturation. However, twiddles are one bit less precise versus mode 0.
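
The trade-off between the two modes shows up when a twiddle component of exactly 1.0 is quantised to 16 bits. The sketch below is an illustration, not the library's twiddle generator: `quantise_twiddle_cint16` is a hypothetical helper and round-to-nearest is an assumption.

```cpp
#include <cstdint>

// Quantises one twiddle component in [-1.0, 1.0] to a signed 16-bit value.
// Mode 0 scales by 2^15 (full amplitude), mode 1 by 2^14 (half amplitude).
constexpr std::int16_t quantise_twiddle_cint16(double component,
                                               unsigned int twiddleMode) {
    const double scale = (twiddleMode == 0) ? 32768.0 : 16384.0;
    const double scaled = component * scale;
    // Round to nearest (an assumption), then saturate to the int16 range.
    const long rounded =
        static_cast<long>(scaled >= 0.0 ? scaled + 0.5 : scaled - 0.5);
    const long clamped =
        rounded > 32767 ? 32767 : (rounded < -32768 ? -32768 : rounded);
    return static_cast<std::int16_t>(clamped);
}
```

In mode 0 the value 1.0 would need 32768, which does not fit and saturates to 32767, a small error. In mode 1 it maps to 16384 exactly, at the cost of one bit of resolution for every other twiddle.
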
TT_OUT_DATA

describes the type of individual data samples output from the transform function.

This is a typename and must be cint16 or cint32 if TT_DATA is cint16 or cint32, or cfloat if TT_DATA is cfloat. For real-only operation, consider use of the widget_real2complex library element.

TP_INDEX

This parameter is for internal use regarding the recursion of the parallel power feature.

It is recommended to omit this parameter from the configuration and rely instead on default values. If this parameter is set by the user, the behaviour of the library unit is undefined.

TP_ORIG_PAR_POWER

This parameter is for internal use regarding the recursion of the parallel power feature.

It is recommended to omit this parameter from the configuration and rely instead on default values. If this parameter is set by the user, the behaviour of the library unit is undefined.

template <
    typename TT_DATA,
    typename TT_TWIDDLE,
    unsigned int TP_POINT_SIZE,
    unsigned int TP_FFT_NIFFT = 1,
    unsigned int TP_SHIFT = 0,
    unsigned int TP_CASC_LEN = 1,
    unsigned int TP_DYN_PT_SIZE = 0,
    unsigned int TP_WINDOW_VSIZE = TP_POINT_SIZE,
    unsigned int TP_API = 0,
    unsigned int TP_PARALLEL_POWER = 0,
    unsigned int TP_USE_WIDGETS = 0,
    unsigned int TP_RND = 4,
    unsigned int TP_SAT = 1,
    unsigned int TP_TWIDDLE_MODE = 0,
    typename TT_OUT_DATA = TT_DATA,
    unsigned int TP_INDEX = 0,
    unsigned int TP_ORIG_PAR_POWER = TP_PARALLEL_POWER
    >
class fft_ifft_dit_1ch_graph: public graph

// fields

static constexpr int kParallel_factor
static constexpr int kWindowSize
static constexpr int kNextParallelPower
static constexpr int kR2Shift
static constexpr int kFFTsubShift
static constexpr int kHeaderBytes
static constexpr int kStreamsPerTile
static constexpr int kPortsPerTile
static constexpr int kOutputPorts
port_array <input, kPortsPerTile*kParallel_factor> in
port_array <output, kOutputPorts> out
parameter r2comb_tw_lut
kernel m_combInKernel[kParallel_factor]
kernel m_r2Comb[kParallel_factor]
kernel m_combOutKernel[kParallel_factor]
fft_ifft_dit_1ch_graph <TT_DATA, TT_TWIDDLE, (TP_POINT_SIZE >> 1), TP_FFT_NIFFT, kFFTsubShift, TP_CASC_LEN, TP_DYN_PT_SIZE, (TP_WINDOW_VSIZE >> 1), TP_API, kNextParallelPower, TP_USE_WIDGETS, TP_RND, TP_SAT, TP_TWIDDLE_MODE, TT_OUT_DATA, TP_INDEX, TP_ORIG_PAR_POWER> FFTsubframe0
fft_ifft_dit_1ch_graph <TT_DATA, TT_TWIDDLE, (TP_POINT_SIZE >> 1), TP_FFT_NIFFT, kFFTsubShift, TP_CASC_LEN, TP_DYN_PT_SIZE, (TP_WINDOW_VSIZE >> 1), TP_API, kNextParallelPower, TP_USE_WIDGETS, TP_RND, TP_SAT, TP_TWIDDLE_MODE, TT_OUT_DATA, TP_INDEX+kParallel_factor/2, TP_ORIG_PAR_POWER> FFTsubframe1