Filterbank Library Characterization

Vitis Tutorials: AI Engine Development (XD100)

Document ID: XD100
Release Date: 2026-03-27
Version: 2025.2 English

Based on the preceding analysis, our filterbank is storage-bound and requires 32 tiles. We can instantiate the TDM FIR IP with the following configuration. For the definitions of these parameters, refer to the Vitis Libraries documentation.

  typedef cint16                      TT_DATA;
  typedef cint32                      TT_OUT_DATA;
  typedef int32                       TT_COEFF;
  static constexpr unsigned           TP_FIR_LEN            = 36;
  static constexpr unsigned           TP_SHIFT              = 31;
  static constexpr unsigned           TP_RND                = 12;
  static constexpr unsigned           TP_NUM_OUTPUTS        = 1;
  static constexpr unsigned           TP_DUAL_IP            = 0;
  static constexpr unsigned           TP_SAT                = 1;
  static constexpr unsigned           TP_TDM_CHANNELS       = 4096;
  static constexpr unsigned           TP_SSR                = 32;
  static constexpr unsigned           TP_INPUT_WINDOW_VSIZE = 4096;
  static constexpr unsigned           TP_CASC_LEN           = 1;
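To make the configuration more concrete, the following is a minimal back-of-the-envelope sketch of what each SSR lane handles. It is not part of the tutorial design. The helper names `channels_per_lane` and `coeff_bytes_per_lane` are hypothetical, and the sketch assumes that the TDM channels are split evenly across the SSR lanes and that each tap of each channel needs one 4-byte `int32` coefficient.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Channels handled by one SSR lane.
// Assumption: TP_TDM_CHANNELS is split evenly across TP_SSR lanes.
unsigned channels_per_lane(unsigned tdm_channels, unsigned ssr) {
    assert(ssr != 0 && tdm_channels % ssr == 0);
    return tdm_channels / ssr;
}

// Coefficient bytes that one lane must hold.
// Assumption: one 4-byte int32 coefficient (TT_COEFF) per tap per channel.
std::size_t coeff_bytes_per_lane(unsigned tdm_channels, unsigned ssr,
                                 unsigned fir_len) {
    const std::size_t channels = channels_per_lane(tdm_channels, ssr);
    return channels * fir_len * sizeof(std::int32_t);
}
```

With the values above (`TP_TDM_CHANNELS = 4096`, `TP_SSR = 32`, `TP_FIR_LEN = 36`), this estimate gives 128 channels and 18432 coefficient bytes per lane. It covers coefficients only and does not model the input window or state history.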

We can characterize its performance to confirm it works as expected.

[shell]% cd <path-to-design>/aie/tdm_fir_characterize
[shell]% make clean all
[shell]% vitis_analyzer aiesimulator_output/default.aierun_summary

In vitis_analyzer, we observe that the design uses more tiles than predicted (64 versus 32).

figure5

Zooming in on one of the tiles, we observe that the state history is stored together with the input window, which is double-buffered. This increases the storage requirement beyond the predicted 32 tiles. This observation is specific to the TDM FIR IP on AIE-ML.

figure6

We also observe that the achieved throughput is higher than the requirement: 4096 samples / 1.257 µs ≈ 3258 MSPS.

figure7
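The throughput figure quoted above can be reproduced with a short calculation. The sketch below is illustrative only: the function names are hypothetical, the 1.257 value is assumed to be a period in microseconds for one 4096-sample window, and the 2000 MSPS requirement used in the usage note is taken from the ">2 Gsps" entry in the summary table.

```cpp
#include <cassert>

// Throughput in MSPS: samples processed divided by the time taken in
// microseconds (samples per microsecond equals mega-samples per second).
double throughput_msps(double samples, double period_us) {
    assert(period_us > 0.0);
    return samples / period_us;
}

// Ratio of achieved throughput to the required rate (both in MSPS).
// A value above 1.0 means the requirement is met with margin to spare.
double throughput_margin(double achieved_msps, double required_msps) {
    assert(required_msps > 0.0);
    return achieved_msps / required_msps;
}
```

For example, `throughput_msps(4096.0, 1.257)` evaluates to roughly 3258.6 MSPS, and `throughput_margin` of that value against 2000.0 MSPS is roughly 1.63, which indicates headroom that could be traded for lower resource usage.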

The table below summarizes predicted versus actual resources, with a note on what could be done to bring resource usage closer to the predicted levels.

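| Resource          | Predicted | Actual | Notes                                                         |
|-------------------|-----------|--------|---------------------------------------------------------------|
| AI Engine Tiles   | 32        | 64     | Use single_buffer on input ports + tight placement constraints |
| PLIOs (in/out)    | 2/4       | 32/32  | Use new Packet Switching IP                                   |
| Throughput (Gsps) | >2        | 3.3    | Usage of above will result in some throughput degradation     |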