Organize Computation for a 2.5 GSPS Data Stream in Two Phases

Vitis Tutorials: AI Engine Development (XD100)

Document ID: XD100
Release Date: 2026-03-27
Version: 2025.2 English

For a single-rate filter, a 2.5 GSPS input sample rate also means a 2.5 GSPS output sample rate. Because the system separates the input stream into two (even, odd) streams, the output stream splits the same way.

Take a look at how y0 is computed:

Y0Compute

If the data stream is split into two phases, you can see that the coefficients must also be split into two phases.

Y0Compute2Phases

Also take a look at how y2 is computed:

Y2Compute

Y2Compute2Phases

For the even output stream, data and coefficient phases must match:

  • Even data phase sent through a filter built with the even phase coefficients

  • Odd data phase sent through a filter built with the odd phase coefficients
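The even-output rule above can be checked numerically with a minimal sketch. It assumes the convention y[n] = sum over k of c[k] * x[n + k], which the text does not state explicitly. The filter taps and input samples are hypothetical values chosen only for illustration.

```python
# Sketch: even-indexed FIR outputs rebuilt from even/odd phases.
# Assumed convention (not stated in the text): y[n] = sum_k c[k] * x[n + k].

def fir(coeffs, data):
    """Direct FIR, valid outputs only: y[n] = sum_k coeffs[k] * data[n + k]."""
    n_out = len(data) - len(coeffs) + 1
    return [sum(c * data[n + k] for k, c in enumerate(coeffs))
            for n in range(n_out)]

coeffs = [1, 2, 3, 4, 5, 6, 7, 8]   # hypothetical 8-tap filter
data = list(range(1, 33))           # hypothetical 32 input samples

# Split the data and the coefficients into even and odd phases.
x_even, x_odd = data[0::2], data[1::2]
c_even, c_odd = coeffs[0::2], coeffs[1::2]

# Matching phases: even data with even taps, odd data with odd taps.
y_even = [a + b for a, b in zip(fir(c_even, x_even), fir(c_odd, x_odd))]

# The full-rate filter gives the same values at even output indices.
reference = fir(coeffs, data)
assert y_even == reference[0::2]
```

Each half-rate filter here has half the taps and sees half the samples, which is what lets two slower kernels share the work of one full-rate filter.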

Take a look at how this is modified for the odd outputs:

YoddCompute

In this case, the system mixes the phases of the data and coefficients:

  • Even data phase sent through a filter built with the odd phase coefficients

  • Odd data phase sent through a filter built with the even phase coefficients

There is another difference between the two. In the odd output case, one of the two branches (even data, odd coefficients) must discard one data sample at the beginning of the stream so that both branches stay aligned.
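The odd-output case, including the discarded sample, can be sketched the same way. As before, the convention y[n] = sum over k of c[k] * x[n + k] is an assumption, and the taps and samples are hypothetical.

```python
# Sketch: odd-indexed FIR outputs rebuilt from mixed phases.
# Assumed convention (not stated in the text): y[n] = sum_k c[k] * x[n + k].

def fir(coeffs, data):
    """Direct FIR, valid outputs only: y[n] = sum_k coeffs[k] * data[n + k]."""
    n_out = len(data) - len(coeffs) + 1
    return [sum(c * data[n + k] for k, c in enumerate(coeffs))
            for n in range(n_out)]

coeffs = [1, 2, 3, 4, 5, 6, 7, 8]   # hypothetical 8-tap filter
data = list(range(1, 33))           # hypothetical 32 input samples

x_even, x_odd = data[0::2], data[1::2]
c_even, c_odd = coeffs[0::2], coeffs[1::2]

# Mixed phases: odd data with even taps, even data with odd taps.
# The (even data, odd taps) branch drops its first sample for alignment.
branch_a = fir(c_even, x_odd)
branch_b = fir(c_odd, x_even[1:])
y_odd = [a + b for a, b in zip(branch_a, branch_b)]

# The full-rate filter gives the same values at odd output indices.
reference = fir(coeffs, data)
assert y_odd == reference[1::2]
```

Without the `[1:]` slice, the second branch would pair each odd tap with an even sample one position too early, and the check against the full-rate filter would fail.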

In the previous section, the balance between data transfer and compute performance of the AI Engine was obtained for a 1.25 GSPS data stream going through an eight-tap filter. The balance is identical here: four 1.25 GSPS streams are processed in parallel, each through eight-tap filters.

The system splits the data stream and the coefficients into four phases and then recombines them. In the following figures, the colors distinguish the phases of the data (blue) and the coefficients (red):

  • Output phase 0 splits and recombines as follows: Phase0Out Phase0OutDetail

  • Output phase 1 splits and recombines as follows: Phase1Out Phase1OutDetail

  • Output phase 2 splits and recombines as follows: Phase2Out Phase2OutDetail

  • Output phase 3 splits and recombines as follows: Phase3Out Phase3OutDetail

When splitting the data and coefficients into N phases (four in this case), the resulting architecture requires NPhases x NPhases kernels (4 x 4 = 16).
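The N-phase split can be sketched generically to show where the NPhases x NPhases count comes from. This is an illustrative model, not the tutorial's kernel code. It assumes the convention y[n] = sum over k of c[k] * x[n + k], and it uses a hypothetical 32-tap filter so that each sub-filter has eight taps. The function names are made up for this example.

```python
# Sketch: N-phase decomposition of a FIR filter.
# Assumed convention (not stated in the text): y[n] = sum_k c[k] * x[n + k].

def fir(coeffs, data):
    """Direct FIR, valid outputs only: y[n] = sum_k coeffs[k] * data[n + k]."""
    n_out = len(data) - len(coeffs) + 1
    return [sum(c * data[n + k] for k, c in enumerate(coeffs))
            for n in range(n_out)]

def ssr_fir(coeffs, data, n_phases):
    """Return one output list per output phase, plus the sub-filter count."""
    x_ph = [data[d::n_phases] for d in range(n_phases)]
    c_ph = [coeffs[q::n_phases] for q in range(n_phases)]
    outputs, kernel_count = [], 0
    for p in range(n_phases):              # output phase
        partials = []
        for q in range(n_phases):          # coefficient phase
            d = (p + q) % n_phases         # data phase paired with q
            skip = (p + q) // n_phases     # samples to discard (0 or 1)
            partials.append(fir(c_ph[q], x_ph[d][skip:]))
            kernel_count += 1
        # Sum the partial results of the n_phases sub-filters.
        outputs.append([sum(v) for v in zip(*partials)])
    return outputs, kernel_count

coeffs = list(range(1, 33))   # hypothetical 32 taps: 8 taps per sub-filter
data = list(range(1, 65))     # hypothetical 64 input samples

outputs, kernel_count = ssr_fir(coeffs, data, 4)
reference = fir(coeffs, data)
assert kernel_count == 16
assert all(outputs[p] == reference[p::4] for p in range(4))
```

Each output phase needs one sub-filter per coefficient phase, so four output phases times four coefficient phases gives 16 sub-filters. With `n_phases = 2`, the same function reproduces the even and odd cases described earlier, including the single discarded sample.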