Flow Control - 1.0 English - PG390

RFSoC DFE Fast Fourier Transform LogiCORE IP Product Guide (PG390)

Document ID: PG390
Release Date: 2024-05-30
Version: 1.0 English

The data input, data output, and control input interfaces all use a standard AXI4-Stream handshake interface to control the transfer of data and control words.

You must write a control word on the control packet interface to determine the parameters for each input data block. The core contains a small control FIFO buffer with 16 entries so that control words can be written before the start of the corresponding data blocks if required. When the control buffer is full, the core will drive s_axis_ctrl_tready Low to indicate that no further control words can be accepted.
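
The control buffer behavior described above can be sketched as a small behavioral model. This is an illustrative sketch only, not the core's implementation: the class name `ControlFifoModel` and its methods are hypothetical, and the only figures taken from this guide are the 16-entry depth and the rule that s_axis_ctrl_tready goes Low when the buffer is full.

```python
from collections import deque

CTRL_FIFO_DEPTH = 16  # depth stated in this guide


class ControlFifoModel:
    """Toy behavioral model of the control FIFO (hypothetical, not the core's RTL)."""

    def __init__(self, depth=CTRL_FIFO_DEPTH):
        self.depth = depth
        self.words = deque()

    @property
    def tready(self):
        # Models s_axis_ctrl_tready: Low (False) when the buffer is full.
        return len(self.words) < self.depth

    def clock(self, tvalid, word=None):
        """Model one clock cycle on the write side; return True if the word was accepted."""
        if tvalid and self.tready:
            self.words.append(word)
            return True
        return False

    def pop(self):
        """Consume the oldest control word, as the core would at the start of a block."""
        return self.words.popleft()
```

In this model, writing 17 control words back to back with nothing consumed results in the first 16 being accepted and the 17th being held off until a word is popped.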

When the DFE FFT core is generated with support for multiple channels, configuration and control parameters such as block size and transfer direction are common across all channels.

The core accepts a new data sample on any cycle on which s_axis_din_tready and s_axis_din_tvalid are both High. If new input data is not available on a given clock cycle, you can pause the input operation by driving s_axis_din_tvalid Low for one or more clock cycles. However, pausing midway through an input block stalls the whole FFT processing pipeline, including any ongoing output operation, if the number of pause cycles inserted exceeds the capacity of the input data FIFO buffer, which is 16.
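
The handshake rule and the pause limit above can be checked against recorded signal traces with a few helper functions. This is a minimal sketch under stated assumptions: the function names are hypothetical, traces are plain per-cycle lists of 0/1 values, and "pause cycles" is interpreted as the total number of Low cycles on s_axis_din_tvalid between the first and last valid sample of a block. The guide does not state whether the limit applies to total or consecutive pause cycles.

```python
INPUT_FIFO_CAPACITY = 16  # capacity stated in this guide


def transfer_cycles(tvalid, tready):
    """Return the cycle indices on which a sample is transferred (both signals High)."""
    return [i for i, (v, r) in enumerate(zip(tvalid, tready)) if v and r]


def pause_cycles_within_block(tvalid):
    """Count Low cycles on tvalid between the first and last High cycle."""
    high = [i for i, v in enumerate(tvalid) if v]
    if not high:
        return 0
    first, last = high[0], high[-1]
    return sum(1 for v in tvalid[first:last + 1] if not v)


def may_stall_pipeline(tvalid, capacity=INPUT_FIFO_CAPACITY):
    """True if the pauses inside the block exceed the input FIFO capacity (assumed rule)."""
    return pause_cycles_within_block(tvalid) > capacity
```

For example, a trace with tvalid values 1, 1, 0, 0, 1 and tready held High transfers samples on cycles 0, 1, and 4 and contains two pause cycles, which is well within the limit.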

The core drives s_axis_din_tready High immediately after reset is released. Under normal circumstances, it will remain High while the core is operational. It will only be driven Low when the input data buffer is full. This can happen for three reasons:

  • If the output back-pressure option is enabled, holding m_axis_dout_tready Low while continuing to feed samples into the data input interface will cause FFT output to stall and the input buffer to fill.
  • Providing input sample data without a corresponding control word will cause the input buffer to fill while the core awaits control information before it can start a new FFT operation.
  • The internal FFT primitive does not start to accept data immediately after reset is deasserted. Providing data during the first 44 cycles after reset, before the FFT primitive is fully initialized, will cause the input buffer to fill.
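
All three cases above share one mechanism: samples keep arriving while nothing drains the input buffer. The following sketch illustrates that mechanism for the post-reset case. It is a simplified model, not the core's actual logic: the function name, the one-sample-per-cycle drain rate, and the ordering of events within a cycle are assumptions, while the buffer capacity of 16 and the 44-cycle initialization period come from this guide.

```python
def simulate_din_tready(num_cycles, can_drain, depth=16):
    """Return a per-cycle trace of a modeled s_axis_din_tready.

    Assumes the source drives tvalid High on every cycle and that the FFT
    primitive removes at most one sample per cycle when can_drain(cycle) is True.
    """
    occupancy = 0
    trace = []
    for cycle in range(num_cycles):
        tready = occupancy < depth      # Low only when the buffer is full
        trace.append(tready)
        if can_drain(cycle) and occupancy > 0:
            occupancy -= 1              # FFT primitive consumes a sample
        if tready:
            occupancy += 1              # handshake completes, sample is stored
    return trace


# Data is supplied from the first cycle after reset, but the primitive only
# starts accepting data after the 44-cycle initialization period.
trace = simulate_din_tready(50, can_drain=lambda cycle: cycle >= 44)
first_low = trace.index(False)  # 16: the buffer fills after 16 accepted samples
```

In this model, tready stays Low from cycle 16 until the primitive begins draining, then returns High and remains High. Passing a `can_drain` function that returns False while output is stalled or while no control word is available models the other two cases in the same way.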

Once the core has processed a block and is producing output data, it drives m_axis_dout_tvalid High for the output block duration. If the optional output flow-control feature is enabled, the output operation can be stalled by driving m_axis_dout_tready Low for one or more cycles during the output block. However, be aware that doing so will stall the whole FFT processing pipeline, including any ongoing input operation, if the number of pause cycles inserted exceeds the capacity of the input data buffer. If the optional output flow-control feature is disabled, data output will be continuous from the core for the duration of the block.

When the core has received a complete block of data, if the control FIFO is empty, then further input samples will be buffered. Once a new control word is written, FFT processing can resume. To achieve 100% throughput, it is necessary to provide the control word for each data block at the same time as the first sample of the block, or earlier.
Note: The core must receive a control word for every FFT operation it performs, regardless of whether the control parameters have changed from one operation to the next.
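
The relationship between control word timing and throughput can be illustrated with a short calculation. This is an idealized sketch, not a cycle-accurate model of the core: the function names are hypothetical, input data is assumed to be available on every cycle, the block size is fixed, and a block is assumed to start on the later of two events, the end of the previous block or the arrival of its own control word.

```python
def block_start_cycles(ctrl_cycles, block_size):
    """Return the cycle on which each block starts, given control word arrival cycles."""
    starts = []
    next_free = 0  # first cycle on which the next block could start
    for ctrl in ctrl_cycles:
        start = max(next_free, ctrl)  # wait for the control word if it is late
        starts.append(start)
        next_free = start + block_size
    return starts


def full_throughput(ctrl_cycles, block_size):
    """True if consecutive blocks start exactly block_size cycles apart (no gaps)."""
    starts = block_start_cycles(ctrl_cycles, block_size)
    return all(b - a == block_size for a, b in zip(starts, starts[1:]))


# Control word for the second block arrives two cycles early: no gap.
on_time = full_throughput([0, 6], block_size=8)   # True
# Control word for the second block arrives two cycles late: gap of two cycles.
late = full_throughput([0, 10], block_size=8)     # False
```

Because every block needs its own control word, the list of arrival cycles has one entry per block even when the parameters do not change between blocks.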

An example timing diagram illustrating the DFE FFT interfaces is shown below.

Figure 1. Timing Diagram Illustrating the DFE FFT Interfaces

The first part of the diagram shows the start of the input transfer for FFT block A. Because the control word is provided on the same cycle as the first data sample, processing can start immediately. On the last word of block A, the TLAST signal is driven High. The input transfer for block B starts on the following cycle. The control word for FFT block B was provided two cycles earlier than is strictly required to maintain 100% throughput.

The last part of the diagram shows the output transfer. For clarity, the bit-reversed address numbering on the TUSER output is not shown. The core drives the output TLAST signal High along with the final output sample of block A. Because blocks A and B were input consecutively with no gap between them, the output transfers are also back to back. The tabulated latency for the core is measured from the start of the input block to the end of the corresponding output block, as shown.