Stream Stall Analysis - 2025.2 English - UG1076

AI Engine Tools and Flows User Guide (UG1076)

Document ID: UG1076
Release Date: 2025-11-20
Version: 2025.2 English

Use the Performance Metrics view to determine whether a stream stall needs to be analyzed. This view also shows the tile or tiles that are causing the stall.

The following steps show how to analyze a stream stall, starting from the Performance Metrics view in the Vitis IDE.

  1. In the Performance Metrics view, select Stream Stall Time (%) to view stream stalls across all tiles. Identify the tile(s) to be analyzed.
    Note: Objects in the Performance Metrics view can be cross-probed with the Trace view, Graph view, and Array view. For example, selecting a tile in the Performance Metrics view highlights the same tile in the Trace view, which helps you locate it quickly.


  2. Select the Trace view.

  3. Select the Stream Stalls view. In the Stalls view, each stream stall displays the following information. Objects shown in blue can be cross-probed with other views by clicking them.
    NAME
    The name of the stream stall is SS_<NUM>. The earlier the stall happens, the smaller the number. The number is unique across all types of stalls.
    Stalled Tile
    The AI Engine tile that contains the stalled kernel.
    Stalled Kernel
    The kernel that is stalled. It is named <Kernel_function_name>.<Schedule_ID>.<Graph_instance_name>. If it displays as _main, you need to cross-probe to find the real kernel function.
    Start (ps)
    The start time of the stall.
    Duration (ps)
    The duration of the stall.
    PC
    Program counter when the stall happens.
    Stalled Port
    The port of the stalled kernel.
    Related Stalls
    Other stalls that can cause the stall.
    Full Destination
    The port that the stalled kernel cannot write into because it is full.
    Empty Source
    The port that the stalled kernel cannot read from because it is empty.
  4. Click a stream stall in the Stalls view to go to the start of the stall in the Trace view. Right-click the stall and select Filter Trace as needed. After filtering, the Trace view shows only the signals related to the stall and hides unrelated signals. Filtering makes the trace easier to explore when the design is large.
  5. You can click and cross-probe objects in blue in the Stalls view. For example, if you click the kernel in Stalls view, it highlights the kernel in Trace view.
  6. Zoom in and out of the Trace view to explore the stalls. You can gain insight into why a stall occurs by analyzing the following items:
    • The position of the stall
    • The frequency of similar stalls
    • Events before the stall and related stalls (if any)
  7. To clear the previously filtered trace, right-click and select Clear All Filters.
  8. It is helpful to have an overview of the stall path in Graph view. Select Graph view and then select Tile View from the drop-down list.

  9. Select Stalls view and then select Stream Stalls from the drop-down list.
  10. Explore the stream stalls in the Stalls view. Select a stream stall in the Stalls view to display an overview of the stall in the graph. The red path shows where the stall occurs. The occurrence can be from a stalled kernel to full destination port, or from an empty source port to the stalled kernel.
    Tip: If a stream multicasts to multiple destinations and a stream stall occurs because the stream does not have enough FIFO for all destinations, the highlighted stalled kernel and stalled net might not be connected in the graph (each is highlighted in red separately). In this case, analyze all destinations of the multicast stream as a whole. The following figure shows an example of a multicast stream stall in the Graph view.


  11. You can open graph source code or kernel source code from the Graph view or Array view. Select the kernel instance by clicking the kernel object in the Stalls view or clicking the kernel in the Graph view.

  12. Right-click the kernel instance in the Graph view, and select either Goto Graph Source or Goto Kernel Source. This selection opens either the graph or kernel source code.
  13. Correlate the graph source code and kernel source code with the stalls analyzed. Edit the source code as needed.
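The analysis in step 6 (position of the stall, frequency of similar stalls) can also be done outside the GUI once the stall rows are written down. The following is a minimal sketch, assuming you have copied rows from the Stalls view by hand into records. The `StreamStall` type, the helper names, and every sample value are hypothetical, and this guide does not describe an export format. The sketch splits the Stalled Kernel name into its three documented parts and groups stalls by kernel and PC to show which stall site costs the most time.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class StreamStall:
    # Hypothetical record; fields mirror columns of the Stalls view.
    name: str            # SS_<NUM>
    stalled_tile: str
    stalled_kernel: str  # <Kernel_function_name>.<Schedule_ID>.<Graph_instance_name>
    start_ps: int
    duration_ps: int
    pc: int


def split_kernel_name(stalled_kernel):
    """Split the Stalled Kernel string into (function, schedule_id, graph_instance).

    A name without three dot-separated parts (for example "_main") is
    returned unchanged with None for the missing parts.
    """
    parts = stalled_kernel.split(".", 2)
    if len(parts) < 3:
        return stalled_kernel, None, None
    return tuple(parts)


def summarize(stalls):
    """Group stalls by (kernel, PC); return groups sorted by total stalled time."""
    groups = defaultdict(lambda: {"count": 0, "total_ps": 0, "first_start_ps": None})
    for s in stalls:
        g = groups[(s.stalled_kernel, s.pc)]
        g["count"] += 1
        g["total_ps"] += s.duration_ps
        if g["first_start_ps"] is None or s.start_ps < g["first_start_ps"]:
            g["first_start_ps"] = s.start_ps
    # Largest total stall time first.
    return sorted(groups.items(), key=lambda kv: kv[1]["total_ps"], reverse=True)


if __name__ == "__main__":
    # Made-up sample rows, for illustration only.
    sample = [
        StreamStall("SS_0", "tile_a", "fir_a.0.mygraph.k1", 1000, 400, 0x120),
        StreamStall("SS_1", "tile_b", "fir_b.1.mygraph.k2", 1500, 100, 0x2A0),
        StreamStall("SS_2", "tile_a", "fir_a.0.mygraph.k1", 3000, 600, 0x120),
    ]
    for (kernel, pc), stats in summarize(sample):
        print(split_kernel_name(kernel)[0], hex(pc), stats)
```

A stall site that recurs at the same PC with a large total duration is usually a better starting point for the source-code correlation in step 13 than a single long stall.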

The following table lists some possible scenarios that can cause a stream stall. The table also provides possible solutions.

Table 1. Stream Stall Scenarios and Solutions
| Source | Destination | Stall Type | Possible Solution | Notes |
|---|---|---|---|---|
| Stream | Stream | Stream stall | • Increase FIFO depth. See FIFO Depth Constraints. • Adjust stream read and write instructions in source or destination kernels. | |
| Stream | Multiple streams | Stream stall | | Multicast |
| Stream | Multiple streams of multiple kernels in same AI Engine | Stream stall | • Put the kernels into different AI Engines. • Add enough FIFO to the streams to the kernels. | Multicast |
| Multiple streams | Multiple streams | Stream stall | • Adjust instructions to match between different streams. • Increase FIFO depth (ssFIFO or DMA FIFO). | |
| PLIO | Stream | Stream stall | Maximize AI Engine-PL interface bandwidth. For example, use a 64-bit interface at the highest PL frequency (1/2 of the AI Engine frequency), or use a 128-bit interface (which uses two 64-bit channels). See AI Engine-PL Interface Performance in AI Engine Kernel and Graph Programming Guide (UG1079). | |
| Stream | PLIO | Stream stall | Same as the preceding row (PLIO to Stream). | |
| Stream (32 bits per iteration) | PLIO | Stream stall | Send TLAST for each 32 bits. See AI Engine-PL Interface Performance in AI Engine Kernel and Graph Programming Guide (UG1079). | |
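The PLIO rows in the table recommend a 64-bit interface at 1/2 of the AI Engine frequency. A short arithmetic sketch shows why that pairing is a natural target. It rests on three assumptions that are not stated in this section: a 32-bit stream that moves one word per AI Engine clock cycle, an AI Engine clock of 1250 MHz chosen purely as an illustrative value, and a 128-bit interface running at 1/4 of the AI Engine frequency. The function name is hypothetical. Check the actual clock and interface settings for your device and design.

```python
def interface_bandwidth_mbytes_per_s(width_bits, freq_mhz):
    """Peak bandwidth in MB/s: bytes per transfer times million transfers per second."""
    return width_bits / 8 * freq_mhz


# Assumed AI Engine clock, for illustration only.
AIE_FREQ_MHZ = 1250.0

# Assumed: a 32-bit stream transferring one word per AI Engine cycle.
stream_bw = interface_bandwidth_mbytes_per_s(32, AIE_FREQ_MHZ)

# PL-side interface options at fractions of the AI Engine clock.
plio_32_half = interface_bandwidth_mbytes_per_s(32, AIE_FREQ_MHZ / 2)
plio_64_half = interface_bandwidth_mbytes_per_s(64, AIE_FREQ_MHZ / 2)
plio_128_quarter = interface_bandwidth_mbytes_per_s(128, AIE_FREQ_MHZ / 4)

if __name__ == "__main__":
    print("stream          :", stream_bw)
    print("32-bit  @ 1/2 f :", plio_32_half)
    print("64-bit  @ 1/2 f :", plio_64_half)
    print("128-bit @ 1/4 f :", plio_128_quarter)
```

Under these assumptions, a 32-bit PL interface at half the AI Engine clock supplies only half of what the stream can consume, so the kernel stalls on an empty source. The 64-bit interface at the same PL clock matches the stream rate.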