From the Performance Metrics view, you can determine whether a stream stall needs to be analyzed. This view also shows the tile or tiles that are causing the stall.
The following steps illustrate how a stream stall can be analyzed starting in the Performance Metrics tab in the Vitis IDE.
- In the Performance Metrics view, select Stream Stall Time (%) to view stream stalls across all tiles. Identify the tile(s) to be analyzed.
  Note: Objects in Performance Metrics view can be cross-probed with Trace view, Graph view, and Array view. For example, selecting the tile in the Performance Metrics view to highlight the tile in Trace view can help quickly locate the tile.
- Select the Trace view.
- Select the Stream Stalls view. In the Stalls view, each stream stall displays the following information. You can cross-probe objects shown in blue with other views by clicking them.
- NAME: The name of the stream stall is SS_<NUM>. The earlier the stall happens, the smaller the number. The number is unique across all types of stalls.
- Stalled Tile: The AI Engine tile that contains the stalled kernel.
- Stalled Kernel: The kernel that is stalled. It is named <Kernel_function_name>.<Schedule_ID>.<Graph_instance_name>. If it displays as _main, you need to cross-probe to find the real kernel function.
- Start (ps): The start time of the stall.
- Duration (ps): The duration of the stall.
- PC: Program counter when the stall happens.
- Stalled Port: The port of the stalled kernel.
- Related Stalls: Other stalls that can cause the stall.
- Full Destination: The port that the stalled kernel cannot write into because it is full.
- Empty Source: The port that the stalled kernel cannot read from because it is empty.
- Click a stream stall in Stalls view to go to the start of the stall in the Trace view. Right-click the stall and select Filter Trace as needed. After filtering, the Trace view shows only the signals related to the stall and hides unrelated signals. Filtering makes the trace easier to explore when the design is large.
- You can click and cross-probe objects in blue in the Stalls view. For example, if you click the kernel in Stalls view, it highlights the kernel in Trace view.
- Zoom in and out of the Trace view to explore the stalls. You can gain insight into why a stall occurs by analyzing the following items:
- The position of the stall
- The frequency of similar stalls
- Events before the stall and related stalls (if any)
- To clear the previously filtered trace, right-click and select Clear All Filters.
- It is helpful to have an overview of the stall path in Graph view. Select Graph view and then select Tile View from the drop-down list.
- Select Stalls view and then select Stream Stalls from the drop-down list.
- Explore the stream stalls in the Stalls view. Select a stream stall in the Stalls view to display an overview of the stall in the graph. The red path shows where the stall occurs. The path can run from a stalled kernel to a full destination port, or from an empty source port to the stalled kernel.
  Tip: If a stream multicasts to multiple destinations and a stream stall occurs because the stream does not have enough FIFO for all destinations, the highlighted stalled kernel and stalled net might not be connected; each is highlighted separately in red. This means that you must analyze all destinations of the multicast stream as a whole for the stream stall. The following figure shows an example of multicast stream stall in the Graph view.
- You can open graph source code or kernel source code from the Graph view or Array view. Select the kernel instance by clicking the kernel object in the Stalls view or clicking the kernel in the Graph view.
- Right-click the kernel instance in the Graph view, and select either Goto Graph Source or Goto Kernel Source. This selection opens either the graph or kernel source code.
- Correlate the graph source code and kernel source code with the stalls analyzed. Edit the source code as needed.
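When a design produces many stalls, it can help to tabulate them outside the IDE before deciding which tile to inspect first. The following is a minimal sketch, assuming the rows of the Stalls view have been copied by hand into Python objects. The `StreamStall` class, its field names, and the sample rows are hypothetical and are not part of any Vitis API or export format. The sketch splits the Stalled Kernel string into the three documented parts and totals the stall duration per tile.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class StreamStall:
    """One row of the Stalls view (field names here are hypothetical)."""
    name: str            # e.g. "SS_0"
    stalled_tile: str    # tile label as shown in the view
    stalled_kernel: str  # <Kernel_function_name>.<Schedule_ID>.<Graph_instance_name>
    start_ps: int
    duration_ps: int


def split_kernel_name(stalled_kernel):
    """Split the Stalled Kernel string into its three documented parts."""
    # Split at most twice so dots inside the graph instance name are kept.
    function, schedule_id, instance = stalled_kernel.split(".", 2)
    return function, schedule_id, instance


def summarize_by_tile(stalls):
    """Return (tile, stall_count, total_duration_ps), longest total first."""
    totals = defaultdict(lambda: [0, 0])
    for stall in stalls:
        totals[stall.stalled_tile][0] += 1
        totals[stall.stalled_tile][1] += stall.duration_ps
    rows = [(tile, count, total) for tile, (count, total) in totals.items()]
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Made-up rows for illustration, not output from a real design.
stalls = [
    StreamStall("SS_0", "24_0", "producer.0.mygraph.k0", 1_000, 400),
    StreamStall("SS_1", "25_0", "consumer.1.mygraph.k1", 1_200, 900),
    StreamStall("SS_2", "24_0", "producer.0.mygraph.k0", 2_500, 300),
]
summary = summarize_by_tile(stalls)
```

Sorting by total duration rather than by count is a design choice: a tile with one long stall is usually a better first candidate than a tile with several very short ones, which matches the advice above to weigh both the position and the frequency of stalls.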
The following table lists some possible scenarios that can cause a stream stall. The table also provides possible solutions.
| Source | Destination | Stall Type | Possible Solution | Notes |
|---|---|---|---|---|
| Stream | Stream | Stream stall | | |
| Stream | Multiple streams | Stream stall | | Multicast |
| Stream | Multiple streams of multiple kernels in same AI Engine | Stream stall | | Multicast |
| Multiple streams | Multiple streams | Stream stall | | |
| PLIO | Stream | Stream stall | | |
| Stream | PLIO | Stream stall | Same as above. | |
| Stream (32 bits per iteration) | PLIO | Stream stall | Send TLAST for each 32 bits. See AI Engine-PL Interface Performance in AI Engine Kernel and Graph Programming Guide (UG1079). | |
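To make the Full Destination case more concrete, the following toy model counts the cycles in which a producer cannot write because the FIFO in front of a slower consumer is full. It is only a sketch under simplifying assumptions: one producer that tries to write one word per cycle, one consumer that reads one word every `read_interval` cycles, and a single FIFO of `fifo_depth` words. The function name and parameters are hypothetical, and the model does not reflect how the AI Engine simulator or the hardware schedules stream traffic.

```python
def count_full_destination_stalls(words, fifo_depth, read_interval):
    """Count producer stall cycles while writing `words` words into a FIFO."""
    occupancy = 0     # words currently held in the FIFO
    written = 0       # words the producer has written so far
    stall_cycles = 0  # cycles in which the producer found the FIFO full
    cycle = 0
    while written < words:
        # The consumer drains one word every read_interval cycles.
        if cycle % read_interval == 0 and occupancy > 0:
            occupancy -= 1
        # The producer tries to write one word on every cycle.
        if occupancy < fifo_depth:
            occupancy += 1
            written += 1
        else:
            stall_cycles += 1
        cycle += 1
    return stall_cycles


# An 8-word burst into a consumer that reads every second cycle.
shallow = count_full_destination_stalls(words=8, fifo_depth=2, read_interval=2)
deep = count_full_destination_stalls(words=8, fifo_depth=8, read_interval=2)
```

In this model the shallow FIFO stalls the producer for part of the burst, while a FIFO deep enough to hold the whole burst does not stall it at all. For a burst much longer than the FIFO, extra depth only delays the first stall, because the consumer's read rate remains the limit.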