Deadlock Detection - 2023.2 English

Vitis Tutorials: AI Engine (XD100)

Document ID: XD100
Release Date: 2024-03-05
Version: 2023.2 English

AI Engine designs can run into simulator hangs. Common causes include insufficient input data for the requested number of graph iterations, a mismatch between production and consumption of stream data, a cyclic dependency involving streams, cascade streams, or asynchronous buffers, and the wrong order of blocking protocol calls (acquiring async buffers, reading from or writing to streams).

This topic walks you through a practical deadlock scenario during AI Engine simulation and the simulator options that help you debug it.

  1. Open src/kernels/data_shuffle.cc and comment out line 24.

  2. Compile the design by rebuilding the [aie_component] under AIE SIMULATOR/HARDWARE.

  3. Run the AI Engine simulation by selecting AIE SIMULATOR/HARDWARE -> Run, and observe the hang.

  4. Wait a few seconds to confirm the hang. Then click the icon in the bottom right corner that opens the view of background operations in progress, and kill the simulation process.

  5. The AI Engine simulator provides an option to exit the simulation if all active cores remain in the stalled state for a specified time period (in ns).

  6. For example, add --hang-detect-time=60 under Run configurations -> Additional Arguments, and rerun the simulation. The simulation now exits cleanly and prints the following information in the console.

    Enabling core(s) of graph mygraph
    WARNING: All the cores are in stalled state at T=636000.000000 ps for a period of 60ns
    |---------------- Core Stall Status ----------------|
    (24,1) -> Lock stall ->  Lock_East detected at T=571600.000000 ps
    (25,1) -> Lock stall ->  Lock_East detected at T=575600.000000 ps
    (25,2) -> Lock stall ->  Lock_South detected at T=574000.000000 ps
    |---------------------------------------------------|
    WARNING: This simulation is running with hang detection time of 60ns, to modify the hang detect time please rerun simulation with -hang-detect-time=<value in NS> option 
    Exiting!
    
  7. Revert the changes in the source file to exercise other debug features.

For more information on visualizing the deadlock using stream stalls and lock stalls in the Vitis Analyzer, see Visualizing Deadlock in the Vitis Analyzer.
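When the simulation runs in a script or a regression flow, it can help to pull the stalled tiles out of the console text automatically. The following Python sketch is one possible way to do that, assuming the Core Stall Status lines always follow the layout shown in the console output above. The names parse_stalls and STALL_LINE are hypothetical helpers introduced here for illustration; they are not part of the simulator.

```python
import re

# Pattern for one line of the Core Stall Status table, for example:
#   (24,1) -> Lock stall ->  Lock_East detected at T=571600.000000 ps
# Assumption: the layout matches the sample output; other releases may differ.
STALL_LINE = re.compile(
    r"\((?P<a>\d+),(?P<b>\d+)\)\s*->\s*(?P<kind>.+?)\s*->\s*"
    r"(?P<detail>\S+)\s+detected at T=(?P<time_ps>[\d.]+)\s*ps"
)


def parse_stalls(console_text):
    """Return one dict per stalled tile found in the console text."""
    stalls = []
    for line in console_text.splitlines():
        match = STALL_LINE.search(line)
        if match is None:
            continue  # Not a stall status line.
        stalls.append({
            "tile": (int(match["a"]), int(match["b"])),
            "kind": match["kind"],
            "detail": match["detail"],
            "time_ps": float(match["time_ps"]),
        })
    return stalls


# Sample text copied from the console output shown above.
SAMPLE = """\
WARNING: All the cores are in stalled state at T=636000.000000 ps for a period of 60ns
|---------------- Core Stall Status ----------------|
(24,1) -> Lock stall ->  Lock_East detected at T=571600.000000 ps
(25,1) -> Lock stall ->  Lock_East detected at T=575600.000000 ps
(25,2) -> Lock stall ->  Lock_South detected at T=574000.000000 ps
|---------------------------------------------------|
"""

if __name__ == "__main__":
    for stall in parse_stalls(SAMPLE):
        print(stall["tile"], stall["kind"], stall["detail"], stall["time_ps"])
```

Sorting the result by time_ps shows which tile stalled first, which is often a useful starting point when tracing the cause of the deadlock.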

NOTE:

  1. Select the hang detect time appropriately, based on the complexity of the kernels and the graph, and on the number of iterations the graph runs.

  2. If the simulator option --simulation-cycle-timeout=cycles is also specified, make sure that the --hang-detect-time value is less than the timeout. Note that the timeout is specified in cycles, not in ns.

  3. To convert cycles to ns, you need to take the AI Engine clock frequency into account.
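The conversion in the notes above is simple arithmetic: one cycle lasts 1000 / f ns when the clock frequency f is given in MHz. The Python sketch below illustrates the check under assumed values. The 1250 MHz clock and the 10000-cycle timeout are placeholders chosen for the example, not values taken from this tutorial, and the function names are hypothetical.

```python
def cycles_to_ns(cycles, clock_mhz):
    """Convert a cycle count to nanoseconds for a clock given in MHz."""
    if clock_mhz <= 0:
        raise ValueError("clock_mhz must be positive")
    # One cycle lasts 1000 / clock_mhz nanoseconds.
    return cycles * 1000.0 / clock_mhz


def hang_detect_fits(hang_detect_ns, timeout_cycles, clock_mhz):
    """Return True when the hang detect time is shorter than the cycle timeout."""
    return hang_detect_ns < cycles_to_ns(timeout_cycles, clock_mhz)


if __name__ == "__main__":
    # Assumed values for illustration only: substitute the clock frequency
    # and timeout used by your own design.
    clock_mhz = 1250.0
    timeout_cycles = 10000
    print("timeout in ns:", cycles_to_ns(timeout_cycles, clock_mhz))
    print("60 ns hang detect fits:", hang_detect_fits(60, timeout_cycles, clock_mhz))
```

With these assumed numbers the timeout corresponds to 8000 ns, so a hang detect time of 60 ns is well inside it.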