AMD offers a variety of tools and flows to debug designs running on the Versal AI Engine processors. They cover both functional debug and performance debug of the AI Engine algorithm, across every stage of AI Engine application development. The available tools and flows, and recommendations on when to use each, are outlined below.
Functional debug of the AI Engine kernels typically starts with the x86 simulator in Vitis. This simulator can be used extensively for functional debug, but take care: it does not track program memory size, so a kernel that runs here is not guaranteed to fit in program memory. For the available options and the ways to run the x86 simulator, see the AI Engine Tools and Flows User Guide (UG1076).
Functional debug of the AI Engine kernels can also be performed using the aiesimulator in Vitis. This is a cycle-approximate simulator that can be used extensively for functional debug. It models the Versal device, including the NoC, PS, and PL components, using SystemC. You can use the built-in debugger to step through the AI Engine source code to further debug design functionality. printf is also available, though it might increase the program memory size, which the aiesimulator does track. The Vitis IDE has a debug view that displays registers, variables, available breakpoints, variable-to-register/memory mapping, internal and external memory contents, a disassembly view of the instructions, and an instruction pipeline (pipeline view) for a single AI Engine kernel.
Because the aiesimulator is cycle approximate, it also supports extensive performance debug of the design. The simulator offers profiling and event trace capabilities, which can be used to analyze stalls and root-cause performance issues in the design.
In addition, the aiesimulator has options to detect and report out-of-bounds access violations in the kernel code. For the available options and the ways to run the aiesimulator, see the AI Engine Tools and Flows User Guide (UG1076).
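As a rough illustration of how the profiling, trace, and memory-check options combine, the sketch below assembles an aiesimulator command line in Python. The option names (--pkg-dir, --profile, --dump-vcd, --enable-memory-check) are given as recalled from UG1076 and should be checked against your tool version. The helper name aiesim_command and the ./Work directory are assumptions made for illustration. The script only builds the argument list; it does not launch the simulator.

```python
import shlex


def aiesim_command(pkg_dir="./Work", profile=False, vcd_name=None, memory_check=False):
    """Build an aiesimulator argument list (hypothetical helper, not part of Vitis)."""
    args = ["aiesimulator", f"--pkg-dir={pkg_dir}"]
    if profile:
        # Collect per-kernel profiling data during simulation.
        args.append("--profile")
    if vcd_name:
        # Dump a VCD file that can be used for event trace analysis.
        args.append(f"--dump-vcd={vcd_name}")
    if memory_check:
        # Ask the simulator to report out-of-bounds memory accesses.
        args.append("--enable-memory-check")
    return args


cmd = aiesim_command(profile=True, vcd_name="trace", memory_check=True)
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run on a machine where the Vitis environment has been set up.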
When the AI Engine kernel is integrated into the rest of the Versal device using a Versal platform via v++, you can run the hardware emulator. This emulator is cycle accurate and can simulate PL logic using either RTL or SystemC simulation models. Run this simulation step after the v++ link step, with the hw_emu target specified on the v++ link command line. You can also debug the AI Engine kernel with the debugger, stepping through both the kernel code and the PS host code. This step uses QEMU to simulate the PS and a SystemC simulation model of the NoC. More information about the available options and the ways to run the hardware emulator can be found in the following:
- See this link in the AI Engine Tools and Flows User Guide (UG1076)
- See this link in the AI Engine Tools and Flows User Guide (UG1076)
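To make the link step described above concrete, here is a minimal Python sketch that builds a v++ link command with the hw_emu target. The --link, --target, --platform, and -o options are standard v++ options as far as we are aware, but confirm them for your release. The helper name vpp_link_command and the file names my_platform.xpfm, pl_kernel.xo, libadf.a, and design.xsa are placeholders, not names taken from any particular design.

```python
import shlex


def vpp_link_command(platform, inputs, output="design.xsa", target="hw_emu"):
    """Build a v++ link argument list (hypothetical helper, not part of Vitis)."""
    if target not in ("hw_emu", "hw"):
        raise ValueError(f"unsupported target: {target}")
    return ["v++", "--link", "--target", target,
            "--platform", platform, *inputs, "-o", output]


# Placeholder platform and input files; substitute the ones from your design.
link_cmd = vpp_link_command("my_platform.xpfm", ["pl_kernel.xo", "libadf.a"])
print(shlex.join(link_cmd))
```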
When you are ready to take the design to hardware, you have a variety of debugging and profiling tools at your disposal. You can obtain profiling data from a hardware run either by using runtime event APIs in your PS host code or by enabling the performance counters built into the hardware with a compile-time option. Analyzing this data helps you gauge the efficiency of the kernels, see the stall and active times associated with each AI Engine, and pinpoint any AI Engine kernel whose performance might not be optimal. It also lets you collect data on design latency, throughput, and bandwidth. Detailed heatmaps and histograms of the profile are available in the Vitis Analyzer tool. For the available options and the ways to analyze design performance in hardware, see the AI Engine Tools and Flows User Guide (UG1076).
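Throughput figures of this kind come down to converting a cycle count into time. As a minimal sketch, assuming a profiling counter reports how many AI Engine clock cycles it took to transfer a known number of bytes on a port, the conversion could look like the following. The function name, the 1.25 GHz clock, and the sample values are illustrative assumptions, not measured data.

```python
def throughput_bytes_per_sec(bytes_transferred, cycles, clock_hz):
    """Convert a cycle count for a transfer into bytes per second."""
    if cycles <= 0:
        raise ValueError("cycle count must be positive")
    seconds = cycles / clock_hz  # elapsed time for the transfer
    return bytes_transferred / seconds


# Hypothetical values: 4096 bytes moved in 1280 cycles at a 1.25 GHz clock.
rate = throughput_bytes_per_sec(4096, 1280, 1.25e9)
print(f"{rate / 1e9:.2f} GB/s")
```

The same arithmetic applies to latency: divide the measured cycle count by the clock frequency to get seconds.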
You can also debug the AI Engine kernel source in hardware, which can help track down functional issues seen on the device. For details on running the debugger in the Vitis IDE and the features associated with it, see the AI Engine Tools and Flows User Guide (UG1076).