You can compile your AI Engine design with performance counters that can be read and collected at run time while the design is executing in hardware. These counters are programmed in the hardware to gather the following statistics for each active AI Engine in your design:
- Active Cycles – the total number of clock cycles during which a tile has been active
- Stall Cycles – the total number of clock cycles during which a tile has been stalled, for any of four reasons: memory, stream, cascade, or lock
To include these counters in your design, compile it with the following option:
aiecompiler --aie-heat-map
When these counters are in your design, you can turn on their capture at run time by adding the following setting to an xrt.ini file:
[Debug]
aie_profile = true
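If the xrt.ini file is generated by a script rather than written by hand, the setting above can be produced with a few lines of Python. This is a minimal sketch using only the standard library. The helper name render_xrt_ini is hypothetical, and the assumption that a value of false turns capture off is not stated in this section.

```python
import configparser
import io


def render_xrt_ini(enable_aie_profile: bool = True) -> str:
    """Return xrt.ini text containing the [Debug] aie_profile setting.

    Hypothetical helper for illustration only. Treating "false" as
    "capture off" is an assumption, not something this section states.
    """
    config = configparser.ConfigParser()
    config.optionxform = str  # keep the key spelled exactly as given
    config["Debug"] = {"aie_profile": "true" if enable_aie_profile else "false"}

    buffer = io.StringIO()
    config.write(buffer)  # emits "key = value" lines under "[Debug]"
    return buffer.getvalue()


if __name__ == "__main__":
    # Print the generated text; redirect it to xrt.ini if desired.
    print(render_xrt_ini(), end="")
```

Writing the returned string to a file named xrt.ini reproduces the two lines shown above.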
The data can then be viewed and analyzed in the Vitis analyzer in several ways, including a heat map, a histogram, and a profile summary. Analyzing this profile helps you determine the active and stall times associated with each AI Engine, and pinpoint any AI Engine whose performance might not be optimal as the design runs on hardware. The following sections include a more detailed description of the two profile views supported in the Vitis analyzer.
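As a rough illustration of the comparison this profile data supports, the sketch below ranks tiles by how much they stall relative to how long they are active, and reports the dominant stall type for each. Everything here is an assumption made for illustration: the TileCounters structure, the tile names, and the cycle counts are invented, and the code does not read any file produced by the tools. The stall-to-active ratio is used only as a ranking heuristic, because this section does not say how the two counters relate.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class TileCounters:
    """Hypothetical in-memory record of one tile's profile counters."""
    tile: str
    active_cycles: int
    stall_cycles: Dict[str, int]  # keys: memory, stream, cascade, lock


def stall_ratio(counters: TileCounters) -> float:
    """Total stall cycles divided by active cycles (0.0 if never active)."""
    if counters.active_cycles == 0:
        return 0.0
    return sum(counters.stall_cycles.values()) / counters.active_cycles


def dominant_stall(counters: TileCounters) -> str:
    """Name of the stall type with the most cycles for this tile."""
    return max(counters.stall_cycles, key=counters.stall_cycles.get)


def rank_by_stall(tiles: List[TileCounters]) -> List[TileCounters]:
    """Tiles sorted from highest stall ratio to lowest."""
    return sorted(tiles, key=stall_ratio, reverse=True)


# Made-up sample values, not measurements from any real design.
SAMPLE = [
    TileCounters("tile_a", 1000, {"memory": 50, "stream": 100, "cascade": 0, "lock": 50}),
    TileCounters("tile_b", 1000, {"memory": 10, "stream": 20, "cascade": 0, "lock": 10}),
    TileCounters("tile_c", 800, {"memory": 0, "stream": 400, "cascade": 0, "lock": 0}),
]

if __name__ == "__main__":
    for counters in rank_by_stall(SAMPLE):
        print(f"{counters.tile}: ratio={stall_ratio(counters):.2f}, "
              f"dominant stall={dominant_stall(counters)}")
```

A tile near the top of this ranking is a candidate for closer inspection in the Vitis analyzer views.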