Embedded designs that include an AI Engine as an accelerator contain a dataflow graph that implements a specific algorithm on the AI Engine. The dataflow graph is compiled using aiecompiler, and the output binary is run on hardware. The dataflow graph is verified using one of the following methods:
- aiesimulator
  - Uses a stimulus generator and checker.
- Hardware emulation flow
  - Includes other system-level IP, such as CIPS, NoC, DDR memory, and RTL IP.
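As an illustration of the stimulus generator and checker idea, the following is a minimal sketch, not the tool's own flow. It assumes whitespace-separated integer samples in text form and an output stream that may contain marker lines beginning with `T` (timestamps) or `TLAST`, which the parser skips. The function names and the `golden_model` reference (a plain multiply by two) are hypothetical stand-ins for the kernel under test.

```python
import random


def make_stimulus(num_samples, seed=0):
    """Generate reproducible pseudo-random int16 samples."""
    rng = random.Random(seed)
    return [rng.randint(-32768, 32767) for _ in range(num_samples)]


def format_stimulus(samples, per_line=4):
    """Render samples as text, per_line whitespace-separated values per row."""
    rows = [samples[i:i + per_line] for i in range(0, len(samples), per_line)]
    return "\n".join(" ".join(str(v) for v in row) for row in rows) + "\n"


def golden_model(samples):
    """Hypothetical reference model: stands in for the kernel's algorithm."""
    return [2 * v for v in samples]


def parse_output(text):
    """Parse output text into integers, skipping assumed marker lines."""
    values = []
    for line in text.splitlines():
        tokens = line.split()
        # Assumption: marker lines start with "T" (timestamp) or "TLAST".
        if not tokens or tokens[0] in ("T", "TLAST"):
            continue
        values.extend(int(t) for t in tokens)
    return values


def check(actual, expected):
    """Return a list of (index, actual, expected) for every mismatch."""
    if len(actual) != len(expected):
        raise ValueError(
            f"length mismatch: got {len(actual)}, expected {len(expected)}"
        )
    return [(i, a, e) for i, (a, e) in enumerate(zip(actual, expected)) if a != e]


if __name__ == "__main__":
    stimulus = make_stimulus(8)
    expected = golden_model(stimulus)
    # Stand-in for a captured output file: one marker line, then the data.
    captured = "T 100 ns\n" + format_stimulus(expected)
    mismatches = check(parse_output(captured), expected)
    print("PASS" if not mismatches else f"FAIL: {mismatches}")
```

In a real flow, the text from `format_stimulus` would be written to the graph's input data file, and `parse_output` would read the file the simulator produces. A fixed seed keeps the stimulus reproducible, so a failure in simulation can be replayed with the same data.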
After verification, the kernels are expected to work on hardware without further debugging. However, functionality and performance on hardware might still differ from the simulation environment, either because timing differs on hardware or because the simulation model does not match hardware behavior exactly (for example, the AI Engine simulator is not cycle accurate with respect to the AI Engine hardware).
You can enable event trace profiling on Adaptive Data Flow (ADF) graphs to generate a PC event trace or an execution trace. You can view the event trace data in Vitis Analyzer to check for specific stalls while the kernel is running. You can also use event trace to check whether a stall on incoming or outgoing streams is degrading the performance of the stream switch network.