AI Engine Simulation-Based Profiling - 2025.2 English - UG1701

Embedded Design Development Using Vitis User Guide (UG1701)

Document ID: UG1701
Release Date: 2025-11-20
Version: 2025.2 English

Profiling Data Generation

In the simulation framework, the AI Engine simulator can generate a profiling report for the complete application. To generate this report, run the simulator with the --profile flag.

aiesimulator --pkg-dir=Work --profile

Text files and XML files are generated in the aiesimulator_output directory. For each tile, identified by its column C and row R, two types of files are generated. The *_funct file reports the number of calls and the number of cycles for each function. The *_instr file reports the same information down to the assembly code level. To visualize the report, use the Analysis View of the Vitis Unified IDE.

vitis -a aiesimulator_output/default.aierun_summary

The Profile tab opens the Profile report, which is organized into the following sections.

Summary
Reports the total cycle count, total instruction count, and program size in memory.
Function Reports
Shows the following key indicators for the functions in the graphs.
Number of calls
Reports the number of times the function is executed.
Total function time (cycles and %)
Reports the function execution time (in cycles and as a percent). This is the time required to execute the code within a function, exclusive of any calls to its descendants.
Total function + descendant time (cycles and %)
Reports the total time required to execute the code within a function and in any function it calls (in cycles and as a percent). Descendant functions are the functions called by the function whose profile information is being reported.
Note: The time includes the time spent in the function itself as well as the time spent in all the functions it calls, directly or indirectly.
Min/Avg/Max function time (cycles)
Reports the minimum/average/maximum function execution time (in cycles).
Min/Avg/Max function + descendant time (cycles)
Reports the minimum/average/maximum execution time of the function together with its descendant functions (in cycles).
Program counter Low/High
Reports the lowest and highest program counter value for a specific function.
Profile Details
Shows the assembly code, function by function, annotated with profiling details.

For more details on Profile Details and debugging performance issues, refer to the AI Engine Simulation-Based Profiling chapter in the AI Engine Tools and Flows User Guide (UG1076).
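
The difference between "Total function time" and "Total function + descendant time" can be illustrated with a small sketch. The CallNode structure, the function names, and the cycle counts below are hypothetical and chosen only for illustration. They do not reflect the simulator's report format or its internal implementation.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical call-tree node used only for illustration.
struct CallNode {
    std::string name;
    std::uint64_t self_cycles;      // cycles spent in the function's own code
    std::vector<CallNode> callees;  // functions called by this function
};

// "Total function time": cycles in the function itself, excluding descendants.
std::uint64_t function_time(const CallNode& node) {
    return node.self_cycles;
}

// "Total function + descendant time": own cycles plus the cycles of every
// function called directly or indirectly.
std::uint64_t function_plus_descendant_time(const CallNode& node) {
    std::uint64_t total = node.self_cycles;
    for (const CallNode& callee : node.callees) {
        total += function_plus_descendant_time(callee);
    }
    return total;
}

// Share of the whole run expressed as a percentage.
double percent_of_total(std::uint64_t cycles, std::uint64_t total_cycles) {
    if (total_cycles == 0) {
        return 0.0;
    }
    return 100.0 * static_cast<double>(cycles) / static_cast<double>(total_cycles);
}

// Made-up example: main (100 cycles) calls filter (300), which calls mac (600).
CallNode example_tree() {
    CallNode mac{"mac", 600, {}};
    CallNode filter{"filter", 300, {mac}};
    return CallNode{"main", 100, {filter}};
}
```

In this made-up tree, main has a function time of 100 cycles (10% of the 1000-cycle run) but a function + descendant time of 1000 cycles, because the time spent in filter and mac is attributed to it as well.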

Using Printf for Basic Debug

The simplest form of tracing is to use a formatted printf() statement in the code to print debug messages. Visual inspection of intermediate values, addresses, and so on can help you understand the progress of program execution. No include files other than the standard C/C++ header (stdio.h) are needed to use printf(). You can add printf() statements to your code to be processed during simulation or hardware emulation, and remove them or comment them out for hardware builds.

Adding printf() statements to your AI Engine kernel code increases the compiled size of the AI Engine program. Make sure that the compiled size of your kernel code does not exceed the per-AI Engine processor memory limit of 16 KB.
Important: You must use the aiesimulator --profile command to enable the printf() execution during a simulator run. If --profile is not specified, the printf() function is ignored.
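
One way to keep printf() statements available for simulation while removing them from hardware builds is to wrap them in a macro. The following is a minimal sketch under stated assumptions: the KERNEL_DEBUG_PRINTF switch, the KERNEL_DEBUG macro, and the scale_samples function are hypothetical names, and the function is plain C++ standing in for a kernel body rather than code written against the AI Engine API.

```cpp
#include <cstdio>

// Hypothetical build switch: define KERNEL_DEBUG_PRINTF for simulation or
// hardware emulation builds; leave it undefined for hardware builds so the
// debug statements compile away and do not add to the program size.
#ifdef KERNEL_DEBUG_PRINTF
#define KERNEL_DEBUG(...) std::printf(__VA_ARGS__)
#else
#define KERNEL_DEBUG(...) ((void)0)
#endif

// Stand-in for a kernel body: multiplies each input sample by gain,
// stores the result, and returns the sum of the outputs.
int scale_samples(const int* in, int* out, int count, int gain) {
    int sum = 0;
    for (int i = 0; i < count; ++i) {
        out[i] = in[i] * gain;
        sum += out[i];
        // Print intermediate values for visual inspection.
        KERNEL_DEBUG("i=%d in=%d out=%d\n", i, in[i], out[i]);
    }
    KERNEL_DEBUG("sum=%d\n", sum);
    return sum;
}
```

With the switch undefined, the function behaves identically but emits no output. In an actual simulator run, the messages would also appear only when aiesimulator is started with --profile, as noted above.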

A separate driver and binary are used for this functionality so that the main simulator remains as fast as possible. The debug simulator driver produces a per-tile profile report under the output directory, which gives detailed cycle-level statistics of kernel execution. In addition, the --profile option generates a run_summary file in the ./aiesimulator_output folder, which can be viewed as described in the AI Engine Tools and Flows User Guide (UG1076).