As discussed in Enabling Profiling in Your Application,
a number of --profile options let you enable profiling of
application and kernel events during runtime execution. These
options capture profile data for data traffic between the kernel and host,
kernel stalls, and the execution times of kernels and compute units (CUs), and
they monitor activity in Versal AI Engines.
The --profile option in v++ also requires the addition of the profile=true statement to the xrt.ini file. Refer to xrt.ini File.
The --profile options can also be specified in a configuration file under the [profile]
section header, for example:
[profile]
data=all:all:all # Monitor data on all kernels and CUs
data=k1:all:all # Monitor data on all instances of kernel k1
data=k1:cu2:port3 # Specific CU master
data=k1:cu2:port3:counters # Specific CU master (counters only, no trace)
stall=all:all # Monitor stalls for all CUs of all kernels
stall=k1:cu2 # Stalls only for cu2
exec=all:all # Monitor execution times for all CUs
exec=k1:cu2 # Execution times only for cu2
aie=all # Monitor all AIE streams
aie=DataIn1 # Monitor the specific input stream in the ADF graph
aie=M02_AXIS # Monitor specific stream interface
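As a minimal sketch of how these pieces might fit together, the following shell snippet writes a small [profile] configuration file and an xrt.ini file. The file name profile.cfg, the choice of entries, and the [Debug] section name in xrt.ini are assumptions for illustration; check xrt.ini File for the exact section your XRT version expects.

```shell
# Write a hypothetical v++ configuration file with two [profile] entries
# taken from the example above.
cat > profile.cfg <<'EOF'
[profile]
data=k1:cu2:port3:counters
exec=all:all
EOF

# Enable profiling at run time. The [Debug] section name is an assumption.
cat > xrt.ini <<'EOF'
[Debug]
profile=true
EOF

# The configuration file would then be passed at link time with --config,
# with the remaining link arguments omitted here:
#   v++ --link --config profile.cfg ...
```

Keeping the profile settings in a configuration file rather than on the command line makes it easier to reuse the same monitoring setup across builds.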
The various options of the command are described below:
--profile.aie <arg>
Enables profiling of AI Engine
streams in adaptive data flow (ADF) applications, where <arg> is:
<ADF_graph_argument|pin_name|all>
- <ADF_graph_argument>: Specifies an argument name from the ADF graph application.
- <pin_name>: Indicates a port on an AI Engine kernel.
- <all>: Indicates monitoring all stream connections in the ADF application.
For example, to monitor the DataIn1 input stream use the following
command:
v++ --link --profile.aie:DataIn1
--profile.data <arg>
Enables monitoring of data ports through the monitor IPs. This option needs to be specified during linking.
Where <arg> is:
[<kernel_name>|all]:[<cu_name>|all]:[<interface_name>|all](:[counters|all])
- [<kernel_name>|all] specifies the kernel to apply the command to. Use the keyword all to apply the monitoring to all existing kernels, compute units, and interfaces with a single option.
- [<cu_name>|all] specifies a CU to apply the command to when <kernel_name> has been specified, or all to apply it to every CU of the kernel.
- [<interface_name>|all] specifies the interface on the kernel or CU to monitor for data activity, or all to monitor every interface.
- [counters|all] is an optional argument that defaults to all when not specified. Use counters to restrict the information gathering to counters only for larger designs; all also includes the collection of actual trace information.
For example, to assign the data profile to all CUs and interfaces
of kernel k1 use the following command:
v++ --link --profile.data:k1:all:all
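The field order of the argument can be sketched with a small shell helper that fills in all for any omitted field. The function name profile_data_arg is hypothetical and not part of v++; it only assembles the option string in the colon-separated form used in the example above.

```shell
# Hypothetical helper (not part of v++): build a --profile.data option string.
# Usage: profile_data_arg [kernel] [cu] [interface] [counters|all]
profile_data_arg() {
  kernel=${1:-all}      # kernel name, or all
  cu=${2:-all}          # compute unit name, or all
  interface=${3:-all}   # interface name, or all
  mode=${4:-}           # optional fourth field: counters or all
  arg="--profile.data:${kernel}:${cu}:${interface}"
  if [ -n "$mode" ]; then
    arg="${arg}:${mode}"
  fi
  printf '%s\n' "$arg"
}

profile_data_arg k1                      # --profile.data:k1:all:all
profile_data_arg k1 cu2 port3 counters   # --profile.data:k1:cu2:port3:counters
```

Leaving the fourth field off matches the documented default of all, so the helper appends it only when a mode is given.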
--profile.exec <arg>
This option records the execution times of the kernel and provides minimum port data collection during the system run. This option needs to be specified during linking.
Execution times are also collected when --profile.data or --profile.stall
is specified. You can specify --profile.exec for
any CUs not covered by data or stall.
The syntax for exec profiling is:
[<kernel_name>|all]:[<cu_name>|all](:[counters|all])
For example, to profile the execution of cu2 for kernel k1 use the following
command:
v++ --link --profile.exec:k1:cu2
--profile.stall
Adds stall monitoring logic to the device binary (.xclbin), which requires the addition of stall ports
on the kernel interface. To facilitate this, the stall option must be specified during both compilation and
linking.
The syntax for stall profiling is:
[<kernel_name>|all]:[<cu_name>|all](:[counters|all])
For example, to monitor stalls of cu2 for kernel k1 use the following
command:
v++ --compile -k k1 --profile.stall ...
v++ --link --profile.stall:k1:cu2 ...
--profile.trace_memory
When building the hardware target (-t=hw), use
this option to specify the type and amount of memory to use for capturing trace
data. You can specify the argument as follows:
<FIFO>:<size>|<MEMORY>[<n>]
This argument specifies the trace buffer memory type for profiling.
- FIFO:<size>: The size is specified in KB. The default is FIFO:8K. The maximum is 4G.
- <MEMORY>[<n>]: Specifies the type and number of the memory resource on the platform. Memory resources for the target platform can be identified with the platforminfo command. Supported memory types include HBM, DDR, PLRAM, HP, ACP, MIG, and MC_NOC. For example, DDR[1].
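A build script could sanity-check the argument before invoking v++. The sketch below is a hypothetical helper, not a v++ feature. It accepts FIFO:<size> or one of the memory types listed above followed by an index in brackets. The assumption that <size> is a number with an optional K, M, or G suffix is inferred from the FIFO:8K default and the 4G maximum. The helper checks only the form of the argument, not the size limit or whether the memory resource exists on the platform.

```shell
# Hypothetical pre-flight check (not part of v++) for a trace_memory argument.
# Assumes <size> is digits followed by an optional K, M, or G suffix.
valid_trace_memory() {
  printf '%s\n' "$1" |
    grep -Eq '^(FIFO:[0-9]+[KMG]?|(HBM|DDR|PLRAM|HP|ACP|MIG|MC_NOC)\[[0-9]+\])$'
}

if valid_trace_memory 'DDR[1]'; then
  echo "DDR[1] looks well formed"
fi
```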