Run on Hardware

Vitis Tutorials: AI Engine Development (XD100)

Document ID
XD100
Release Date
2026-03-27
Version
2025.2 English

The packaging step creates a zipped SD card image that you can write to an SD card with any standard flashing tool, such as balenaEtcher. When you run the application on hardware using XRT on Linux, capture the trace with the following xrt.ini file:

# Debug group for the aie, ps and pl
[Debug]
aie_profile = false
aie_trace = true
device_trace = fine
continuous_trace = true
host_trace = true

# PL Trace buffer
trace_buffer_size = 32M
trace_buffer_offload_interval_ms = 5

# Subsection for AIE profile settings only if aie_profile is set to true
[AIE_profile_settings]
# Interval in between reading counters (in us)
interval_us = 1000

tile_based_aie_metrics = all:heat_map
tile_based_aie_memory_metrics = all:conflicts
tile_based_interface_tile_metrics = all:output_throughputs


# Subsection for AIE Trace only if aie_trace is set to true
[AIE_trace_settings]
# PLIO
reuse_buffer = true
periodic_offload = true
buffer_offload_interval_us = 50
buffer_size = 100M

tile_based_aie_tile_metrics = all:functions
enable_system_timeline = true

[Runtime]
verbosity = 10

The enable_system_timeline option is true by default. For more information, see the xrt.ini file section in UG1702.
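Before booting the board, it can help to confirm which options a given xrt.ini actually sets. The following is a minimal sketch, not part of the tutorial: ini_get is a hypothetical helper name, and it assumes a simple key = value layout under [Section] headers, as in the listing above.

```shell
#!/bin/sh
# ini_get FILE SECTION KEY (hypothetical helper, not an XRT tool):
# print the value of KEY inside [SECTION], or nothing if it is absent.
ini_get() {
    awk -v section="$2" -v key="$3" '
        /^[ \t]*#/ { next }                    # skip comment lines
        /^[ \t]*\[/ {                          # section header such as [Debug]
            current = $0
            gsub(/^[ \t]*\[|\][ \t]*$/, "", current)
            next
        }
        current == section {
            eq = index($0, "=")
            if (eq == 0) next                  # not a key = value line
            name = substr($0, 1, eq - 1)
            value = substr($0, eq + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", name)  # trim surrounding blanks
            gsub(/^[ \t]+|[ \t]+$/, "", value)
            if (name == key) { print value; exit }
        }
    ' "$1"
}
```

For example, ini_get xrt.ini Debug aie_trace prints the value assigned to aie_trace in the [Debug] group, and ini_get xrt.ini AIE_trace_settings buffer_offload_interval_us prints the AI Engine trace offload interval.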

Follow these steps to boot the board, configure the environment, and run the application to capture trace data:

  1. Plug the SD card into your board.

  2. Connect to the correct COM port.

  3. Boot the board.

  4. Log in with the username petalinux and set a password of your choice, for example p.

  5. The next steps require superuser privileges: run sudo su and enter your password.

  6. Change the root password: run passwd root and set the new password, for example r.

  7. Because you must copy the trace files back later, allow root login over Ethernet: open the SSH daemon configuration with vi /etc/ssh/sshd_config.

  8. Set the PermitRootLogin option to yes.

  9. Now, go to the application directory: cd /run/media/mmcblk0p1.

  10. To run the application multiple times with different options, use the newdir script, which copies the necessary files into a new directory (ptest1, ptest2, and so on).

  11. In ptest1, check the content of xrt.ini and embedded_exec.sh.

  12. Run the application: ./embedded_exec.sh

    All trace files are generated in 2 s. To run another test with different parameters, run ./newdir from /run/media/mmcblk0p1 and change the parameters in ptest2/xrt.ini. Type reboot to restart the board, then re-run the application.
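The sshd_config change in steps 7 and 8 can also be scripted instead of edited in vi. The following is a sketch under assumptions: enable_root_login is a hypothetical helper name, and the sed on the board is assumed to accept -i and -E (GNU sed and BusyBox sed both do).

```shell
#!/bin/sh
# enable_root_login FILE (hypothetical helper):
# set PermitRootLogin to yes in an sshd_config-style file.
enable_root_login() {
    cfg="$1"
    if grep -Eq '^[[:space:]]*#?[[:space:]]*PermitRootLogin' "$cfg"; then
        # Rewrite the existing directive, whether active or commented out.
        sed -i -E 's/^[[:space:]]*#?[[:space:]]*PermitRootLogin.*/PermitRootLogin yes/' "$cfg"
    else
        # No directive present: append one.
        echo 'PermitRootLogin yes' >> "$cfg"
    fi
}

# On the board, as root (steps 5 to 9), the sequence would then be:
#   passwd root
#   enable_root_login /etc/ssh/sshd_config
#   cd /run/media/mmcblk0p1
```

The SSH daemon has to re-read its configuration before the new setting applies. How to restart it depends on the image, so that command is left out of the sketch.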

After running the application with multiple sets of parameters, copy the various ptest directories back to your development machine using scp. Use ifconfig to get the board IP address.

You can copy the entire ptest* directories to ProfileData on your development machine. At a minimum, copy the following files: *.csv, *.txt, *.bin, *summary.
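As an illustration of the copy step, the sketch below shows one possible scp invocation in comments and a small helper that lists the minimum file set in a copied directory. The IP address 192.168.1.10 and the helper name list_trace_files are assumptions made for this example; replace the address with the one reported by ifconfig on the board.

```shell
#!/bin/sh
# On the development machine (hypothetical board address 192.168.1.10):
#   mkdir -p ProfileData
#   scp -r root@192.168.1.10:/run/media/mmcblk0p1/ptest1 ProfileData/

# list_trace_files DIR (hypothetical helper):
# print the files in DIR that belong to the minimum set named above.
list_trace_files() {
    find "$1" -maxdepth 1 -type f \
        \( -name '*.csv' -o -name '*.txt' -o -name '*.bin' -o -name '*summary' \) \
        | sort
}
```

Running list_trace_files ProfileData/ptest1 after the copy gives a quick check that the summary file and the trace data arrived before you open them in the analyzer.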

Missing image

Now you can run vitis_analyzer on your development machine with the summary file: vitis_analyzer xrt.run_summary. To view the System Timeline, enable it in the tool by clicking Vitis -> New Feature Preview in the top menu bar and checking System Timeline.

Click the analysis tab in Vitis Analyzer, and then click Timeline Trace.

The overall view covers the complete simulation time:

Missing image

In the beginning, you can see when the processing system opens the device and starts the PL kernels:

Missing images

Zooming in on the point where the AI Engine graph starts, you can see the gen2s PL kernels generating the data and the AI Engine kernels consuming it. The traces align well enough to understand the overall behavior of the system.

Missing image

The polling interval is crucial for event alignment. Reducing it improves event alignment in the timeline at the expense of a larger timestamp file. To see the effect of different polling intervals, modify the buffer_offload_interval_us parameter in the xrt.ini file. The default value is 50 µs. The following example shows 100 µs:

Missing image

As you can see, the PL kernels are out of sync with the AI Engine array iterations.
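To reproduce this comparison, the interval can be changed in the xrt.ini of a fresh test directory before re-running the application. The following is a minimal sketch: set_ini_value is a hypothetical helper name, it assumes the key appears once in the file, and it assumes neither the key nor the value contains characters that are special to sed.

```shell
#!/bin/sh
# set_ini_value FILE KEY VALUE (hypothetical helper):
# replace the value of a "KEY = ..." line in place.
set_ini_value() {
    file="$1"; key="$2"; value="$3"
    # Anchoring at line start keeps similarly named keys, such as
    # trace_buffer_offload_interval_ms, untouched.
    sed -i -E "s|^([[:space:]]*${key}[[:space:]]*=).*|\1 ${value}|" "$file"
}

# Example on the board, from /run/media/mmcblk0p1:
#   set_ini_value ptest2/xrt.ini buffer_offload_interval_us 100
```

Changing one parameter per ptest directory makes it easier to attribute differences between the resulting timelines to that parameter.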