Introduction to Kernels in Vitis - 2025.2 English - UG1701

Embedded Design Development Using Vitis User Guide (UG1701)

Document ID: UG1701
Release Date: 2025-11-20
Version: 2025.2 English

This chapter covers the different Vitis kernels, including their design flow and methods to control these kernels in the host application. The Vitis kernels can be classified into two categories:

  • AI Engine kernels: Kernel instances in a graph are compiled into .a files, such as libadf.a.
  • PL kernels: These kernels are compiled as .xo files.

Vitis PL kernels can further be classified into two categories based on the programming languages used to develop them:

  • HLS PL kernels: These are kernels written in C/C++ and compiled to hardware using High-Level Synthesis (HLS).
  • RTL kernels: These are kernels described at the register-transfer level (RTL) in a hardware description language such as Verilog or VHDL.
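
As an illustration of the HLS category, the following is a minimal sketch of a C++ kernel function, not a definitive implementation. The kernel name `vadd`, its arguments, and the bundle names `gmem0` and `gmem1` are hypothetical. The `#pragma HLS INTERFACE` lines request memory-mapped (m_axi) interfaces for the pointer arguments; a standard C++ compiler ignores them, so the same source can be checked in software with a small test bench.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical kernel: element-wise addition of two buffers.
// extern "C" keeps the top-level function name unmangled.
extern "C" void vadd(const uint32_t* in1, const uint32_t* in2,
                     uint32_t* out, int size) {
// Request memory-mapped (m_axi) interfaces for the pointer arguments.
// The bundle names are assumptions made for this sketch.
#pragma HLS INTERFACE m_axi port=in1 bundle=gmem0
#pragma HLS INTERFACE m_axi port=in2 bundle=gmem1
#pragma HLS INTERFACE m_axi port=out bundle=gmem0
    for (int i = 0; i < size; ++i) {
        out[i] = in1[i] + in2[i];
    }
}

// Minimal software test bench: runs the kernel function on the host CPU
// and checks every output element.
void vadd_testbench() {
    const int size = 4;
    uint32_t a[size] = {1, 2, 3, 4};
    uint32_t b[size] = {10, 20, 30, 40};
    uint32_t result[size] = {0, 0, 0, 0};
    vadd(a, b, result, size);
    for (int i = 0; i < size; ++i) {
        assert(result[i] == a[i] + b[i]);
    }
}
```

Calling `vadd_testbench()` from a `main` function exercises the kernel logic in software before any hardware build. An RTL kernel would instead be written directly in a hardware description language and packaged as an .xo file.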

Vitis PL kernels can also be categorized based on their host control mechanism:

  • Software-Controllable Kernels: These kernels provide a programmable register interface that allows host software to interact with the kernel through APIs or register reads and writes.
    • XRT-managed kernels: Kernels that are controlled by XRT APIs.
    • User-managed kernels: Kernels that are free-running or state machine-controlled via register writes and reads.
  • Data-Driven Kernels: These kernels do not require host control. They are present in the device but are not managed by the software application. Instead, they are triggered by the arrival of data at the interface.
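
The register-based control of a user-managed kernel can be sketched with a small software model. Everything in this example is an assumption made for illustration: the class `MockKernelRegs`, the register offsets, and the control bits are hypothetical and only loosely follow a typical AXI4-Lite control layout. On real hardware, the host would issue the same sequence of writes and reads through the kernel's actual register map (for example, with the register read and write methods of the XRT `xrt::ip` class).

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical register offsets and control bits for this sketch only.
// Real values come from the kernel's own register map.
constexpr uint32_t REG_CTRL   = 0x00;
constexpr uint32_t REG_ARG_A  = 0x10;
constexpr uint32_t REG_ARG_B  = 0x18;
constexpr uint32_t REG_RESULT = 0x20;
constexpr uint32_t CTRL_START = 1u << 0;
constexpr uint32_t CTRL_DONE  = 1u << 1;
constexpr uint32_t CTRL_IDLE  = 1u << 2;

// Software stand-in for a kernel's programmable register interface.
class MockKernelRegs {
public:
    void write_register(uint32_t offset, uint32_t value) {
        if (offset == REG_CTRL) {
            if (value & CTRL_START) {
                // The model completes instantly; hardware would take cycles.
                regs_[REG_RESULT] = regs_[REG_ARG_A] + regs_[REG_ARG_B];
                regs_[REG_CTRL] = CTRL_DONE | CTRL_IDLE;
            }
            return;
        }
        regs_[offset] = value;
    }

    uint32_t read_register(uint32_t offset) const {
        auto it = regs_.find(offset);
        return it == regs_.end() ? 0u : it->second;
    }

private:
    // The kernel starts out idle.
    std::map<uint32_t, uint32_t> regs_{{REG_CTRL, CTRL_IDLE}};
};

// Host-side sequence: write arguments, set start, poll done, read result.
uint32_t run_add(MockKernelRegs& kernel, uint32_t a, uint32_t b) {
    assert(kernel.read_register(REG_CTRL) & CTRL_IDLE);  // must be idle
    kernel.write_register(REG_ARG_A, a);
    kernel.write_register(REG_ARG_B, b);
    kernel.write_register(REG_CTRL, CTRL_START);
    while ((kernel.read_register(REG_CTRL) & CTRL_DONE) == 0) {
        // Poll until the kernel reports completion.
    }
    return kernel.read_register(REG_RESULT);
}
```

With an XRT-managed kernel, this start, poll, and read-back sequence is handled by the XRT APIs rather than by the application. A data-driven kernel needs none of it, because it has no host-controlled start.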

In the Vitis core development kit, embedded platforms provide a foundation for designs. Targeted devices can include Versal adaptive SoCs, Zynq UltraScale+ MPSoCs, or AMD UltraScale+™ FPGAs. These devices contain a programmable logic (PL) region. A device binary (.xclbin) file can be loaded and executed on these devices. The device binary contains and connects PL kernels compiled as object (.xo) files, and can also contain AI Engine graphs.

These platforms can contain one or more interfaces to global memory (DDR or HBM), and optional streaming interfaces connected to other resources such as AI Engines and external I/O. PL kernels can access data through global memory interfaces (m_axi) or streaming interfaces (axis). The memory interfaces of PL kernels must be connected to memory interfaces of the platform. The streaming interfaces of PL kernels can be connected to any streaming interfaces of the platform, of other PL kernels, or of the AI Engine array. Both memory-based and streaming connections are defined through Vitis linking options, as described in Linking the System.

Multiple kernels (.xo) can be linked into the device binary (.xclbin) and implemented in the PL region of the AMD device, allowing for modular and portable building blocks for a signal processing subsystem. A single kernel can also be instantiated multiple times. The number of kernel instances and how they are connected to the subsystem are specified by a linking configuration used when building the device binary.
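
A linking configuration of this kind might look like the following fragment, which is a sketch under stated assumptions rather than a working configuration. The kernel names (`vadd`, `producer`, `consumer`), the argument and port names, and the memory name `DDR` are all hypothetical; the valid memory names depend on the target platform.

```ini
[connectivity]
# Instantiate a hypothetical kernel "vadd" twice, as vadd_1 and vadd_2.
nk=vadd:2:vadd_1.vadd_2

# Map a memory-mapped argument of each instance to a platform memory.
# "DDR" is an assumed memory name; check the platform for the real ones.
sp=vadd_1.in1:DDR
sp=vadd_2.in1:DDR

# Connect a streaming output of one kernel to a streaming input of another.
sc=producer_1.out_stream:consumer_1.in_stream
```

Here `nk` sets the number and names of kernel instances, `sp` assigns memory-mapped arguments to platform memory, and `sc` defines stream connections between ports.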

For Versal devices, the .xclbin file can also contain compiled AI Engine graphs. The compiled AI Engine graph (libadf.a) and PL kernels (.xo) are linked with the target platform (.xpfm) to define the hardware design. The AI Engine can be accessed by PL kernels using axis interfaces. On Versal adaptive SoC devices, the Arm processor in the processing system (PS) can also control the AI Engine through runtime parameters (RTP) in the graph and through GMIO. Refer to the AI Engine Tools and Flows User Guide (UG1076) for more information.