Version: Vitis 2024.1
Introduction
Versal™ adaptive SoCs combine programmable logic (PL), processing system (PS), and AI Engines with leading-edge memory and interfacing technologies to deliver powerful heterogeneous acceleration for any application. The hardware and software are targeted for programming and optimization by data scientists and software and hardware developers. A host of tools, software, libraries, IP, middleware, and frameworks enable Versal adaptive SoCs to support all industry-standard design flows.
This tutorial demonstrates how to upgrade a 32-branch digital down-conversion chain so that it complies with the latest tools and coding practices. It includes examples of the following changes, each with a side-by-side view of the original and upgraded code:
Converting coding style from kernel functions to kernel C++ classes
Relocating global variables to kernel class data members
Handling state variables to enable x86sim
Migrating Windows (deprecated) to buffers for non-stream based kernel I/O
Replacing kernel intrinsics with equivalent AI Engine APIs
Updating older pragmas
Supporting x86 compilation and simulation
You can find the design description in the Digital Down-conversion Chain Implementation on AI Engine (XAPP1351). The codebase associated with the original design can be found in the Reference Design Files.
Upgrading Tools, Device Speed Grade, and Makefile
Note: Simply loading the latest version of the tools and compiling the design is not possible because the baseline Makefile has deprecated compiler options.
Important changes to the Makefile are listed below:
Upgrade part speed grade xcvc1902-vsva2197-1LP-e-S-es1 (previously specified by --device) to xcvc1902-vsva2197-2MP-e-S (specified by --platform). As can be seen in the following table (referenced from Versal AI Core Series Data Sheet: DC and AC Switching Characteristics (DS957)), this increases the AI Engine clock frequency from 1 GHz to 1.25 GHz. Recompiling and simulating the design with this change causes the throughput to increase by around 17-25%.
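The reported 17-25% range can be sanity-checked against the clock change itself. If a kernel were limited only by the AI Engine clock, its throughput would be expected to scale with that clock, so the 1 GHz to 1.25 GHz step gives a rough reference point for the gain. The following is a minimal sketch of that arithmetic; the helper name `clock_gain_percent` is hypothetical and not part of any tool or library.

```cpp
// Relative gain, in percent, when a clock moves from old_mhz to new_mhz.
// Hypothetical helper used only to illustrate the arithmetic.
constexpr double clock_gain_percent(double old_mhz, double new_mhz) {
    return (new_mhz - old_mhz) / old_mhz * 100.0;
}

// Speed-grade change described above: 1 GHz -> 1.25 GHz.
constexpr double kClockGain = clock_gain_percent(1000.0, 1250.0);
static_assert(kClockGain == 25.0, "the AI Engine clock rises by 25%");
```

A measured gain below this figure, such as the lower end of the reported range, would be consistent with parts of the design not scaling with the AI Engine clock, although the source does not break the numbers down further.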
Upgrade to the unified v++ compiler command.
Add support for x86 compilation and simulation.
Upgrading the Code
Converting Kernel Functions to Kernel Classes
Functionality included in the init() function is migrated to the new kernel C++ class constructor. The main kernel function wrapper is migrated to a new class run() member function.
Create a header file for the class. You are required to write the static void registerKernelClass() method in the header file. Inside the registerKernelClass() method, call the REGISTER_FUNCTION macro. This macro is used to register the class run method to be executed on the AI Engine core to perform the kernel functionality.
When creating the kernel in the upper graph or subgraph, use kernel::create_object instead of kernel::create. Remove initialization_function as it is now part of the class constructor.
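The steps above can be sketched as a host-compilable mock-up. This is not the design's actual code: the class name `AccumulatorKernel`, its state, and the block size are hypothetical, plain pointers stand in for AI Engine I/O ports, and `REGISTER_FUNCTION` is given a no-op stand-in so the sketch builds without the AI Engine headers (in a real design the macro comes from those headers).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-in so the sketch builds on a host without the AI Engine headers.
// In a real design REGISTER_FUNCTION is supplied by the tool headers.
#ifndef REGISTER_FUNCTION
#define REGISTER_FUNCTION(fn) ((void)0)
#endif

// Hypothetical kernel: a running sum whose state previously lived in a
// global variable that init() had to set up.
class AccumulatorKernel {
public:
    static constexpr std::size_t kBlock = 8;  // samples per run() call (assumed)

    // Work formerly done in init() moves into the constructor.
    explicit AccumulatorKernel(std::int32_t initial) : state_(initial) {}

    // Formerly the free kernel function; now a member with access to state_.
    // Raw pointers stand in for the kernel's I/O ports.
    void run(const std::int32_t* in, std::int32_t* out) {
        assert(in != nullptr && out != nullptr);
        for (std::size_t i = 0; i < kBlock; ++i) {
            state_ += in[i];
            out[i] = state_;
        }
    }

    // Registration hook described in the text above.
    static void registerKernelClass() {
        REGISTER_FUNCTION(AccumulatorKernel::run);
    }

    std::int32_t state() const { return state_; }

private:
    std::int32_t state_;  // replaces the former global state variable
};
```

Because the state is a data member rather than a global, each kernel instance carries its own copy, and the value persists between successive calls to `run()`. In the graph, an object of such a class would be instantiated through kernel::create_object, with the constructor arguments taking over the role of the removed initialization function.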
Migrating from Windows to Buffers
Windows I/O connections between kernels were deprecated in the 2023.2 release of the AMD Vitis™ software platform. The AI Engine Kernel and Graph Programming Guide (UG1079) describes how the source code of a design should change to upgrade it to buffer I/Os. The following figures show the steps required (repeated for every kernel) to upgrade I/O connections from Windows to buffers.
Make the changes shown in the following figure in the kernel.cc file:
If the design uses classes, upgrade the associated header file.
In the graph file, modify the connection type and specify the buffer dimension. Window sizes were given in bytes, whereas buffer dimensions are given in samples, so the original size is divided by 4 (each sample in this design occupies 4 bytes).
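The byte-to-sample conversion can be made explicit with a small compile-time helper. The sketch below assumes the samples are 16-bit complex values (two 16-bit fields, 4 bytes in total), which matches the division by 4 but is not stated in the text. `Cint16` is a local stand-in type, and `window_bytes_to_samples` and the 1024-byte window size are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>

// Local stand-in for a 16-bit complex sample type (assumption: the design's
// samples are 4 bytes wide, as implied by the division by 4).
struct Cint16 {
    std::int16_t real;
    std::int16_t imag;
};
static_assert(sizeof(Cint16) == 4, "one complex sample occupies 4 bytes");

// Converts a legacy window size in bytes to a buffer dimension in samples.
template <typename Sample>
constexpr std::size_t window_bytes_to_samples(std::size_t window_bytes) {
    return window_bytes / sizeof(Sample);
}

// Hypothetical example: a 1024-byte window becomes a 256-sample buffer.
constexpr std::size_t kWindowBytes = 1024;
constexpr std::size_t kBufferSamples =
    window_bytes_to_samples<Cint16>(kWindowBytes);
static_assert(kBufferSamples == 256, "1024 bytes / 4 bytes per sample");
```

Deriving the dimension from the sample type rather than hard-coding the divisor keeps the graph correct if a kernel's data type later changes.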
Replacing Intrinsics with APIs
The following example shows a side-by-side comparison of intrinsic-based code and API-based code. The two versions are functionally equivalent and result in the same hardware usage and throughput.