DSP Library - 2024.1 English

Vitis Tutorials: AI Engine

Document ID
XD100
Release Date
2024-06-19
Version
2024.1 English

Version: Vitis 2024.1

Introduction

Versal™ adaptive SoCs combine programmable logic (PL), processing system (PS), and AI Engines with leading-edge memory and interfacing technologies to deliver powerful heterogeneous acceleration for any application. The hardware and software are targeted for programming and optimization by data scientists and software and hardware developers. A host of tools, software, libraries, IP, middleware, and frameworks enable Versal adaptive SoCs to support all industry-standard design flows.

This tutorial demonstrates how to use kernels provided by the DSP Library for a filtering application, how to analyze the design results, and how to use filter parameters to optimize the design’s performance using simulation. It does not take the design to a hardware implementation, however.

IMPORTANT: Before beginning the tutorial, make sure that you have read and followed the Vitis Software Platform Release Notes (v2024.1) for setting up the software and installing the VCK190 base platform.

Before starting this tutorial, run the following steps.

  1. Set up your platform by running the xilinx-versal-common-v2024.1/environment-setup-cortexa72-cortexa53-xilinx-linux script as provided in the platform download. This script sets up the SYSROOT and CXX variables. If the script is not present, you must run xilinx-versal-common-v2024.1/sdk.sh.

  2. Set up your ROOTFS to point to the xilinx-versal-common-v2024.1/rootfs.ext4.

  3. Set up your IMAGE to point to xilinx-versal-common-v2024.1/Image.

  4. Set up your PLATFORM_REPO_PATHS environment variable based upon where you downloaded the platform.

Table of Contents

Objectives

After completing the tutorial, you should be able to:

  • Build signal processing datapath using the AMD Vitis™ application acceleration development flow

  • Evaluate the performance and resource utilization metrics of a design

  • Adjust filter parameters to meet system performance requirements

Tutorial Overview

This tutorial shows how to construct a simple two-stage decimation filter. This filter is not targeted at a specific real-life application, but is used to show how to use the DSP Library to construct filter chains.

  • Part 1 shows how to use create an AI Engine project and instantiate a parameterized FIR filter from DSPLib

  • Part 2 shows how to cascade filters together into a chain

  • Part 3 shows how to optimize performance of the filter chain by tuning individual filters

Part 1: Creating a Single Kernel Graph

Part 1 of this tutorial will:

  • Shows a basic Makefile for AI Engine and X86 compilation, simulation, and visualization.

  • Link in the DSPLib functions.

  • Create a simple graph containing a parameterized DSPLib FIR filter.

  • Compile and simulate the design.

  • Evaluate the results.

Understanding the Source Files

To begin this tutorial, go to the part 1 directory:

cd part_1

List the files available in aie/src:

ls aie/src
fir_graph.h  system_settings.h  test.cpp

The system_settings.h files is a standard header file that defines the constants used in this project. It includes the header file “<adf.h>”. This is the Adaptive Data Flow (ADF) header file, which provides the classes used for specifying graphs. It also includes the FIR Filter kernel’s header file, fir_sr_sym_graph.hpp.

The design itself is implemented in fir_graph.h. A graph is used to define elements and the connections between them that make up the design. Some of the key aspects of this file are mentioned below.

using namespace adf
namespace dsplib = xf::dsp::aie;

This simplifies accessing the ADF and DSPLib classes.

The FIR filter taps are declared as a vector, and initialized:

	std::vector<int16> chan_taps = std::vector<int16>{
		 -17,      -65,      -35,       34,      -13,       -6,       18,      -22,
         .... };

The following line instantiates the DSPLib FIR filter kernel, named chan_FIR (channel filter):

	dsplib::fir::sr_sym::fir_sr_sym_graph<DATA_TYPE, COEFF_TYPE, FIR_LEN_CHAN, SHIFT_CHAN, ROUND_MODE_CHAN, WINDOW_SIZE, AIES_CHAN> chan_FIR;

The filter’s template parameters and their meanings can be found in UG1295.

	port<input> in;
	port<output> out;

Specifies the input and output ports for this graph.

	connect<>(in, chan_FIR.in[0]);
	connect<>(chan_FIR.out[0], out);

These statements connect our graph’s input and outputs to the FIR filter’s input and outputs, respectively.

	location<kernel>(chan_FIR.m_firKernels[0]) = tile(18,0);

This statement specifies a location attribute for the filter kernel. It specifies the X/Y location of the AI Engine tile within the AI Engine array in which to place the kernel. Location placements for kernels are optional, but shown here to illustrate how physical constraints can be incorporated into the source code. The results of this statement are seen later when viewing the compilation results.

There is a second graph that instantiates this FIRGraph and connects it to the PLIO elements, which are points at which data can be moved onto and off of the AI Engine array:

class TopGraph : public graph
	{
	public:
		input_plio in;
		output_plio out;

		FirGraph F;

		TopGraph()
		{
			in = input_plio::create("128 bits read in",  adf::plio_128_bits,"data/input_128b.txt",  250);
			out = output_plio::create("128 bits read out", adf::plio_128_bits,"data/output_128b.txt", 250);

			connect<> (in.out[0],F.in);
			connect<> (F.out, out.in[0]);
		};
	};

The third file, test.cpp, can be considered the testbench component. It is not intended for hardware implementation, but rather to drive the simulation. The main function is specified, which runs the simulation.

int main(void) {
    filter.init() ;
    filter.run(NUM_ITER) ;
    filter.end() ;
    return 0 ;
}

Compile the application

The Makefile is simple. It allows the user to compile for two different targets x86 and hw, visualizes the compiler output in vitis_analyzer, runs an AI Engine or X86 simulation, and visualizes also the output in vitis_analyzer.

Currently, the Makefile does not specify where the DSP Lib includes are located. The first step of this tutorial consists of adding the following lines, on line 27:

#########  Add DSP include files location   #########
DSP_FLAGS := --include=$(DSPLIB_ROOT)/L1/src/aie
DSP_FLAGS += --include=$(DSPLIB_ROOT)/L1/include/aie
DSP_FLAGS += --include=$(DSPLIB_ROOT)/L2/include/aie

Type make aie to run the following command:

aiecompiler -target=hw $(AIE_FLAGS) $(DSP_FLAGS) $(AIE_GRAPH_FILES)

This compiles the design and maps it to the AI Engine Tiles.

Visualizing the compilation results is performed by typing make compviz, which runs the following command:

	vitis_analyzer $(AIE_OUT_DIR)/test.aiecompile_summary

After vitis_analyzer opens, it displays the Summary page, which provides a brief summary of the project.

Vitis Analyzer Summary

Selecting Graph on the navigation bar shows a diagram of the filter implementation. It illustrates the data connectivity points into and out of the graph (128-bit interfaces), and the symmetrical FIR filter kernel being implemented on a single tile with ping-pong buffers on either side of it.

Vitis Analyzer Graph

Selecting Array on the navigation bar shows the physical implementation of the design on the AI Engine array. Here you can see the PLIO interfaces in pink, the AI Engine tile that implements the kernel in blue, and the ping-pong buffers in purple. Note the kernel is located in tile (18,0), which was specified in fir_graph.h. Clicking on the components on the diagram takes you to the appropriate tab below, which provides a description of the element. Conversely, you can select the various element tabs (Kernels / I/O / Buffers / Ports / Nets / Tiles / Interface Channels) and click on a component to see where it is located on the array.