Vitis Tutorials: AI Engine (XD100) - 2023.2 English - Learn how to target, develop, and deploy advanced algorithms using a Versal AI Engine array in conjunction with PL IP/kernels and software applications running on the embedded processors.
Document ID: XD100
Release Date: 2024-03-05
Version: 2023.2 English
Vitis Tutorials: AI Engine
AI Engine Development on AIE-ML
AI Engine for Machine Learning Development
Feature Tutorials
A to Z Bare-metal Flow
Introduction
Support
Using GMIO with AIE
Introduction
Objectives
Steps
Runtime Parameter Reconfiguration
Introduction
Objectives
Steps
Support
Packet Switching
Objectives
Steps
Support
Versal Integration for Hardware Emulation and Hardware
Introduction
Objectives
Tutorial Overview
Section 1: Compile AI Engine Code using the AI Engine Compiler for x86simulator, Viewing Compilation Results in Vitis Analyzer
Compiling an AI Engine ADF Graph for V++ Flow
Vitis Analyzer Compile Summary
Section 2: Simulate the AI Engine Graph using the x86simulator
Section 3: Compile and Run Software Emulation
1. Compiling HLS Kernels using v++
2. Use V++ to Link AI Engine and HLS Kernels with the Platform
3. Compile the A72 Host Application
4. Package the Design
5. Run Software Emulation
Section 4: Compile AI Engine Code for AIE Simulator: Viewing Compilation Results in Vitis Analyzer
Important
Compiling an AI Engine ADF Graph for V++ Flow
Vitis Analyzer Compile Summary
Section 5: Simulate the AI Engine Graph using the aiesimulator and Viewing Trace and Profile Results in Vitis Analyzer
Section 6: Run the Hardware Emulation, and View Run Summary in Vitis Analyzer
1. Compiling HLS Kernels Using v++
2. Use V++ to Link AI Engine, HLS Kernels with the Platform
3. Compile the A72 Host Application
4. Package the Design
5. Run Hardware Emulation
Section 7: Build and Run on Hardware
Summary
Support
AIE Compiler Features
Introduction
Objectives
Tutorial Sections
Conditional Objects
Case 1
Case 2
Case 3
Case 4
Multirate
UpConv then DownConv (Buffer)
DownConv then UpConv (Buffer)
Split and Merge (Buffer)
UpConv then DownConv (Stream)
DownConv then UpConv (Stream)
Split and Merge (Stream)
Multicast
Case 1: Stream and Buffer Multicasting
Case 2: Multirate Buffer Multicasting
Design Tutorials
Versal Custom Thin Platform Extensible System
AI Engine Datamover Examples
Object oriented C++ template class kernels with AI Engine API style
Description of the AIE kernels - C function style kernels (unused in graph)
Short description of vector datamover optimizations
Special note on preparing the loop
Compile and build AIE design
Simulating and checking the AIE kernels
References
AI Engine Documentation
Vitis Unified Software Development Platform 2021.2 Documentation
Revision History
AI Engine-ML Programming
AI Engine-ML Architecture
Introduction
AI Engine-ML processor array
Support
Tiling Parameters Programming
Introduction
Tiling parameter structure
A Graphical Example
Some other examples
1D Linear with Zero-Padding Before
1D Linear with Zero-Padding and Truncation
3D Linear with Zero-Padding Around
Support
Compute Optimization
Performance Table
Support
AI Engine Development
Feature Tutorials
AI Engine A-to-Z Flow for Linux
AI Engine GMIO Performance Profile
Prerequisites
Setting up the environment
Re-compiling ADF graph
Re-compiling Programmable Logic (PL) kernels targeting the custom platform
Hardware Emulation
Targeting Hardware
Support
A to Z Bare-metal Flow
Custom Base Platform Creation
Platforms
Step 1: Build the AMD Versal™ Extensible Embedded Platform Example Design in Vivado
Step 2: Build the Platform in the Vitis Software Platform
AIE Application Creation
Step 1: Create a new AI Engine Application Project
Step 2: Build the Project and Run Through Emulation-AIE
PL Application Creation
Step 1: Modify the Graph for Use in Hardware Build
Step 2: Add PL Kernels
Step 3: Configure the Hardware Linking Project
Step 4: Build the System
PS Application Creation
Step 1: Create a New Platform in the Bare-metal Domain
Step 2: Build the Bare-metal AI Engine Control Application
Step 3: Package the Full System
Step 4: Run the System in Hardware Emulation
Step 5: Build the System targeting the Hardware
Step 6: Run the System in Hardware
Summary
Using GMIO with AIE
AI Engine GMIO Performance Profile
Design Introduction
Performance Profiling Methods
Profiling using C++ Class API
Profiling using AI Engine Cycles Received from AI Engine Kernels
Profiling using the Event API
Conclusion
AI Engine GMIO Programming Model
Step 1 - Synchronous GMIO Transfer
Run AI Engine Compiler and AI Engine Simulator
Step 2 - Asynchronous GMIO Transfer for Input and Synchronous GMIO Transfer for Output
Run AI Engine Compiler and AI Engine Simulator
Step 3 - Asynchronous GMIO Transfer and Hardware Flow
Run AI Engine Simulator and Hardware Flow
Conclusion
Runtime Parameter Reconfiguration
Introduction
Overview
Steps
Asynchronous Scalar RTP
Asynchronous Array RTP
Asynchronous RTP Read
Synchronous RTP
Summary
Support
Packet Switching
Packet Stream Based AI Engine Kernels
Packet Stream Interfaces and Operations
Construct Graph for Packet Stream Kernels
Run the AI Engine Simulator, HW Emulation, and HW Flows
Conclusion
Support
Buffer-Based AI Engine Kernels
Construct Graph with Packet Switching Capability
Packet Format
Prepare Data and Run AI Engine Simulator
Example PL Kernels for Packet Switching
Example PS code for Packet Switching
Run Hardware Emulation and Hardware Flows
Conclusion
Buffer-Based AI Engine Kernels with Mixed Data Types
Prepare Data for AI Engine Simulator
PS Application and HW Emulation Flows
Conclusion
Support
Versal Integration for Hardware Emulation and Hardware
Introduction
Objectives
Tutorial Overview
Section 1: Compile AI Engine Code using the AI Engine Compiler for x86simulator, Viewing Compilation Results in Vitis Analyzer
Compiling an AI Engine ADF Graph for V++ Flow
Vitis Analyzer Compile Summary
Section 2: Simulate the AI Engine Graph using the x86simulator
Section 3: Compile and Run Software Emulation
1. Compiling HLS Kernels using v++
2. Use V++ to Link AI Engine and HLS Kernels with the Platform
3. Compile the A72 Host Application
4. Package the Design
5. Run Software Emulation
Section 4: Compile AI Engine Code for AIE Simulator: Viewing Compilation Results in Vitis Analyzer
Important
Compiling an AI Engine ADF Graph for V++ Flow
Vitis Analyzer Compile Summary
Section 5: Simulate the AI Engine Graph using the aiesimulator and Viewing Trace and Profile Results in Vitis Analyzer
Section 6: Run the Hardware Emulation, and View Run Summary in Vitis Analyzer
1. Compiling HLS Kernels Using v++
2. Use V++ to Link AI Engine, HLS Kernels with the Platform
3. Compile the A72 Host Application
4. Package the Design
5. Run Hardware Emulation
Section 7: Build and Run on Hardware
Summary
Support
Versal System Design Clocking
Introduction
Objectives
Step 1 - Building ADF Graph
Step 2 - Clocking the PL Kernels
Step 3 - v++ linker – Building the System
Step 4 - Compiling Host Code
Step 5 - Packaging Design and Running on Board
Challenge (Optional)
Build the design for Hardware Emulation
Summary
Using Floating-Point in the AI Engine
Introduction
AI Engine Architecture Details
Fixed-Point Pipeline
Floating-point Pipeline
Floating-point intrinsics
Start, offset
fpneg, fpabs, fpadd, fpsub
fpneg
fpabs
fpneg_abs
fpadd, fpsub
fpadd_abs, fpsub_abs
fpmul
fpabs_mul
fpneg_mul
fpneg_abs_mul
fpmac, fpmsc, fpmac_abs, fpmsc_abs
fpmul_conf, fpmac_conf
Floating-Point Examples
FIR Filter
Real Floating-Point Filter
Complex Floating-Point Filter
Matrix Multiply
Support
DSP Library
Introduction
Part 1: Creating a Single Kernel Graph
Understanding the Source Files
Compile the application
Running the Design through Simulation
Using Vitis Analyzer to look at the Simulation Results
Part 2: Creating a Multi Kernel Graph
Changes to the Filter Graph from Part 1
Build AI Engine Emulation
Running the Design through Simulation
Using Vitis Analyzer to look at the Compilation and Simulation Results
Part 3: Optimizing Filter Performance
Changes to the Filter Graph from Part 1
Build AI Engine Emulation
Running the Design through Simulation
Using Vitis Analyzer to look at the Compilation and Simulation Results
Conclusion
Debug Walkthrough
Porting a Command Line Project to the Vitis IDE Project
Step 1: Launch Vitis Unified IDE
Step 2: Create an AI Engine Component
Step 3: Create HLS Components
Step 4: Create the Application Component
Step 5: Create the System Project
Support
X86 Simulation Debug Walkthrough
Introduction
Features
Section 1
Build and Simulate in the Vitis IDE
Section 2
Debug Using printf()
Section 3
Debug Using printf with Vector Datatypes
Section 4
Debug Using the Vitis IDE Debugger
Section 5
x86simulator Options for Debugging
Data Dump
Deadlock Detection
Scenario 1
Scenario 2
Trace Report in the File
Trace Report in the Output Console
Section 6
Memory Access Violation and Valgrind Support
Set Up the Environment Variables
Section 6 Exercise Step
Section 7
Using the GDB Debugger in the Command Line
x86simulation on the Command Line
x86simulation with the GDB
x86simulator Using the GDB Server
Section 7 Exercise Step
Support
AI Engine Simulation Debug Walkthrough
Introduction
Features
Section 1
Build and Simulate in the Vitis IDE
Section 2
Debug Using printf
Section 3
Debug Using the Vitis IDE Debugger
Limitations
Section 4
Enabling Profile and Trace Options
Exercise Step
Section 5
Deadlock Detection
Section 6
Visualizing Deadlock in the Vitis Analyzer
Section 7
Debugging Memory Access Violations
Section 8
Kernel Debug
Section 9
Design Performance Debug
Calculating the Graph Throughput Using Graph Output
Support
Software-Emulation Debug Walkthrough
Introduction
Features
Section 1
Build for Software Emulation Using the Vitis IDE
Section 2
Debug using the Vitis IDE Debugger for Software Emulation
Support
Hardware-Emulation Debug Walkthrough
Introduction
Features
Section 1
Build for Hardware Emulation Using the Vitis IDE
Section 2
Debug PL Kernels Using the Vivado Logic Simulator
Section 3
Performance of the AI Engine Using the Hardware Emulation Results
Calculating the Kernel Latency
Calculating the Graph Throughput Using the Graph Output
Section 4
Command Line Project Source Code Debug with the Vitis IDE (classic)
Support
Hardware Debug Walkthrough
Design Execution and System Metrics
Features
Running the Design on Hardware
Analyzing Run Results
AI Engine Status Using XRT
Manual AI Engine Status Using the XBUtil Utility
Deadlock Detection Using XSDB
Error Handling and Reporting in the Host Application
XRT Error Handling APIs
Using XBUtil
Using APIs in the Host Application
Profiling Graph Throughput
Exercise Step
Profiling to Count Samples Sent and Received
Support
System Profiling
Features
Generating the Hardware Image
Hardware Profiling Features
XRT Flow
Open Multiple Profile Runs in the Vitis Analyzer
Profiling Data Explanation
AI Engine Core Profiling Data
AI Engine Memory Profiling Data
Interface Profiling Data
Profiling Data Analysis
XSDB Flow
Support
PL Kernel Analysis
Features
Getting the Design Files Ready
Profiling Using PL Profile Monitors
Inserting ILAs to Monitor Specific AXI Interfaces
Enable ILA in the Design
Set Up the Connection in Vivado
Examine the Captured Results
Support
AI Engine Event Trace and Analysis
Event Trace Analysis Features
Build the Design
Prepare for the Hardware Run
XRT Flow
Launch the Vitis Analyzer to Examine the Event Trace Files
Details of the Event Trace Data
XSDB Flow
Event Trace Considerations
Event Trace Choice Considerations
Number of Event Trace Streams Methodology
Event Trace Limitations
Debug the Host Code and Kernel Source Code using the Vitis IDE
Limitations of the Source Code Debug on Hardware
Support
AI Engine DSP Library and Model Composer
Introduction
Before You Begin
Overview
Stage 1: Create and Simulate the Design
Stage 2: Further Analysis of the Design
Stage 3: Generate the Code and Perform Emulation-AI Engine
Stage 4: Increasing the PLIO Bitwidth and Re-generate
Conclusion
Versal Emulation Waveform Analysis
Introduction
Objectives
Tutorial Overview
Design Overview
Transaction Level Modeling
Steps
Step 1: Build Design
Step 2: Launching Emulation with XSIM Waveform GUI
Step 3: Using XSIM Waveform GUI and QEMU
Exploring the Waveforms
Checking Proper Boot-up Using PMC
Transactions Generated by PS (QEMU) to PL/AIE
PL to AI Engine
AI Engine RTP Signals
AI Engine to PL to DDR Memory
Limitations
Step 4: Using Vitis Analyzer
Summary
AXIS External Traffic Generator
Introduction
Objectives
Prerequisites
Tutorial Overview
Directory Structure
Before You Begin
Documentation: Explore AI Engine Architecture
Tools: Installing the Tools
Environment: Setting Up Your Target Platform Environment
Validation: Python Environment
Other Tutorials: Learn Basic Vitis Compiler and AI Engine Concepts
System View
Connecting the AXI Traffic XOs
Understanding Python
Creating a Sine Wave Data Vector
Convert Numpy to Byte Array
Generate AXI Transactions
Receive AI Engine Array Output
Convert Byte Array to Numpy
Plot the results
Running Hardware Emulation
AI Engine and Versal Integration
Section 1: Compile Kernels and AI Engine Graph
Compiling the Kernel Files Using v++
Compiling an AI Engine ADF Graph for V++ Flow
Connecting the traffic generators with V++
Section 3: Compile the A72 Host Application
Section 4: Package the Design
Section 5: Run Hardware Emulation
Summary
Support
AI Engine Performance and Deadlock Analysis Tutorial
AI Engine Graph Execution and Measurement
Graph and Kernel Code
Graph Execution Model
Graph Performance Measurement
Design Optimization Considerations
Conclusion
Support
AI Engine Deadlock Analysis
Common Deadlock Scenarios
AI Engine Deadlock Example and Analysis in AI Engine Simulator
AI Engine Stall Analysis with Vitis Analyzer
AI Engine Deadlock Detection in the Hardware Emulation Flow
AI Engine Deadlock Detection in the Hardware Flow
Conclusion
Appendix (Optional)
Manual Dump and Register Reading to Detect AI Engine Status in Hardware Emulation and Hardware
Support
AI Engine Status Analysis
Setting Up and Running the Design
Option 1: Automated and Periodic AI Engine Status Output
Analyzing the Automated Status Output
Option 2: Manual AI Engine Status Output
Analyzing the Manual Status Output
Conclusion
Support
Implementing an IIR Filter on the AI Engine
Part1a
Implementing an IIR Filter on the AI Engine - Part 1a
Preliminaries
Kernel Code
Julia Script Notes
Adaptive Dataflow Graph
Testbench Code
Build and Run the Program
Conclusion
References
Support
Part1b
Recap
Julia Script
Adaptive Dataflow (ADF) Graph
Testbench Code
Building and Running the Design
Changing Coefficients During Runtime
Conclusion
Support
Part2a
Implementing an IIR Filter on the AI Engine - Part 2a
Preliminaries
Kernel Code
Testbench Code
Analysis
Conclusion
Support
Part2b
Implementing an IIR Filter on the AI Engine - Part 2b
Preliminaries
Kernel Header
Kernel Code (AI Engine API)
Graph Code
Testbench Code
Analysis (using AI Engine API)
Generated Code
Throughput
Kernel Code (LLI)
Conclusions
Support
Post-Link Recompile of an AI Engine Application
Lab 1: Direct AI Engine Recompile Makefile Flow
Initialization
Phase 1: Compile AI Engine application and PL Kernels and Link the System
Phase 2: Recompile the AI Engine Application, Package the New System, and Rerun Hardware Emulation
Perform On-Board Testing
Support
License
Lab 2: Vitis Makefile Flow
Initialization
Phase 1: Creating a Fixed Platform from an AI Engine Application and PL Kernels
Phase 2: Using a Platform Generated by Vitis and Modifying the AI Engine Application
Perform On-Board Testing
Support
License
Python and C++ External Traffic Generators for AI Engine Simulation and Emulation Flows
Run Traffic Generators with AI Engine Simulation
Step-1: ADF Graph Modifications
Step-2: Writing the traffic generator
Matlab
1. Instantiating the XTLM Utilities
2. Transmitting the data using send_data() API
3. Receiving the data using receive_data_with_size() API
CPP
1. Instantiating the XTLM Utilities
2. Transmitting the data using send_data(data_val, tlast) API
3. Receiving the data using receive_data_with_size(expected_data_size) API
Step-3: Run the Traffic Generator with AIE Simulation
Launching the external script
Launching the AIEsim process
Make Utility to run all the flows
Viewing the Results in the Vitis Analyzer
Run External Traffic Generators with Emulation Flow
Step-1: ADF Graph Modifications
Step-2: Update the host code to only control the graph
Step-3: Linking for Hardware Emulation
Step-4: Writing the External Traffic Generator
Matlab
1. Instantiating the XTLM Utilities
2. Transmitting the data using send_data(data_val, tlast) API
3. Receiving the data using receive_data_with_size(expected_data_size) API
CPP
1. Instantiating the XTLM Utilities
2. Transmitting the data using send_data(data_val, tlast) API
3. Receiving the data using receive_data_with_size(expected_data_size) API
Step-5: Run the Traffic Generator with Emulation Flow
Launching the external script
Launching the Emulation process
SW Emulation
Make Utility to run all the flows
Using RTL IP with AI Engines
Introduction
Objectives
Tutorial Overview
Step 1 - Creating custom RTL kernels with the Vivado Design Suite
Step 2 - Creating HLS kernels with Vitis compiler
Step 3 - Interfacing ADF graph to Programmable Logic
Step 4 - Building XCLBIN
Step 5 - Build Host Application
Step 6 - Package
Step 7 - Run Emulation
To View Emulation Waveforms
Summary
Using Verilog Traffic Generators in AIE Simulation
Introduction
Objectives
Documentation
Explore AI Engine Architecture
Traffic Generator
Installing the Tools
Tutorial Overview
Section 1: Overview of the Design that Will be Used in this Tutorial
Directory Structure
Section 2: How to Integrate External RTL (Verilog/SV) Based Traffic Generator with AIE
Generating AIE Wrapper Stub Module
Adding User RTL and External Testbench
Instantiating the AIE Wrapper in the External Testbench
Generating sim_ipc_axis IPs for Vivado Project
Section 3: Launch the external process with AIEsim process
Section 4: External RTL simulation in XSIM and Other Third Party Simulator Support
Section 5: More on Traffic Generators
Summary
Support
AI Engine Compiler Features
Conditional Objects Instantiation
Introduction
Basics of Conditional Instantiation
Conditional Usage Examples
Case 1: Conditional Cascade Port
Case 2: Conditional Array of Sub-Graphs
Case 3: Conditional Sequential Sub-Graphs
Case 4: Conditional RTP Ports
Support
Multirate AI Engine Graphs
Introduction
Multirate Examples
I/O-buffer Interface
UpConv then DownConv
DownConv then UpConv
Split and Merge
Stream Interface
No Repetition Count Indicated
UpConv then DownConv
DownConv then UpConv
Split and Merge
Support
Data Multicasting
Introduction
Case 1: Stream and Buffer Multicasting
Case 2: Multirate Buffer Multicasting
Support
Design Tutorials
Versal Custom Thin Platform Extensible System
AI Engine Datamover Examples
Object oriented C++ template class kernels with AI Engine API style
Description of the AIE kernels - C function style kernels (unused in graph)
Short description of vector datamover optimizations
Special note on preparing the loop
Compile and build AIE design
Simulating and checking the AIE kernels
References
AI Engine Documentation
Vitis Unified Software Development Platform 2021.2 Documentation
Revision History
LeNet
Introduction
Tutorial Overview
Before You Begin
Tools: Installing the Tools
Environment: Setting Up the Shell Environment
Super Sampling Rate FIR Filters
Single-Kernel FIR Filter Implementation
Filter Description
Designing the Kernel
Interfaces
Data and Coefficients Management
Coefficients and Data Update Scheduling
Compilation and Analysis
Support
Multi-Kernel FIR Filter Implementation
Designing the Kernel
C++ Code Analysis
Data and Coefficients Management and Operation Scheduling
Compilation and Analysis
Support
Single-Stream Interface
Super Sampling Rate FIR Filter
Super Sampling Rate and Polyphase
Organize Computation for a 2.5 Gsps Data Stream in 2 Phases
Designing the Graph
C++ Code Analysis
Compilation and Analysis
Support
Super Sampling Rate FIR Filter with Dual-Stream Input
Dual-Stream Input Impact
Designing the Graph
C++ Code Analysis
Compilation and Analysis
Support
Beamforming
Module 01: Custom Platform
Options Table
Dependencies
Build Products
Introduction: What is a Custom Vitis Embedded Platform?
What is the Hardware Platform?
What is the Software Platform?
Platform Vivado Project
Create Platform Vivado Project
Create Block Design
Port Instantiation
AI Engine
AXI Debug Hub IP and Simulation Clock and Reset Generator IP
AXI SmartConnects
AXI Verification IPs
Clock Infrastructure
CIPS
NoC
Create Interface Connections
Clock Connections
AXI SmartConnect Connections
CIPS and NoC Connections
NoC Connections
Clocking Infrastructure Connections
CIPS Clocks
Create Address Segments
Set Platform Attributes
Control Interfaces Requirements
Memory Interface Requirements
Clock Requirements
Set Platform Attributes with for Loops
DDR4 Constraints
Create Wrapper for Block Design
Post Link Tcl Commands
Timing Closure
Emulation Setup
Platform Output Type
Wrap Up Vivado Project
Export Hardware XSA
Software Platform
Platform Create
Domain Create: AI Engine
Domain Create: Linux
BIF File
Boot Directory
Domain Create: Bare Metal
Generate Platform
References
Support
Module 02: AI Engine Design
Options Table
Dependencies
Build Products
Introduction
AI Engine Kernels, Graphs, and Applications
AI Engine Kernels and Graphs
Cascading Chain Subgraph
Downlink Subgraph
Uplink Subgraph
Test Beamforming Graph
AI Engine Application
Sending Data to the Beamforming Kernels
AI Engine Kernels Parameters
AI Engine Subgraph Window Connections
AI Engine Application Data Files
Simulating the AI Engine Graph Application
Run-Time Event API for Performance Profiling
Conclusion
References
Support
Module 03: PL Design
Dependencies
PL Master Kernels
PL Slave Kernels
Build Products
PL Kernels: Master and Slaves
PL Master Kernels
PL Master Execution Flow
Reset
Configuration
BLOCK_SIZE
NITER and ROLLOVER_ADDR
Start
Done
IP Kernelization
PL Slave Kernels
PL Slave Execution Flow
Reset
Configuration
BLOCK_SIZE
NITER and ROLLOVER_ADDR
Start
Done
AXI4-Stream Register Slice
Beamforming Design: Downlink AI Engine Graph
Beamforming Design: Uplink AI Engine Graph
References
Support
Module 04: AI Engine and PL Integration
Building the Design
Build XCLBIN from Scratch
Options
Dependencies
Build Products
Introduction: Linking the System
Timing Summary
REV0: vck190_v1_0_wrapper_timing_summary_routed.rpt
REV1: vck190_v1_0_wrapper_timing_summary_routed.rpt
REV0 Configuration File (config.ini)
[connectivity] Section
Number of Kernels
Streaming Connections
[clock] Section
[advanced] Section
New XSA Platform: rev0
Timing Closure
Timing Closure Strategy
REV1: Configuration File (config_2regslice.ini)
[connectivity] Section
[clock] Section
[vivado] Section
New XSA Platform: rev1
References
Support
Module 05: Bare-Metal PS Host Application
Introduction: Building a Bare-Metal System
Building the Design
Difference between main_partial.cpp and main_full.cpp
Generating the Platform
Compiling the PS Application Source Code
Linking the PS Application Source Code
Bare-Metal Source Code
PS Host Application
Main Function
test_dlbf/test_ulbf Functions
Reset
Configuration
Check RAM
Start
Wait for Done: Inputs
Wait for Done: Outputs
Verify Output
Test ULBF
References
Support
Module 06: System Integration: Bare Metal
Building the Design: Hardware Emulation
Dependencies
Build Products
Running the System: Hardware Emulation
Building the Design: Hardware
Dependencies
Build Products
Running the System: Hardware
References
Support
Module 07: PetaLinux
Differences between Bare Metal and PetaLinux
Building the Design
Building the PetaLinux Software Platform
Create PetaLinux: Creating the PetaLinux Project with a BSP
Config PetaLinux: Updating the PetaLinux Project with an XSA
Config PetaLinux: Customizing the Root File System
Config PetaLinux: Updating the Device Tree
Config PetaLinux: Customizing Kernel Configuration
Config PetaLinux: Clean-Up
Build PetaLinux: Building the PetaLinux Image
Build PetaLinux: Building the SDK (Target Sysroot Generation)
Build PetaLinux: Installing the SDK (Target Sysroot Generation)
Build PetaLinux: Generating the Boot Image
Build the Versal Custom PetaLinux Platform
References
Support
Module 07: PetaLinux boot
Module 08: Linux SW Application
Introduction: Programming the PS Host Application
Execution Flow Chart
Bind UIO Drivers with PL Kernels
Changes in 2023.2
Load AIE XCLBIN
Reset AI Engine
Load AI Engine with XCLBIN
Reset AI Engine in the Middle of Execution
Command-Line Arguments
Support
Module 09: System Integration: Linux
Running the System
Support
Polyphase Channelizer
Introduction
Channelizer Requirements
MATLAB Model
System Partitioning
Clock Rate and SSR Planning
Circular Buffer
Polyphase Filterbank
Cyclic Shift Buffer
IDFT
Design Overview
Polyphase Filterbank Design
Discrete Fourier Transform Design
Build and Run Design
Setup & Initialization
Hardware Emulation
Hardware
Estimating Power Using the Power Design Manager
Step 1: Building the Design for VCK190 and Executing Power Targets
Step 2: Creating a New Project
Step 3: Refining the AI Engine Power Estimate Using Simulated Design and Switching Activities
References
Support
License
Prime Factor FFT
Prime Factor FFT-1008
Introduction
MATLAB Models
I/O Permutations (2D Case)
I/O Permutations (3D Case)
Design Overview
INPUT PERMUTE Kernel
FFT-7 Kernel
TRANSPOSE1 Kernel
FFT-9 Kernel
TRANSPOSE2 Kernel
FFT-16 Kernel
OUTPUT PERMUTE Kernel
Design Resources
Build and Run Design
Setup & Initialization
Hardware Emulation
Hardware
References
Support
License
2D-FFT
AIE
2023.2 Versal 2D-FFT Implementation Using Vitis Acceleration Library Tutorial (XD073)
AI Engine Implementation
Building the Design
Make Steps
Build the Entire Design with a Single Command
make kernels: Compiling PL Kernels
make graph: Creating the AI Engine ADF Graph for Vitis Compiler Flow
make xsa: Using the Vitis Tools to Link AI Engine and HLS Kernels with the Platform
make application: Compiling the Host Application
make package: Packaging the Design
make run_emu: Running Hardware Emulation
Running on Hardware
Hardware Design Details
Design Details
AI Engine and PL Kernels
dma_hls
Software Design Details
AI Engine Kernels and Graph Representation
Adaptive Data Flow (ADF) Graph
Defining the Graph Class
Top-Level Application
PL Data Mover Kernel
dma_hls (dma_hls.cpp)
Top Function Declaration
Top Function Definition
PS Host Application
HLS
2023.2 Versal 2D-FFT Implementation Using Vitis Acceleration Library Tutorial (XD073)
HLS Implementation
Building the Design
Make Steps
Build the Entire Design with a Single Command
make kernels: Compile PL Kernels
make xsa: Using the Vitis Tools to Link HLS Kernels with the Platform
make application: Compile the Host Application
make package: Packaging the Design
make run_emu: Running Hardware Emulation
Running on Hardware
Hardware Design Details
Design Details
HLS/PL Kernels
FFT_2D
DMA_HLS
Software Design Details
HLS/DSP Kernel Representation
Data Flow
Define FFT Inputs
Required Headers and Function Declarations
FFT Core Config Structure
Top Function
Sub-Function Details
Reading Data
FFT Function
Writing Out Data
PL Data Mover Kernel
dma_hls (dma_hls.cpp)
Top Function Declaration
Top Function Definition
PS Host Application
FIR Filter
AIE
Building the Design
Make Steps
Build the Entire Design with a Single Command
make kernels: Compile PL Kernels
make graph: Creating the AI Engine ADF Graph for Vitis Compiler Flow
make xsa: Use Vitis Tools to Link AI Engine and HLS Kernels with the Platform
make application: Compile the Host Application
make package: Package the Design
make run_emu: Run Hardware Emulation
Run on Hardware
Hardware Design Details
Design Details
AI Engine and PL Kernels
Software Design Details
Data Flow Graph
Define the Graph Class
Instantiate DSPLib FIR Filters
Add Connectivity Information
Top Level Application
PL Kernels
datamover (datamover.cpp)
Arguments
pragma HLS INTERFACE s_axilite
pragma HLS INTERFACE axis
pragma HLS PIPELINE II=1
PS Host Application
Include graph.cpp
load_xclbin Function
Datamover Class
FIR Chain Class
Main Function
1. Check Command Line Argument
2. Open XCLBIN
3. Create and Initialize Data Mover Kernels and FIR Chain Graph
4. Run the Data Mover Kernel and FIR Chain Graph
5. Wait for Data Mover Kernels to Complete
6. Verify Output Results
7. Release Allocated Resources
References
AI Engine Documentation
Support
HLS
Building the Design
Make Steps
Build the Entire Design with a Single Command
make kernels: Compile PL Kernels
make xsa: Use Vitis Tools to Link HLS Kernels with the Platform
make application: Compile the Host Application
make package: Package the Design
make run_emu: Run Hardware Emulation
Run on Hardware
Hardware Design Details
Design Details
HLS PL Kernels
Software Design Details
N-Body Simulator
Running the Simulation
Python Simulations on x86 Machine
Results
(Optional) Creating Animation GIFs
Next Steps
Support
Build the Design
AI Engine Design
A Single Nbody() Kernel
Four NBody() Kernels Packet Switched
Workload Distribution and input_j
100 N-Body Subsystems
Why Packet Switching?
(Optional) Simulate the AI Engine Design
References
Next Steps
Support
Building the Design
Step 1: Set the Vitis Utility Library path
Step 2: Generate m2s_x2.cpp and s2m_x4.cpp Datamover kernels
Step 3: Compile HLS PL Kernels
HLS PL Kernels
m2s_x2
packet_sender
packet_receiver
s2m_x4
References
Next Steps
Support
Full System Design
Design Implementation
References
Next Steps
Support
Host Software
Step 1: Compile Host Software
Step 2: Link Host Software
Host Software
NBodySimulator API
Logger API
Log Levels
Host Applications
References
Next Steps
SD Card Image Generation and Hardware Run
SD Card Image Generation
Booting the VCK190 Board
Running the Design on Hardware
References
Next Steps
Support
Results
Latency Performance Comparisons
Design Throughput Calculations (Effective vs. Theoretical)
(Optional) Building x1_design and x10_design
Building the x1_design (simulates 128 particles)
Building the x10_design (simulates 1,280 particles)
Support
x10 Design Results
x1 Design Results
Digital Down-conversion Chain: Converting from Intrinsics to API
Table of Contents
Introduction
Upgrading Tools, Device Speed Grade, and Makefile
Upgrading the Code
Converting Kernel Functions to Kernel Classes
Migrating from Windows to Buffers
Replacing Intrinsics with APIs
Relocating Global Variables to Kernel Class Data Members
Handling State Variables to Enable x86sim
Updating Older Pragmas
Supporting x86 Compilation and Simulation
Building and Running the Design
Setup and Initialization
x86 Functional Simulation
Hardware Simulation
Summary
Support
License
Versal GeMM Implementation
GeMM AI Engine Implementation
Building the Design
Make Steps
Build the Entire Design with a Single Command
make kernels: Compiling PL Kernels
make graph: Creating the AI Engine ADF Graph for Vitis Compiler Flow
make xsa: Using the Vitis Tools to Link AI Engine and HLS Kernels with the Platform
make application: Compiling the Host Application
make package: Packaging the Design
make run_emu: Running Hardware Emulation
Running on Hardware
Hardware Design Details
Design Details
AI Engine and PL Kernels
dma_hls
Software Design Details
GeMM DSP58 Implementation
Building the Design
Make Steps
Build the Entire Design with a Single Command
make kernels: Generates the PL Kernels
make xsa: Using the Vitis Tools to Link HLS Kernels with the Platform
make application: Compile the Host Application
make package: Packaging the Design
make run_emu: Running Hardware Emulation
Running on Hardware
Hardware Design Details
PL Kernel Details
Platform Details
Software Design Details
Bilinear Interpolation
Introduction
Computing Interpolated Values
Design Assumptions
Programmable Logic Interface
PLIO Interface
AI Engine Test Vectors
AI Engine Code Vectorization
AI Engine Floating-Point Vector Unit
AI Engine Kernel Processing
Running the Example
Generating Test Vectors
Running x86 Simulation
Running AI Engine Simulation
Analyzing Results
Vitis Analyzer
Test Vector Comparison
Customizing the Example
Specifying a Test Image and Output Resolution
Multicore Processing
References
Support
License