1. The functionality of the CUs - 2024.2 English
Vitis Libraries
Release Date
2024-11-29
Version
2024.2 English
Vitis Libraries
Release Information
Vitis Data Compression Library
Vitis DSP Library
Vitis Motor Control Library
Vitis Vision Library
New Features and Functions
Known issues
Developer Guide
Compilation and Execution
Set Up the Environment
HLS Cases Command Line Flow
Run a L1 Example
Vitis Cases Command Line Flow
Run a L2 Example
Run a L3 Example
Executing Vitis Library in the Vitis IDE
Download the Library Template in the IDE
Create the L2 Application Library Project in the IDE
Create the L3 Application Library Project in the IDE
Libraries
Vitis BLAS Library
Introduction
Overview
Software Platform
PCIe Accelerator Card
Release Note
2020.1
2021.1
2024.1
User Guide
L1 Primitives User Guide
L1 API Overview
Introduction
1. Introduction
2. L1 Primitives Usage
3. Matrix storage used in L1 primitives
L1 Compute APIs
amax
amin
asum
axpy
copy
dot
gbmv
gemv
nrm2
scal
swap
symv
trmv
L1 Data mover
1. Matrix Storage Format
2. Data Mover APIs
sbmSuper2Stream
sbmSub2Stream
gbm2Stream
vec2GbMatStream
tbmSuper2Stream
tbmSub2Stream
vec2TbUpMatStream
vec2TbLoMatStream
gem2Stream
vec2GemStream
symUp2Stream
symLo2Stream
spmUp2Stream
spmLo2Stream
vec2SymStream
trmUp2Stream
trmLo2Stream
tpmUp2Stream
tpmLo2Stream
vec2TrmUpStream
vec2TrmLoStream
readVec2Stream
writeStream2Vec
L1 Test
Python Environment Setup Guide
L2 Kernels User Guide
L2 API Overview
Introduction
1. Introduction
2. L2 Kernel Usage
Blas Function Kernel
GEMM Kernel
Architecture
Systolic Array
Matrix Block Partition
Data Movers
Transpose
Double Buffers
L2 API benchmark
L2 GEMM benchmark
1. gemm_4CU
1.1 Executable Usage
1.1.1 Work Directory (Step 1)
1.1.2 Build the Kernel (Step 2)
1.1.3 Run the Kernel (Step 3)
1.1.4 Example Output (Step 4)
1.2 Profiling
L2 GEMV benchmark
1. gemvStreamCh16
1.1 Executable Usage
1.1.1 Work Directory (Step 1)
1.1.2 Build the Kernel (Step 2)
1.1.3 Run the Kernel (Step 3)
1.1.4 Example Output (Step 4)
1.2 Profiling for the Alveo U280
1.3 Profiling for the Alveo U50
L3 API User Guide
L3 API Overview
Introduction
1. Introduction
1.1 Data Layout
1.2 Memory Allocation
Restricted Memory Version
Default Memory Version
Pre-allocated Memory Version
1.3 Supported Datatypes
2. Using the Vitis BLAS API
2.1 General Description
2.1.1 Error Status
2.1.2 Vitis BLAS Initialization
2.2 Datatypes Reference
2.2.1 xfblasStatus_t
2.2.2 xfblasEngine_t
2.2.3 xfblasOperation_t
2.3 Vitis BLAS Helper Function Reference
2.3.1 xfblasCreate
2.3.2 xfblasFree
2.3.3 xfblasDestroy
2.3.4 xfblasMalloc
2.3.5 xfblasSetVector
2.3.6 xfblasGetVector
2.3.7 xfblasSetMatrix
2.3.8 xfblasGetMatrix
2.3.9 xfblasMallocRestricted
2.3.10 xfblasSetVectorRestricted
2.3.11 xfblasGetVectorRestricted
2.3.12 xfblasSetMatrixRestricted
2.3.13 xfblasGetMatrixRestricted
2.3.14 xfblasMallocManaged
2.3.15 xfblasExecute
2.3.16 xfblasExecuteAsync
2.3.17 xfblasGetByPointer
2.4 Vitis BLAS Function Reference
2.4.1 xfblasGemm
3. Obtain FPGA bitstream
L3 API example
L3 API benchmark
L3 API GEMM benchmark
1. streamingKernel
1.1 Executable Usage
1.1.1 Work Directory (Step 1)
1.1.2 Build the Kernel (Step 2)
1.1.3 Run the Kernel (Step 3)
1.1.4 Example Output (Step 4)
1.1.5 Use the Script to Run the Benchmark
1.2 Profiling
L3 API test
L3 Python bindings
1. Introduction
1.1 Set Python Environment
1.2 Build the Shared Library
2. Using the Vitis BLAS L3 Python API
2.1 General Description
2.1.1 Vitis BLAS Initialization
2.2 Vitis BLAS Helper Function Reference
2.3 Using Python APIs
Python Environment Setup Guide
Benchmark
1. Performance
1.1 gemv
1.2 gemm
2. Benchmark Test Overview
2.1 Prerequisites
2.1.1 Vitis BLAS Library
2.2 Building
2.2.1 Download Code
2.2.2 Set Up the Environment
Vitis Codec Library
Introduction
Overview
Requirements
Software Platform
PCIe Accelerator Card
License
Trademark Notice
Release Note
2022.1
Vitis Codec Library Tutorial
Vitis Codec and Hardware Acceleration
Lab-1: How Vitis Codec Library Works
Get the Vitis Codec Library
Get the Dependencies
Setup Environment
Download the Vitis Graph Library
Command to Run L1 cases
Command to Run L2 cases
Lab-2: Using L1-level API to evaluate JPEG decoding acceleration
Lab purpose
Operation steps
(1) Learn about run_hls.tcl file
(2) CSIM:
(3) Synthesis:
(4) COSIM:
(5) Design with export
Lab summary
Lab-3: Using L2-level API to implement a single-kernel acceleration for JPEG decoding
Lab purpose
Operation steps
(1) Understand the Work Directory
(2) Build kernel for different modes
(3) Run kernel in Software-Emulation mode
(4) Run kernel in Hardware-Emulation mode
(5) Run kernel in Hardware
Lab summary
Lab-4: Using multi-kernel solution to accelerate WebP encoding based on open-source project
Lab purpose
Operation steps
(1) Open source project analysis and kernel partition
(2) Project files for multi-kernel design
(3) Software Emulation
(4) Hardware Emulation
(5) Hardware Build and Check Resource Consumption
(6) Hardware Running
Lab summary
Tutorial Summary
L1 User Guide
API Document
namespace codec
namespace details
mcu_decoder
hls_next_mcupos2
namespace internal
enum xf::codec::internal::Type
struct xf::codec::internal::HybridUint
enum xf::codec::COLOR_FORMAT
struct xf::codec::decOutput
struct xf::codec::hls_compInfo
struct xf::codec::hls_huff_DHT
struct xf::codec::hls_huff_segment
struct xf::codec::img_info
struct xf::codec::sos_data
Design Internals
kernelParserDecoderTop
JPEG Huffman Decoder
Executable Usage
Profiling
Internal Design of Order Tokenize
Overview
Implementation
Profiling
L2 User Guide
API Document
namespace codec
namespace details
struct xf::codec::details::hls_huff_DHT
struct xf::codec::details::hls_huff_segment
struct xf::codec::details::sos_data
template class xf::codec::details::BicubicInterpolator
enum xf::codec::COLOR_FORMAT
struct xf::codec::bas_info
struct xf::codec::cmp_info
struct xf::codec::img_info
Design Internals
JPEG Decoder
Overview
Algorithm
Implementation
Profiling
PIK Encoder
Internal Designs
Executable Usage
Profiling
Result
Lepton Encoder
Internal Designs
Software and system requirements
Building the accelerated Lepton encoder
Running the accelerated Lepton encoder
Performance
WebP Encoder
Implementation
Performance
Software and system requirements
Building the accelerated WebP encoder
Running the accelerated WebP encoder
Resize Down
Overview
Implementation
Interface
JXL Encoder
Overview
Executable Usage
Profiling
Result
Benchmark
JPEG Decoder
Executable Usage
Profiling
PIK Encoder
Executable Usage
Profiling
Result
Resize
Executable Usage
Profiling
WebP Encoder
Executable Usage
Profiling
JXL Encoder
Executable Usage
Profiling
Result
Vitis Database Library
Introduction
Overview
Overview
Generic Query Engine
Release Note
2022.2
2022.1
2021.2
2021.1
2020.2
2020.1
2019.2
Internal Release
Requirements
FPGA Accelerator Card
Software Platform
Development Tools
Dependency
Design Flows
Shell Environment
HLS Cases Command Line Flow
Vitis Cases Command Line Flow
Vitis Database Library Tutorial
Relational Database and Hardware Acceleration
How the Vitis Database Library Works
L3 API – General Query Engine
Target Audience and Major Features
Example Usage
L2 API – GQE Kernels
Target Audience and Major Features
Command to Run L2 Cases
L1 API
Target Audience and Major Features
Command to Run L1 Cases
User Guide
L1 Module User Guide
Primitive Overview
1. Stream-based Interface
2. Implementing Scan
3. Implementing Hash
4. Implementing Filter
5. Implementing Evaluation
6. Implementing Bloom Filter
7. Implementing Join
7-1. Join Implementation Summary
7-2. Hash-Join
7-3. Hash-Semi-Join
7-4. Hash-Anti-Join
7-5. Hash-Multi-Join
7-6. Merge-Join and Merge-Left-Join
7-7. Nested-Loop-Join
8. Implementing Group-by Aggregation
8-1. Sorted Rows Group-Aggregate
8-2. On-Chip Group-Aggregate
8-3. Off-chip Group-Aggregate
9. Implementing Hash Partition
10. Implementing Sort
10-1. Sort Implementation Summary
10-2. Bitonic-Sort
10-3. Insert-Sort
10-4. Merge-Sort
11. Glue Logic
11-1. Combine and Split Columns
Primitive APIs in xf::database
aggregate
aggregate overload (1)
aggregate overload (2)
aggregate overload (3)
bitonicSort
bfGen
bfGenStream
bfCheck
combineCol
combineCol overload (1)
combineCol overload (2)
combineCol overload (3)
combineCol overload (4)
splitCol
splitCol overload (1)
splitCol overload (2)
splitCol overload (3)
splitCol overload (4)
compoundSort
directGroupAggregate
directGroupAggregate overload (1)
directGroupAggregate overload (2)
duplicateCol
dynamicEval
dynamicEvalV2
dynamicFilter
dynamicFilter overload (1)
dynamicFilter overload (2)
dynamicFilter overload (3)
dynamicFilter overload (4)
groupAggregate
groupAggregate overload (1)
groupAggregate overload (2)
groupAggregate overload (3)
groupAggregate overload (4)
hashAntiJoin
hashGroupAggregate
hashJoinMPU
hashJoinMPU overload (1)
hashJoinMPU overload (2)
hashJoinV3
hashBuildProbeV3
hashJoinV4
hashBuildProbeV4
hashLookup3
hashLookup3 overload (1)
hashLookup3 overload (2)
hashLookup3 overload (3)
hashMultiJoin
hashMultiJoinBuildProbe
hashMurmur3
hashMurmur3Hive
hashPartition
hashSemiJoin
insertSort
insertSort overload (1)
insertSort overload (2)
mergeJoin
mergeLeftJoin
mergeSort
mergeSort overload (1)
mergeSort overload (2)
nestedLoopJoin
scanCmpStrCol
scanCol
scanCol overload (1)
scanCol overload (2)
scanCol overload (3)
scanCol overload (4)
scanCol overload (5)
scanCol overload (6)
scanCol overload (7)
scanCol overload (8)
scanCol overload (9)
scanCol overload (10)
scanCol overload (11)
scanCol overload (12)
scanCol overload (13)
staticEval
staticEval overload (1)
staticEval overload (2)
staticEval overload (3)
staticEval overload (4)
Primitive Design Internals
Internals of Dynamic-Filter
Internal Structure
Limitations
Generating Config Bits
Internals of Dynamic-Evaluation
Internals of Lookup3 and Murmur3 Hash
Murmur3 and Lookup3 Hash Introduction
Acceleration of Murmur3 and Lookup3 Hash
Internals of Bloom-Filter
Internals of Group-Aggregate (Using Sorted Rows)
Internals of Direct-Group-Aggregate
Internals of Hash-Group-Aggregate (Generic Version)
Internals of Hash-Join (Multi-Process-Unit Version)
Internals of Hash-Join-v3 and Hash-Build-Probe-v3
Internals of Hash-Join-v4 and Hash-Build-Probe-v4
Internals of Hash-Semi-Join (Multi-Process-Unit Version)
Principle
Structure
Internals of Hash-Anti-Join
Internals of Hash-Multi-Join
Internals of Hash-Partition
Internals of Merge-Join and Merge-Left-Join
User Guide
Structure
Internals of Nested-Loop-Join
User Guide
Structure
Internals of Combine-Split-Unit
Internals of Bitonic Sort
Internals of Insert Sort
Principle
Synthesis Results
Implementation Results
Internals of Merge Sort
Principle
Synthesis Results
Implementation Results
Internals of Scan
Query-Specific Acceleration Demo
TPC-H Query 5 Simplified
TPC-H Query 5
TPC-H Query 6 Modified
L2 GQE Kernel User Guide
GQE Kernel Design
3-in-1 Kernel
Meta Information
Unified Kernel Command
Join Flow
Bloom-Filter Flow
64-bit Partition Flow with Bloom Filter Build/Probe
Aggregate Kernel
GQE Kernel APIs
gqeKernel
gqeAggr
gqePart
GQE Kernel Configuration APIs
class xf::database::gqe::KernelCommand
Overview
Methods
KernelCommand
setBypassOn
setJoinOn
setJoinType
setJoinAppendMode
setBloomfilterOn
setBloomfilterSize
setPartOn
setLogPart
setAggrOn
setDualKeyOn
setJoinBuildProbe
setBloomfilterBuildProbe
setScanColEnable
setWriteColEnable
setRowIDValidEnable
setFilter
getConfigBits
class xf::database::gqe::AggrCommand
Overview
Methods
AggrCommand
Scan
setEvaluation
setEvaluation overload (1)
setEvaluation overload (2)
setFilter
setFilter overload (1)
setShuffle0
setShuffle1
setShuffle2
setShuffle3
setGroupAggr
setGroupAggrs
setMerge
columnMerge
setDirectAggrs
setWriteCol
getConfigBits
getConfigOutBits
L3 GQE Overlay User Guide
GQE L3 Design
Overview
Joiner Design
Workshop Design
Bloom-Filter Design
Class Specifications
Example Usage
Group-By Aggregate Design
GQE L3 APIs
class xf::database::gqe::Table
Overview
Methods
Table
addCol
addCol overload (1)
addCol overload (2)
addCol overload (3)
addCol overload (4)
genRowIDWithValidation
genRowIDWithValidation overload (1)
genRowIDWithValidation overload (2)
genRowIDWithValidation overload (3)
setRowNum
getRowNum
getSecRowNum
getColNum
getSecNum
checkSecNum
getColTypeSize
getColPointer
getValColPointer
getValColPointer overload (1)
getColPointer
setColNames
getColNames
getRowIDColName
getValidColName
getRowIDEnableFlag
getValidEnableFlag
~Table
info
class xf::database::gqe::Joiner
Overview
Inherited Members
Methods
Joiner
run
class xf::database::gqe::BloomFilter
Overview
Methods
BloomFilter
build
merge
getHashTable
getBloomFilterSize
class xf::database::gqe::Filter
Overview
Inherited Members
Methods
Filter
~Filter
run
class xf::database::gqe::Aggregator
Overview
Methods
Aggregator
aggregate
class xf::database::gqe::TableSection
class xf::database::gqe::Workshop
Benchmark Result
Benchmark
Compound Sort
Executable Usage
Profiling
Hash Anti-join
Dataset
Executable Usage
Profiling
Hash Group Aggregate
Dataset
Executable Usage
Profiling
Hash Join V2
Dataset
Executable Usage
Profiling
Hash Join V3
Dataset
Executable Usage
Profiling
Hash Join V4
Dataset
Executable Usage
Profiling
Hash Multi-Join
Dataset
Executable Usage
Profiling
Hash Semi-Join
Dataset
Executable Usage
Profiling
TPC-H Queries with GQE
Vitis Data Analytics Library
Introduction
Overview
Release Note
2024.1
2023.2
2023.1
2022.2
2022.1
2021.2
2021.1
2020.2
2020.1
Requirements
FPGA Accelerator Card
Software Platform
Development Tools
Design Flows
Vitis Data Analytics Library Tutorial
Data Analytics and Hardware Acceleration
How Vitis Data Analytics Library Works
L3 API – CSV Scanner Engine
Target Audience and Major Features
Command to Run L3 Cases
L2 API – CSV Scanner Kernels
Target Audience and Major Features
Command to Run L2 Cases
L1 API
Target Audience and Major Features
Command to Run L1 Cases
L1 User Guide
Hardware Classes
template class xf::data_analytics::classification::logisticRegressionPredict
Overview
Methods
pickFromK
pick
predict
setWeight
setIntercept
template class xf::data_analytics::regression::linearLeastSquareRegressionPredict
Overview
Methods
setWeight
setIntercept
predict
template class xf::data_analytics::regression::LASSORegressionPredict
Overview
Methods
setWeight
setIntercept
predict
template class xf::data_analytics::regression::ridgeRegressionPredict
Overview
Methods
setWeight
setIntercept
predict
template class xf::data_analytics::common::SGDFramework
Overview
Methods
seedInitialization
setTrainingConfigs
setTrainingDataParams
initGradientParams
calcGradient
updateParams
train
Hardware Functions in xf::data_analytics::classification
decisionTreePredict
axiVarColToStreams
naiveBayesTrain
naiveBayesPredict
svmPredict
Hardware Functions in xf::data_analytics::clustering
kMeansPredict
Hardware Functions in xf::data_analytics::regression
decisionTreePredict
Hardware Functions in xf::data_analytics::text
Hardware Functions in xf::data_analytics::dataframe
csvParser
csvParser overload (1)
csvParser overload (2)
jsonParser
readFromDataFrame
writeToDataFrame
Software C Functions
xf_re_compile
Design Internals
CSV Parser
Features
Overall Structure
JSON Parser
Features
Limitations
Overall Structure
Decision Tree (Predict)
Overview
Algorithm (predict)
Implementation (predict)
Profiling
K-Means (Predict)
Linear Regression (Predict)
Linear Least Square Regression
LASSO Regression
Ridge Regression
Implementation (inference)
Logistic Regression (Predict)
Logistic Regression Classifier
Implementation (inference)
Multinomial Naive Bayes
Overview
Implementation
Resource Utilization
Benchmark Result on Board
Internals of svm_predict
Regular Expression Virtual Machine (regex-VM)
Overview
User Guide
Regex-VM Coverage
Regex-VM Usage
Implementation
Profiling
WriteToDataFrame
Data Frame Format (on DDR)
Input Data Stream
Overall Structure
ReadFromDataframe
Input Data
Output Data Stream
Overall Structure
StringCompare
Overview
Implementation
string EQUAL
string IN
string LIKE
Performance and Resource
string IN
string LIKE
L2 User Guide
Kernel Templates in xf::data_analytics::clustering
kMeansTrain
Kernel Templates in xf::data_analytics::regression
linearLeastSquareRegressionSGDTrain
ridgeRegressionSGDTrain
LASSORegressionSGDTrain
Kernel Templates in xf::data_analytics::text
reEngine
Kernel Templates in xf::data_analytics::dataframe
csv_scanner
Kernel Templates in xf::data_analytics::geospatial
knn
strtreeTop
Design Internals
Decision Tree (training)
Overview
Basic Algorithm
Implementation
Resource Utilization
Internals of kMeansTrain
Training Resources (Device: Alveo U250)
Training Performance (Device: Alveo U250)
Random Forest (training)
Overview
Basic Algorithm
Implementation
Resource Utilization
Stochastic Gradient Descent Framework
Linear Least Square Regression Training
LASSO Regression Training
Ridge Regression Training
Implementation (Training)
Internals of svm_train
Overview
Basic Algorithm
Implementation
Config description
Resource Utilization
Benchmark Result on the Board
Regular Expression Engine (reEngine)
Overview
User Guide
reEngine Usage
Implementation
Profiling
GeoIP Engine
Overview
Implementation
Input requirements
Kernel Design
Resource Utilization
Two Gram Predicate
Overview
Implementation
Resource Utilization
STRTree Engine
Overview
Algorithm
Implementation
blockSort
mergeTreeSort
Resource Utilization
GeoSpatial K-nearest Neighbors
Overview
Kernel Implementation
End2End Performance
Resource Utilization
L3 User Guide
Software Acceleration Classes
enum xf::data_analytics::text::re::ErrCode
Overview
Detailed Documentation
Enum Values
class xf::data_analytics::text::re::RegexEngine
Overview
Methods
RegexEngine
~RegexEngine
compile
getCpgpNm
match
RegexEngine
class sssd_engine::DataEngineConfig
Overview
Methods
DataEngineConfig
genConfigBits
class sssd_engine::data_engine_sc::DataEngine
Overview
Fields
Methods
DataEngine
~DataEngine
pushRequest
release
class sssd_engine::SmartSSDCache
Overview
Methods
getCardNum
SmartSSDCache
~SmartSSDCache
addFile
scanFile
listFiles
print_input
print_output
release
Regular Expression Acceleration
Getting Started
Limitation
Example Usage
CSV Scanner
Getting Started
Limitation
Example Usage
Benchmark Result
Naive Bayes
Dataset
Executable Usage
Profiling
Support Vector Machine
Dataset
Executable Usage
Profiling
Log Analyzer
Dataset
Executable Usage
Profiling
Duplicate Record Match
Dataset
Executable Usage
Profiling
Vitis Data Compression Library
Introduction
Overview
Software Platform
PCIe Accelerator Card
Release Note
2024.2
2022.2
2022.1
2021.2
2021.1
2020.2
User Guide
Typical Use Cases
L1 Module User Guide
Primitive Overview
Stream-based Interface
Primitive APIs in xf::compression
blockPacker
huffmanDecoderLL
huffmanDecoder
huffmanEncoderStream
lz4Compress
lz4Decompress
lz4DecompressEngine
lz4DecompressEngine_NinMout
lzDecompress
lzMultiByteDecompress
lzBestMatchFilter
lzBestMatchFilter overload (1)
lzBooster
lzBooster overload (1)
lzBooster overload (2)
lzFilter
snappyCompress
snappyDecompress
zstdCompressCore
zstdCompressStreaming
zstdCompressQuadCore
zstdCompressMultiCoreStreaming
zstdDecompressStream
zstdDecompressCore
L2 Kernel User Guide
L2 Kernel Demos
Kernel APIs Reference
Global Functions
xilAdler32
xilChecksum32
xilCrc32
xilGzipCompressFixedStreaming
xilGzipCompBlock
xilGzipComp
xilGzipCompressStreaming
xilLz4Compress
xilLz4CompressStream
xilLz4Decompress
xilLz4DecompressStream
xilLz4Packer
xilSnappyCompress
xilSnappyCompressStream
xilSnappyDecompress
xilSnappyDecompressStream
xilSnappyDecompressStream overload (1)
xilSnappyDecompressStream overload (2)
xilZlibCompressFull
xilHuffmanKernel
xilLz77Compress
xilTreegenKernel
xilZstdCompress
xilZstdCompress overload (1)
xilZstdCompress overload (2)
xilZstdDecompressStream
Demos
List of Demos
AMD GZip Compression and Decompression
Executable Usage
Results
Resource Utilization
Compression
Decompression
Performance Data
Standard GZip Support
AMD LZ4 Compression and Decompression
Results
Resource Utilization
Performance Data
Software and Hardware
Executable Usage
AMD LZ4-Streaming Compression and Decompression
Results
Resource Utilization
Performance Data
Executable Usage
AMD Snappy Compression and Decompression
Results
Resource Utilization
Performance Data
Software and Hardware
Usage
Build Steps
Emulation Flows
Hardware
Executable Usage
AMD Snappy-Streaming Compression and Decompression
Results
Resource Utilization
Performance Data
Executable Usage
Tests
List of Tests
AMD LZ4 Streaming Compression
Executable Usage
Resource Utilization
Performance Data
AMD Snappy Compression
Executable Usage
Resource Utilization
Performance Data
AMD GZip Compression
Executable Usage
Results
Resource Utilization
Performance Data
Standard GZip Support
AMD GZip Streaming Compression
Executable Usage
Results
Resource Utilization
Performance Data
Standard GZip Support
AMD GZip Streaming 16KB Compression
Executable Usage
Results
Resource Utilization
Performance Data
Standard GZip Support
AMD GZip Streaming 8KB Compression
Executable Usage
Results
Resource Utilization
Performance Data
Standard GZip Support
AMD GZip Static Streaming Compression
Executable Usage
Results
Resource Utilization
Performance Data
Standard GZip Support
AMD Zlib Streaming Compression
Executable Usage
Results
Resource Utilization
Performance Data
Standard GZip Support
AMD Zlib Streaming 16KB Compression
Executable Usage
Results
Resource Utilization
Performance Data
Standard GZip Support
AMD Zlib Streaming 8KB Compression
Executable Usage
Results
Resource Utilization
Performance Data
Standard GZip Support
AMD Zlib Streaming Static Compression
Executable Usage
Results
Resource Utilization
Performance Data
Standard GZip Support
AMD ZSTD Compression
Results
Overall Resource Utilization
Performance Data
AMD LZ4 Streaming Decompression
Executable Usage
Resource Utilization
Performance Data
AMD Snappy Streaming Decompression
Executable Usage
Resource Utilization
Performance Data
AMD ZSTD Decompression
Results
Overall Resource Utilization
Performance Data
L3 Overlay User Guide
L3 Overlay APIs
Kernel Design
LZ Data Compression
Overview
Compression Kernel Design
Decompression Kernel Design
Implemented Algorithms
Overlay API Reference
Demos
List of Demos
LZ4 Application
Executable Usage
GZip Application
Overview
Executable Usage
Benchmark Results
Datasets
Compression Performance
De-Compression Performance
Test Overview
Vitis Data Compression Library
Compression Tutorial
Data Compression and Hardware Acceleration
Why Acceleration is Required and How it Helps
How the Data Compression Library Works
L3 API
Executable Usage
L2 API
Commands to Run L2 and L3 Cases
L1 API
Command to Run L1 Cases
Vitis Data Mover Library
Introduction
Overview
Release Note
2023.1
Requirements
Software Platform
Development Tools
Design Flows
Shell Environment
HLS Cases Command Line Flow
Data-Mover User Guide
Static Data-Mover User Guide
Table of Contents
Programmable 4D Data-Mover User Guide
Table of Contents
Vitis DSP Library
Introduction
Overview
Software Platform
PCIe Accelerator Card
Release Note
2024.2
2024.1
2023.2
2023.1
2022.2
2022.1
2021.2
2021.1
2020.2
2020.1
L1 PL DSP Library User Guide
1-Dimensional(Line) SSR FFT L1 FPGA Module
Table of Contents
2-Dimensional(Matrix) SSR FFT L1 FPGA Module
Table of Contents
L2 AIE DSP Library User Guide
Introduction
Navigating Content by Design Process
Organization
Using Library Elements within User Defined Graphs
Known Issues
Vitis Tutorials
DSP Library Functions
Bitonic Sort
Entry Point
Device Support
Supported Types
Template Parameters
Access Functions
Ports
Design Notes
Cascade Feature
Constraints
Code Example
Convolution / Correlation
Entry Point
Device Support
Supported Types
Template Parameters
Access Functions
Ports
Design Notes
IO Buffer Interface
Streaming Interface
Scaling
Saturation
Code Example
Convolution
Correlation
DDS / Mixer
DDS Mixer
Entry Point
Device Support
Supported Types
Template Parameters
Access Functions
Ports
Design Notes
Scaling
Super Sample Rate Operation
Super Sample Rate Sample to Port Mapping
Implementation Notes
Code Example
DDS Mixer using lookup tables
Entry Point
Device Support
Supported Types
Template Parameters
Access Functions
Ports
Design Notes
Scaling
SFDR
Super Sample Rate Operation
Implementation Notes
Code Example
DFT
Entry Point
Device Support
Supported Types
Template Parameters
Access Functions
Ports
Design Notes
Scaling
Batch Processing
Cascaded Kernels
SSR
Maximum Point Size
Zero Padding Data for Alignment
Code Example
FFT/iFFT
FFT/IFFT 1CH (AIE-only)
Entry Point
Device Support
Supported Types
Template Parameters
Access Functions
Ports
Design Notes
Dynamic Point Size
Super Sample Rate Operation
Super Sample Rate Sample to Port Mapping
Scaling
Rounding and Saturation
Cascade Feature
Twiddle mode
Constraints
Use of single_buffer
Code Example
Configuration Notes
Configuration for Performance Versus Resource
Scenarios
Parameter Legality Notes
VSS FFT/IFFT 1CH (AIE + PL)
Entry Point
Device Support
Supported Parameters
Design Notes
Super Sample Rate
Padding Input Data based on Super Sample Rate and Point Size
2D FFT/iFFT
Entry Point
Device Support
Supported Types
Template Parameters
Access Functions
Ports
Code Example
FFT Window
Entry Point
Device Support
Supported Types
Template Parameters
Access Functions
Ports
Design Notes
Dynamic Point Size
Super Sample Rate Operation
Super Sample Rate Sample to Port Mapping
Scaling
Saturation
Constraints
Code Example
Filters
Entry Point
Device Support
Supported Types
Template Parameters
Access Functions
Ports
Design Notes
Coefficient Array for Filters
Static Coefficients
Static Coefficients - Array Size
Reloadable Coefficients
Reloadable Coefficients - Array Size for Non-SSR Cases
Reloadable Coefficients - Array Dimensions for SSR Cases
Window Interface for Filters
Multiple Buffer Ports
Maximum Window Size
Single Buffer Constraint
Streaming Interface for Filters
Stream Output
Stream Input for Asymmetric FIRs
Stream Input for Symmetric FIRs
Setting FIR Frame Size
Setting FIR Length
Maximum FIR Length
Maximum Window Based FIRs Length
Maximum Stream based FIRs Length
Minimum Cascade Length
Optimum Cascade Length
Super Sample Rate
Super Sample Rate - Operation Modes
Super Sample Rate - Resource Utilization
Super Sample Rate - Port Utilization and Throughput
Super Sample Rate - Coefficient and Data Distribution
Super Sample Rate - Coefficient and Data Distribution - Resampling Limitations
Super Sample Rate - Sample to Port Mapping
Super Sample Rate - Interpolation Polyphases
Super Sample Rate - Decimation Polyphases
Constraints
Code Example
Configuration Notes
FIR TDM
Entry Point
Device Support
Supported Types
Template Parameters
Access Functions
Ports
Design Notes
Coefficient Array for Filters
Coefficients - Array Size
Coefficients - Array Organization
IO Buffer Interface for Filters
Margin
Internal Margin
Maximizing Throughput
Multiple Frames
Latency
Maximum Window Size
Single Buffer Constraint
Input Data Samples - Array Organization
Streaming Interface for Filters
Cascaded kernels
Cascade - Operation Mode
Cascade - Resource Utilization
Cascade - Port Utilization
Output type
Super Sample Rate
Super Sample Rate - Operation Mode
Super Sample Rate - Resource Utilization
Super Sample Rate - Port Utilization
Super Sample Rate - Sample to Port Mapping
Constraints
Code Example
Function Approximation
Entry Point
Device Support
Supported Types
Template Parameters
Access Functions
Ports
Design Notes
Input Data
Configuring the Lookup Table
Input Domain Modes
Lookup Utility Functions
Code Example
Hadamard Product
Entry Point
Device Support
Supported Types
Template Parameters
Access Functions
Ports
Design Notes
Super Sample Rate Operation
Scaling
Saturation
Constraints
Code Example
Kronecker
Entry Point
Device Support
Supported Types
Template Parameters
Ports
Design Notes
Inputs
Super Sample Rate (SSR)
Scaling
Constraints
Code Example
Matrix Multiply
Entry Point
Device Support
Supported Types
Template Parameters
Access Functions
Ports
Design Notes
Tiling
Tiling Schemes and Data Type Combinations
Tiling Parameters
Maximum matrix dimensions per kernel
Cascaded Kernels
SSR
Constraints
Code Example
Matrix-Vector Multiply
Entry Point
Device Support
Supported Types
Template Parameters
Access Functions
Ports
Design Notes
Maximum matrix dimensions per kernel
Cascaded Kernels
SSR
Constraints
Code Example
Mixed Radix FFT
Entry Point
Device Support
Supported Types
Template Parameters
Access Functions
Ports
Design Notes
Dynamic Point Size
Super Sample Rate Operation
Scaling
Rounding and Saturation
Cascade Feature
API Type
Constraints
Applying Design Constraints
Code Example
Configuration Notes
Configuration for Performance Versus Resource
Outer Tensor
Entry Point
Device Support
Supported Types
Template Parameters
Access Functions
Ports
Design Notes
Super Sample Rate Operation
Scaling
Saturation
Constraints
Code Example
Sample Delay
Entry Point
Device Support
Supported Types
Template Parameters
Access Functions
Ports
Design Notes
Widget API Cast
Entry Point
Device Support
Supported Types
Template Parameters
Access Functions
Ports
Design Notes
Code Example
Widget Real to Complex
Entry Point
Device Support
Supported Types
Template Parameters
Access Functions
Ports
Design Notes
Code Example
Configuration
Running Config Helper
Config Helper Example
Compiling and Simulating
Library Element Unit Test
Compiling Using the Makefile
Running Compilation
Configuring the Test Case
Selecting TARGET
Troubleshooting Compilation
Compilation Arguments
Stack Size Allocation
Invalid Throughput and/or Latency
Power Analysis
Library Element Configuration Parameters
Common Configuration Parameters
Bitonic Sort configuration parameters
Convolution / Correlation configuration parameters
DDS/Mixer Configuration Parameters
DFT Configuration Parameters
FFT Configuration Parameters
FFT Window Configuration Parameters
FIR Configuration Parameters
Function Approximation configuration parameters
Hadamard Product configuration parameters
Kronecker configuration parameters
Matrix Multiply Configuration Parameters
Matrix Vector Multiply Configuration Parameters
Mixed Radix FFT Configuration Parameters
Outer Tensor configuration parameters
Sample Delay Configuration Parameters
Widgets Configuration Parameters
Benchmark/QoR
Latency and Throughput
Bitonic Sort
Convolution / Correlation
DDS/Mixer
DFT
FFT IFFT DIT 1CH
FFT IFFT 2D
FFT Window
Filters
FIR TDM
Function Approximation
Hadamard Product
Kronecker
Matrix Multiply
Matrix Vector Multiply
Mixed Radix FFT
Outer Tensor
Sample Delay
Widgets
API Reference
API Reference Overview
Bitonic Sort
template class xf::dsp::aie::bitonic_sort::bitonic_sort_graph
Overview
Fields
Methods
bitonic_sort_graph
Convolution / Correlation
template class xf::dsp::aie::conv_corr::conv_corr_graph
Overview
Fields
Methods
conv_corr_graph
DDS Mixer
template class xf::dsp::aie::mixer::dds_mixer::dds_mixer_graph
Overview
Fields
Methods
getKernels
dds_mixer_graph
template class xf::dsp::aie::mixer::dds_mixer::dds_mixer_lut_graph
Overview
Fields
Methods
getKernels
dds_mixer_lut_graph
DFT
template class xf::dsp::aie::fft::dft::dft_graph
Overview
Fields
Methods
getKernels
dft_graph
FFT IFFT
template class xf::dsp::aie::fft::dit_1ch::fft_ifft_dit_1ch_graph
Overview
Fields
Methods
fft_ifft_dit_1ch_graph
template class xf::dsp::aie::fft::dit_1ch::fft_ifft_dit_1ch_graph <TT_DATA, TT_TWIDDLE, TP_POINT_SIZE, TP_FFT_NIFFT, TP_SHIFT, TP_CASC_LEN, TP_DYN_PT_SIZE, TP_WINDOW_VSIZE, kWindowAPI, 0, TP_USE_WIDGETS, TP_RND, TP_SAT, TP_TWIDDLE_MODE, TT_OUT_DATA, TP_INDEX, TP_ORIG_PAR_POWER>
Overview
Fields
Methods
getKernels
fft_ifft_dit_1ch_graph
template class xf::dsp::aie::fft::dit_1ch::fft_ifft_dit_1ch_graph <TT_DATA, TT_TWIDDLE, TP_POINT_SIZE, TP_FFT_NIFFT, TP_SHIFT, TP_CASC_LEN, TP_DYN_PT_SIZE, TP_WINDOW_VSIZE, kStreamAPI, 0, TP_USE_WIDGETS, TP_RND, TP_SAT, TP_TWIDDLE_MODE, TT_OUT_DATA, TP_INDEX, TP_ORIG_PAR_POWER>
Overview
Fields
Methods
fft_ifft_dit_1ch_graph
template class xf::dsp::aie::fft::mixed_radix_fft::mixed_radix_fft_graph
Overview
Fields
Methods
mixed_radix_fft_graph
template class xf::dsp::aie::fft::vss_1d::vss_fft_ifft_1d_graph
Overview
Methods
vss_fft_ifft_1d_graph
2D FFT IFFT
template class xf::dsp::aie::fft::two_d::fft_ifft_2d_graph
Overview
Typedefs
Fields
Methods
fft_ifft_2d_graph
FFT Window
FFT Window utils
Overview
Global Functions
xf::dsp::aie::fft::windowfn::getHammingWindow
xf::dsp::aie::fft::windowfn::getHannWindow
xf::dsp::aie::fft::windowfn::getBlackmanWindow
xf::dsp::aie::fft::windowfn::getKaiserWindow
xf::dsp::aie::fft::windowfn::getRotationMatrix
template class xf::dsp::aie::fft::windowfn::fft_window_graph
Overview
Fields
Methods
getKernels
fft_window_graph
FIRs
template class xf::dsp::aie::fir::decimate_asym::fir_decimate_asym_graph
template struct xf::dsp::aie::fir::decimate_asym::fir_decimate_asym_graph::ssr_params
template struct xf::dsp::aie::fir::decimate_asym::fir_decimate_asym_graph::tmp_ssr_params
template class xf::dsp::aie::fir::decimate_hb::fir_decimate_hb_graph
template struct xf::dsp::aie::fir::decimate_hb::fir_decimate_hb_graph::aieml_ssr_params
template struct xf::dsp::aie::fir::decimate_hb::fir_decimate_hb_graph::ct_fir_params
template struct xf::dsp::aie::fir::decimate_hb::fir_decimate_hb_graph::hb_dec_graph_params
template struct xf::dsp::aie::fir::decimate_hb::fir_decimate_hb_graph::sr_asym_graph_params
template struct xf::dsp::aie::fir::decimate_hb::fir_decimate_hb_graph::tmp_ssr_params
template class xf::dsp::aie::fir::decimate_sym::fir_decimate_sym_graph
struct xf::dsp::aie::fir::decimate_sym::fir_decimate_sym_graph::ssr_params
template class xf::dsp::aie::fir::interpolate_asym::fir_interpolate_asym_graph
struct xf::dsp::aie::fir::interpolate_asym::fir_interpolate_asym_graph::ssr_params
template struct xf::dsp::aie::fir::interpolate_asym::fir_interpolate_asym_graph::tmp_ssr_params
template class xf::dsp::aie::fir::interpolate_hb::fir_interpolate_hb_graph
template struct xf::dsp::aie::fir::interpolate_hb::fir_interpolate_hb_graph::aieml_ssr_params
template struct xf::dsp::aie::fir::interpolate_hb::fir_interpolate_hb_graph::ct_fir_params
template struct xf::dsp::aie::fir::interpolate_hb::fir_interpolate_hb_graph::sr_asym_graph_params
template struct xf::dsp::aie::fir::interpolate_hb::fir_interpolate_hb_graph::ssr_params
template class xf::dsp::aie::fir::resampler::fir_resampler_graph
template struct xf::dsp::aie::fir::resampler::fir_resampler_graph::ssr_params
template struct xf::dsp::aie::fir::resampler::fir_resampler_graph::tmp_ssr_params
template class xf::dsp::aie::fir::sr_asym::fir_sr_asym_graph
struct xf::dsp::aie::fir::sr_asym::fir_sr_asym_graph::first_casc_params
template struct xf::dsp::aie::fir::sr_asym::fir_sr_asym_graph::ssr_params
template struct xf::dsp::aie::fir::sr_asym::fir_sr_asym_graph::tmp_ssr_params
template class xf::dsp::aie::fir::sr_sym::fir_sr_sym_graph
struct xf::dsp::aie::fir::sr_sym::fir_sr_sym_graph::ssr_params
template class xf::dsp::aie::fir::tdm::fir_tdm_graph
struct xf::dsp::aie::fir::tdm::fir_tdm_graph::first_casc_params
template struct xf::dsp::aie::fir::tdm::fir_tdm_graph::ssr_params
Function Approximation
Function Approximation utility functions
Overview
Global Functions
xf::dsp::aie::func_approx::getSqrt
xf::dsp::aie::func_approx::getInvSqrt
xf::dsp::aie::func_approx::getLog
xf::dsp::aie::func_approx::getExp
xf::dsp::aie::func_approx::getInv
template class xf::dsp::aie::func_approx::func_approx_graph
Overview
Fields
Methods
getKernels
func_approx_graph
GeMM
template class xf::dsp::aie::blas::matrix_mult::matrix_mult_graph
Overview
Fields
Methods
getKernels
matrix_mult_graph
GeMV
template class xf::dsp::aie::blas::matrix_vector_mul::matrix_vector_mul_graph
Overview
Fields
Methods
getKernels
matrix_vector_mul_graph
Graph utils
class xf::dsp::aie::empty
class xf::dsp::aie::no_port
Hadamard Product
template class xf::dsp::aie::hadamard::hadamard_graph
Overview
Fields
Methods
getKernels
hadamard_graph
Kronecker
template class xf::dsp::aie::kronecker::kronecker_graph
Overview
Fields
Methods
getKernels
kronecker_graph
Outer Tensor
template class xf::dsp::aie::outer_tensor::outer_tensor_graph
Overview
Fields
Methods
getKernels
outer_tensor_graph
Sample Delay
template class xf::dsp::aie::sample_delay::sample_delay_graph
Overview
Fields
Methods
sample_delay_graph
Widgets
template class xf::dsp::aie::widget::api_cast::widget_api_cast_graph
Overview
Fields
Methods
getKernels
widget_api_cast_graph
template class xf::dsp::aie::widget::real2complex::widget_real2complex_graph
Overview
Fields
Methods
getKernels
widget_real2complex_graph
Vitis Graph Library
Introduction
Overview
Requirements
Software Platform
PCIe Accelerator Card
License
Trademark Notice
Release Note
2024.1
2022.2
2022.1
2021.2
Vitis Graph Library Tutorial
Get and Run the Vitis Graph Library
Get the Dependencies
Set Up Environment
Download the Vitis Graph Library
Run a L3 Example
Run a L2 Example
Run a L1 Example
How Vitis Graph Library Works
L3 API
Target Audience
Example Usage
L2 API
Target Audience
Example Usage
L1 API
Target Audience
L1 User Guide
API Document
namespace graph
namespace enums
Design Internals
Similarity Primitives
Internal Design of General Similarity
Interface
Implementation
Profiling and Benchmarks
Internal Design of Sparse Similarity
Interface
Implementation
Profiling and Benchmarks
Internal Design of Dense Similarity
Interface
Implementation
Profiling and Benchmarks
Top K Sort
Overview
Algorithm
Implementation
Profiling
L2 User Guide
API Document
namespace graph
namespace internal
namespace bfs
namespace calc_degree
template union xf::graph::internal::calc_degree::f_cast <ap_uint <32>>
template union xf::graph::internal::calc_degree::f_cast <double>
template union xf::graph::internal::calc_degree::f_cast <ap_uint <64>>
template union xf::graph::internal::calc_degree::f_cast <float>
namespace connected_components
namespace convert_csr_csc
namespace diameter
struct xf::graph::internal::diameter::IdWeight
template union xf::graph::internal::diameter::f_cast <float>
namespace hash_group_aggregate
template struct xf::graph::internal::hash_group_aggregate::COLUMN_DATA
namespace label_propagation
template struct xf::graph::internal::label_propagation::COLUMN_DATA
namespace mis
namespace mssp
namespace mst
namespace pagerank
namespace pagerankMultiChannel
template class xf::graph::internal::pagerankMultiChannel::cache
namespace scc
namespace sssp
namespace nopred
namespace pred
namespace triangle_count
orderStrmInterNum
template struct xf::graph::internal::AggRAM_base
template struct xf::graph::internal::CkKins
struct xf::graph::internal::GetVout
template struct xf::graph::internal::ValAddr
template struct xf::graph::internal::unitCidGain
struct xf::graph::internal::unitCidGain_d
struct xf::graph::internal::unitCidGain_ll
struct xf::graph::internal::unitCidTarKin
struct xf::graph::internal::unitCkKin
struct xf::graph::internal::unitECW
struct xf::graph::internal::unitEW
struct xf::graph::internal::unitKiSelf
struct xf::graph::internal::unitNCkKin
struct xf::graph::internal::unitVC
struct xf::graph::internal::unitVCD
struct xf::graph::internal::unitVCDN
struct xf::graph::internal::unitVCDe
struct xf::graph::internal::unitVCN
struct xf::graph::internal::unitVCNKi
struct xf::graph::internal::unitVD
struct xf::graph::internal::unitVF
union xf::graph::internal::DoubleUnit64
union xf::graph::internal::GetAggout
union xf::graph::internal::GetCout
union xf::graph::internal::GetEout
union xf::graph::internal::StrmBus_L
union xf::graph::internal::StrmBus_M
union xf::graph::internal::StrmBus_S
union xf::graph::internal::StrmBus_XL
template union xf::graph::internal::f_cast <ap_uint <32>>
template union xf::graph::internal::f_cast <unsigned int>
template union xf::graph::internal::f_cast <double>
template union xf::graph::internal::f_cast <ap_uint <64>>
template union xf::graph::internal::f_cast <int>
template union xf::graph::internal::f_cast <float>
template union xf::graph::internal::f_cast <long long>
template union xf::graph::internal::f_cast_ <DT, ap_uint <64>>
template union xf::graph::internal::f_cast_ <DT, ap_uint <32>>
union xf::graph::internal::uint2int32
template class xf::graph::internal::AggRAM
template class xf::graph::internal::AxiMap
template class xf::graph::internal::HashAgg
template class xf::graph::internal::ScanAgg
namespace merge
template class xf::graph::merge::AggRAM
template class xf::graph::merge::HashAgg
template class xf::graph::merge::ScanAgg
template class xf::graph::merge::ShiftUpdate
Design Internals
Internal Design of Breadth-first Search
Overview
Algorithm
Interface
Implementation
Profiling
Internal Design of Single Source Shortest Path
Overview
Algorithm
Interface
Implementation
Profiling
Internal Design of Connected Component
Overview
Algorithm
Interface
Implementation
Profiling and Benchmarks
Internal Design of Strongly Connected Component
Overview
Algorithm
Interface
Implementation
Profiling and Benchmarks
Internal Design of Triangle Counting
Overview
Algorithm
Implementation
Profiling
Benchmark
Internal Design of Label Propagation
Overview
Algorithm
Implementation
Profiling
Benchmark
Internal Design of PageRank
Overview
Implementation
Profiling
Internal Design of PageRankMultiChannels
Overview
Algorithm
Implementation
Profiling
Internal Design of Calculate Degree
Overview
Algorithm
Implementation
Profiling
Internal Design of Convert CSC CSR
Overview
Algorithm
Implementation
Profiling
Internal Design of two hop path count
Overview
Implementation
Interface
Internal Design of Louvain Modularity
Overview
Algorithm
Internal Design of Dense Similarity with Coefficient
Interface
Implementation
Profiling and Benchmarks
Internal Design of Renumber
Overview
Implementation
Interface
Internal Design of Minimum Spanning Tree
Overview
Algorithm
Interface
Implementation
Resource
Internal Design of Estimated Diameter
Overview
Algorithm
Interface
Implementation
Resource
Internal Design of Maximal Independent Set
Overview
Algorithm
Interface
Implementation
Resources
Internal Design of Merge
Overview
Implementation
Algorithm
Conclusion
L3 User Guide
User Guide
Getting Started
Software Requirements
Hardware Requirements
Environment Setup
Build the dynamic library
Run the testcases
Running Examples
Basic Flow
Example
Asynchronous Execution
Example of using multiple requests
Louvain Partition Demo
Linear Louvain Partition Flow
BFS Louvain Partition Flow
Louvain Modularity Launch Demo
Launch u50 Flow
Launch u55c Flow
API Document
namespace xf::graph::L3
louvainModularity
twoHop
pageRankWeight
shortestPath
cosineSimilaritySSSparse
jaccardSimilaritySSSparse
cosineSimilarityAPSparse
jaccardSimilarityAPSparse
cosineSimilaritySSDense
cosineSimilaritySSDenseMultiCardBlocking
cosineSimilaritySSDenseMultiCard
jaccardSimilaritySSDense
cosineSimilarityAPDense
jaccardSimilarityAPDense
knnSimilaritySSSparse
knnSimilarinyAPSparse
knnSimilaritySSDense
knnSimilarityAPDense
knnSimilarityAPSparse
triangleCount
labelPropagation
bfs
wcc
scc
convertCsrCsc
L3 class
class xf::graph::L3::Handle
Overview
Fields
TigerGraph Plugin
Benchmark
Connected Component
Executable Usage
Profiling
Strongly Connected Component
Executable Usage
Profiling
Triangle Counting
Executable Usage
Profiling
Label Propagation
Executable Usage
Profiling
PageRank
Executable Usage
Profiling
PageRank MultiChannels
Executable Usage
Profiling
Single Source Shortest Path
Executable Usage
Profiling
Two hop path count
Executable Usage
Profiling
Louvain Modularity
Executable Usage
Profiling
Renumber
Executable Usage
Profiling
Maximal Independent Set
Executable Usage
Profiling
Benchmark
Merge
Executable Usage
Profiling
Vitis HPC Library
Introduction
Overview
Requirements
Software Platform
PCIe Accelerator Card
License
Trademark Notice
Release Note
2020.1
2021.1
User Guide
Python Environment Setup Guide
L1 Primitives User Guide
Introduction of L1 Primitives
RTM Introduction
Mathematics in RTM
1. Wave equation and the finite difference method
2. Imaging
3. Boundary saving scheme
Design information of L1 primitives
1. Stencil2D
2. RTM2D
Forward streaming module
Backward streaming module
3. Stencil3D
4. RTM3D
Conjugate Gradient Solver Introduction
Conjugate Gradient Algorithm
MLP Introduction
L1 APIs
MLP
CG Solver
Reverse Time Migration
Data Movers
L1 Test
1. Set up Python environment
2. Set up Vitis_hls environment
3. Test L1 primitives
L2 Kernels User Guide
Introduction of L2 Kernels
RTM Kernels
2D-RTM forward kernel
2D-RTM backward kernel
3D-RTM forward kernel
MLP Kernels
CG Kernels
GEMV-based Conjugate Gradient Solver with Jacobi Preconditioner
Introduction
Executable Usage
Environment Setup (Step 1)
Build Kernel (Step 2)
Prepare Data (Step 3)
Randomly-Generated Data (Optional)
Users’ data
Run on FPGA with Example Data (Step 4)
Check Device
Benchmark Random Dataset
Usage
Resource Utilization
Benchmark Results on Alveo U50 FPGA
Power Consumption on FPGA
SPMV-based Conjugate Gradient Solver with Jacobi Preconditioner
Introduction
Benchmark on Hardware
Environment Setup (Step 1)
Hardware Build (Step 2)
Prepare Data (Step 3)
Run on FPGA (Step 4)
Check Device
Benchmark
Usage
Resource Utilization on Alveo U280
Benchmark Results on Alveo U280 FPGA
Convergence
L2 Kernel APIs
L2 Kernels Tests Guide
RTM Kernel Test
Set up Python environment
Set up Vitis environment
Test RTM kernels
Test 2D RTM
Forward kernel
Backward kernel
Test 3D RTM
Forward kernel with HBC/RBC boundary condition
CG Kernel Test
Set up Python environment
Set up Vitis environment
Test CG kernels
GEMV-based CG solver
SPMV-based CG solver
FCN Kernel Test
Test FCN kernels
Benchmark
Performance
Conjugate Gradient Algorithm
GEMV-based CG
SPMV-based CG
Benchmark Test Flow
Vitis Motor Control Library
Introduction
Overview
Requirements
Software Platform
PCIe Accelerator Card
License
Trademark Notice
Release Note
2024.2
2023.2
2023.1
L1 User Guide
API Document
Primitive APIs in ``xf::motorcontrol``
namespace details
enum xf::motorcontrol::details::enc_count_mode
enum xf::motorcontrol::details::enc_pos_mode
template struct xf::motorcontrol::details::gen_sampler_pkg
struct xf::motorcontrol::details::pwmPassedArgs
template struct xf::motorcontrol::details::pwmStrmIO
template class xf::motorcontrol::details::QEI
template class xf::motorcontrol::details::deglitcher
enum xf::motorcontrol::FOC_Mode
enum xf::motorcontrol::MODE_PWM_DC_SRC
enum xf::motorcontrol::MODE_PWM_PHASE_SHIFT
Design Internals
SVPWM_DUTY
Overview
Implementation
Profiling
PWM_GEN
Overview
Implementation
Profiling
Benchmark
SVPWM_DUTY
Executable Usage
Profiling
PWM_GEN
Executable Usage
Profiling
Vitis Quantitative Finance Library
Introduction
Overview
Requirements
Software Platform
PCIe Accelerator Card
License
Trademark Notice
Release Note
2023.2
2023.1
Version 1.0
Version 0.5
User Guide
Pricing Models and Numerical Methods
Black-Scholes Model
Overview
Black-Scholes Model
Itô's lemma and its direct corollary
Corollary: lognormal property of \(S\)
Implementation of B-S model
Heston Model
Overview
Stochastic Process Equations of the Heston Model
Partial Differential Equation (PDE) of Heston Model
Implementations
References
Hull-White Model
Overview
Implementation
Black-Karasinski Model
Overview
Implementation
Cox-Ingersoll-Ross Model
Overview
Implementation
Extended Cox-Ingersoll-Ross Model
Overview
Implementation
Vasicek Model
Overview
Implementation
G2 Model
Overview
Implementation
Monte Carlo Simulation
Overview
Framework
Antithetic paths
Finite Difference Methods
Overview
Implementation
Assumptions/Limitations
Dataflow Description
Precalculate algorithm fixed matrices
Explicit estimation at timestep t
Implicit correction in s
Implicit correction in v
Extract price grid
References
Binomial Tree (Cox-Ross-Rubinstein) Method
Overview
References
Internal Design of Tree Lattice
Overview
Implementation
Heston Model Closed-Form Solution
Overview
References
Merton 76 Closed-Form Solution
Overview
References
Garman-Kohlhagen Closed-Form Solution
Overview
References
Quanto Closed-Form Solution
Overview
References
Hull White Analytic Closed-Form Solution
Overview
Portfolio Optimisation
Caveat
Overview
Global Minimum Variance Portfolio
Efficient Portfolio
Tangency Portfolio
Efficient Portfolio of Risky and Risk Free Assets
Credit Default Swap
Overview
L1 Module User Guide
Random Number Generator
Overview
Uniformly Distributed Random Number Generator
Algorithm
Implementation Details
Normally Distributed Random Number Generator (NRNG)
Inverse cumulative distribution transformation based RNG
Box-Muller transformation based NRNG
Multivariate Normal Distribution RNG
PRNG (xoshiro128)
Overview
Implementation
Singular Value Decomposition (SVD)
Overview
Theory
Jacobi Methods
Implementation
SVD workflow:
Profiling
Tridiagonal Matrix Solver
Overview
Implementation
Pentadiagonal Matrix Solver
Overview
Implementation
Sobol Sequence Generator
Overview
Algorithm
Gray Code Implementation
Sobol Workflow:
Brownian Bridge Transform
Overview
Theory
Generation Algorithm
Implementation
Profiling
Stochastic Process
Overview
Implementation
Ornstein-Uhlenbeck Process
Overview
Implementation
Meshers
Overview
Implementation
Numerical Integration Methods
Overview
Adaptive Trapezoidal Theory
Adaptive Simpson Theory
Romberg Theory
limitations under the License.
Overview
Implementation
Profiling
Covariance Matrix and Regularization
Overview
Algorithm
Covariance Matrix
Covariance Regularization
Implementation
Profiling
Probability Distribution
Overview
Interpolation
Overview
Algorithm & Implementation
Linear interpolation
Cubic interpolation
Bicubic spline interpolation
RNG
RNG
Defined in <xf_fintech/rng.hpp>
XoShiRo128
Defined in <xf_fintech/xoshiro128.hpp>
SobolRsg
Defined in <xf_fintech/sobol_rsg.hpp>
BrownianBridge
Defined in <xf_fintech/brownian_bridge.hpp>
TrinomialTree
Defined in <xf_fintech/trinomial_tree.hpp>
TreeLattice
Defined in <xf_fintech/tree_lattice.hpp>
1DMesher
Defined in <xf_fintech/fdmmesher.hpp>
OrnsteinUhlenbeckProcess
Defined in <xf_fintech/ornstein_uhlenbeck_process.hpp>
StochasticProcess1D
Defined in <xf_fintech/stochastic_process.hpp>
HWModel
Defined in <xf_fintech/hw_model.hpp>
G2Model
Defined in <xf_fintech/g2_model.hpp>
ECIRModel
Defined in <xf_fintech/ecir_model.hpp>
CIRModel
Defined in <xf_fintech/cir_model.hpp>
VModel
Defined in <xf_fintech/v_model.hpp>
HestonModel
Defined in <xf_fintech/heston_model.hpp>
BKModel
Defined in <xf_fintech/bk_model.hpp>
BSModel
Defined in <xf_fintech/bs_model.hpp>
PCA
Defined in <xf_fintech/pca.hpp>
BicubicSplineInterpolation
Defined in <xf_fintech/bicubic_spline_interpolation.hpp>
CubicInterpolation
Defined in <xf_fintech/cubic_interpolation.hpp>
BinomialDistribution
Defined in <xf_fintech/binomial_distribution.hpp>
bernoulliPMF
bernoulliCDF
covCoreMatrix
covCoreStrm
covReHardThreshold
covReSoftThreshold
covReBand
covReTaper
gammaCDF
svd
linearImpl
mcSimulation
normalPDF
normalCDF
normalICDF
logNormalPDF
logNormalCDF
logNormalICDF
pentadiagCr
poissonPMF
poissonCDF
poissonICDF
polyfit
polyval
polyint
polyder
trap_integrate
simp_integrate
romberg_integrate
boxMullerTransform
inverseCumulativeNormalPPND7
inverseCumulativeNormalAcklam
trsvCore
L2 Kernel User Guide
Pricing Engine Overview
Pricing Engine Kernel Design
Internal Design of European Option Pricing Engine
Overview
Implementation
Internal Design of MCEuropeanHestonEngine
Overview
Implementation
Variations
Internal Design of Asian Option Pricing Engine
Overview
MCAsianAPEngine
MCAsianASEngine
MCAsianGPEngine
Profiling
Internal Design of Digital Option Pricing Engines
Overview
Implementation
Internal Design of Barrier Option Pricing Engine
Overview
Implementation
MCBarrierEngine
MCBarrierNoBiasEngine
Internal Design of Cliquet Option Pricing Engine
Overview
Implementation
Internal Design of American Option Pricing Engine
Overview
Theory
Implementation
Calibration Process
Pricing Process
MCAmericanEnginePricing
MCAmericanEngine APIs
Internal Design of MCMultiAssetEuropeanHestonEngine
Overview
Implementation
Variations
Internal Design of MCHullWhiteCapFloorEngine
Overview
Implementation
Internal Design of MCEuropeanHestonGreeksEngine
Overview
Implementation
Internal Design of Closed Form Black-Scholes-Merton
Overview
Design Structure
cfBSMEngine (cf_bsm.hpp)
bsm_kernel (bsm_kernel.cpp)
Theoretical throughput
Resource Utilization
Throughput
Internal Design of Closed Form Black-76
Overview
Design Structure
cfB76Engine (cf_b76.hpp)
b76_kernel (b76_kernel.cpp)
Theoretical throughput
Resource Utilization
Throughput
Internal Design of Closed Form Heston
Overview
Design Structure
The Engine (hcf_engine.hpp)
IO Wrapper (hcf_kernel.cpp)
Resource Utilization
Throughput
Internal Design of Closed Form Merton 76
Overview
Design Structure
The Engine (M76Engine.hpp)
IO Wrapper (m76_kernel.cpp)
Resource Utilization
Throughput
Internal Design of Garman Kohlhagen
Overview
Design Structure
The Engine
IO Wrapper (gk_kernel.cpp)
Resource Utilization
Throughput
Internal Design of Quanto
Overview
Design Structure
The Engine
IO Wrapper (quanto_kernel.cpp)
Resource Utilization
Throughput
Internal Design of Cox-Ross-Rubinstein Binomial Tree
Overview
Design Structure
bt_engine (bt_engine.hpp)
binomialtreekernel (binomialtreekernel.cpp)
Resource Utilization
Throughput
Internal Design of Finite-Difference Hull-White Bermudan Swaption Pricing Engine
Overview
Implementation
Mesher
Differential operator
Evolution scheme
Profiling
Internal Design of Finite-Difference G2 Bermudan Swaption Pricing Engine
Overview
Implementation
Mesher
Differential operator
Profiling
Internal Design of Tree Bermudan Swaption Engine
Overview
Implementation
Profiling
Internal Design of CPI CapFloor Engine
Overview
Implementation
Profiling
Internal Design of Inflation CapFloor Engine
Overview
Implementation
Profiling
Internal Design of Zero Coupon Bond Engine
Overview
Implementation
Profiling
Internal Design of HJM Engine
Overview
Design Structure
PCA HJM Kernel
MC HJM Kernel
Pricer Algorithms
Internal Design of LMM Engine
Overview
Design Structure
Calibration of the Model
Correlation calibration
Volatility Calibration
Pricing Algorithms
Cap Pricing
Ratchet Floater Pricing
Ratchet Cap Pricing
Internal Architecture
Internal Design of HWA Engine
Implementation
Zero Coupon Bond Price
Equity Option Pricing
Cap/Floor
Implementation
Kernel
Host
Internal Design of CDS Engine
Implementation
Host
Kernel
Internal Design of Black Scholes Local Volatility Solver
Overview
Mathematical Background
Design Details
Test Methodology
Pricing Engine Kernel APIs
CPICapFloorEngine
Defined in <xf_fintech/cpi_capfloor_engine.hpp>
DiscountingBondEngine
Defined in <xf_fintech/discounting_bond_engine.hpp>
InflationCapFloorEngine
Defined in <xf_fintech/inflation_capfloor_engine.hpp>
FdHullWhiteEngine
Defined in <xf_fintech/fd_hullwhite_engine.hpp>
FdG2SwaptionEngine
Defined in <xf_fintech/fd_g2_swaption_engine.hpp>
binomialTreeEngine
cfB76Engine
cfBSMEngine
FdBsLvSolver
FdDouglas
hcfEngine
hjmPcaEngine
hjmMcEngine
hjmEngine
lmmEngine
M76Engine
MCEuropeanEngine
MCEuropeanPriBypassEngine
MCEuropeanHestonEngine
MCMultiAssetEuropeanHestonEngine
MCAmericanEnginePreSamples
MCAmericanEngineCalibrate
MCAmericanEnginePricing
MCAmericanEngine
MCAsianGeometricAPEngine
MCAsianArithmeticAPEngine
MCAsianArithmeticASEngine
MCBarrierNoBiasEngine
MCBarrierEngine
MCCliquetEngine
MCDigitalEngine
MCEuropeanHestonGreeksEngine
MCHullWhiteCapFloorEngine
McmcCore
treeSwaptionEngine
treeSwapEngine
treeCapFloorEngine
treeCallableEngine
Other Engine Kernel Design
Internal Design of Markov Chain Monte Carlo
Overview
The Engine (pop_mcmc.h)
Resource Utilization
Benchmark
Application Scenario
Performance
Test Overview
Vitis Quantitative_Finance Library
Vitis Security Library
Tutorial
Crypto Algorithm Hardware Acceleration
How Vitis Security Library Works
L1 API
Target Audience and Major Features
Input / output interface
Command to Run L1 cases
Introduction
Overview
Requirements
Software Platform
PCIe Accelerator Card
License
Trademark Notice
Release Note
Known Issue
2022.2
2021.2
2021.1
2020.2
2019.2
User Guide
Design Internals
Adler32
Overview
Implementation on FPGA
AES Encryption Algorithms
Original Implementation
Optimized Implementation on FPGA
AES-128 Encryption Performance (Device: U250)
AES-192 Encryption Performance (Device: U250)
AES-256 Encryption Performance (Device: U250)
BLAKE2 Algorithms
Overview
Implementation on FPGA
Performance
BLAKE2B
CBC Mode
Overview
Implementation on FPGA
Profiling
CBC-DES encryption
CBC-DES decryption
CBC-AES128 encryption
CBC-AES128 decryption
CBC-AES192 encryption
CBC-AES192 decryption
CBC-AES256 encryption
CBC-AES256 decryption
CCM Mode
Overview
Implementation on FPGA
Profiling
CCM-AES128 encryption
CCM-AES128 decryption
CCM-AES192 encryption
CCM-AES192 decryption
CCM-AES256 encryption
CCM-AES256 decryption
CFB Mode
Overview
Implementation on FPGA
Profiling
CFB1-DES encryption
CFB1-DES decryption
CFB1-AES128 encryption
CFB1-AES128 decryption
CFB1-AES192 encryption
CFB1-AES192 decryption
CFB1-AES256 encryption
CFB1-AES256 decryption
CFB8-DES encryption
CFB8-DES decryption
CFB8-AES128 encryption
CFB8-AES128 decryption
CFB8-AES192 encryption
CFB8-AES192 decryption
CFB8-AES256 encryption
CFB8-AES256 decryption
CFB128-DES encryption
CFB128-DES decryption
CFB128-AES128 encryption
CFB128-AES128 decryption
CFB128-AES192 encryption
CFB128-AES192 decryption
CFB128-AES256 encryption
CFB128-AES256 decryption
Chacha20 Algorithms
Implementation
Performance (Device: U250)
CRC32
Overview
Implementation on FPGA
CTR Mode
Overview
Implementation on FPGA
Profiling
CTR-AES128 encryption
CTR-AES128 decryption
CTR-AES192 encryption
CTR-AES192 decryption
CTR-AES256 encryption
CTR-AES256 decryption
DES and 3DES Algorithms
Algorithm Flow
Optimized Implementation on FPGA
Performance (Device: VU9P)
DES encryption
DES decryption
3DES encryption
3DES decryption
Digital Signature Algorithm
Implementation
Optimized Implementation on FPGA
Reference
ECB Mode
Overview
Implementation on FPGA
Profiling
ECB-DES encryption
ECB-DES decryption
ECB-AES128 encryption
ECB-AES128 decryption
ECB-AES192 encryption
ECB-AES192 decryption
ECB-AES256 encryption
ECB-AES256 decryption
Elliptic-curve Cryptography
Elliptic-curve Cryptography
Elliptic Curve Digital Signature Algorithm
Edwards-curve Digital Signature Algorithm
Reference
ECDSA nistp256
Overview
Implementation on FPGA
Signing (point multiplication kG)
Verification (point multiplication aG+bP)
ECDSA secp256k1
Overview
Implementation on FPGA
Signing (point multiplication kG)
Verification (point multiplication aG+bP)
ECDSA nistp384
Overview
Implementation on FPGA
Signing (point multiplication kG)
Verification (point multiplication aG+bP)
GCM Mode
Overview
Implementation on FPGA
Profiling
GCM-AES128 encryption
GCM-AES128 decryption
GCM-AES192 encryption
GCM-AES192 decryption
GCM-AES256 encryption
GCM-AES256 decryption
GMAC
Overview
Implementation on FPGA
Profiling
GMAC-AES128
GMAC-AES192
GMAC-AES256
HMAC Algorithms
Configuration
Implementation
Performance (Device: U250)
AES Decryption Algorithms
Original Implementation
Optimized Implementation on FPGA
AES-128 Decryption Performance (Device: U250)
AES-192 Decryption Performance (Device: U250)
AES-256 Decryption Performance (Device: U250)
Keccak-256 Algorithms
Overview
Implementation on FPGA
The MD4/MD5 Message-Digest Algorithms
Overview
Implementation on FPGA
Performance
MD4
MD5
OFB Mode
Overview
Implementation on FPGA
Profiling
OFB-DES encryption
OFB-DES decryption
OFB-AES128 encryption
OFB-AES128 decryption
OFB-AES192 encryption
OFB-AES192 decryption
OFB-AES256 encryption
OFB-AES256 decryption
Poly1305 Algorithm
Overview
Implementation on FPGA
Performance
RC4 Algorithms
Implementation
Performance (Device: U250)
RSA Cryptography
Implementation
Optimized Implementation on FPGA
Reference
SHA-1 Algorithm
Overview
Implementation on FPGA
Performance
SHA-2 Algorithms
Overview
Implementation on FPGA
Performance
SHA-224 and SHA-256
SHA-384, SHA-512, SHA-512/224, and SHA-512/256
Clustering
SHA-3 Algorithms
Overview
Implementation on FPGA
Performance
SHA3-224
SHA3-256
SHA3-384
SHA3-512
SHAKE-128
SHAKE-256
Clustering
SM2
SM2
SM3
SM4
Verifiable Delay Function
Overview
Implementation on FPGA
XTS Mode
Overview
Implementation on FPGA
Profiling
XTS-AES128 encryption
XTS-AES128 decryption
XTS-AES256 encryption
XTS-AES256 decryption
Poseidon Hash Algorithm
Overview
Implementation on FPGA
API Functions of ``xf::security``
adler32
adler32 overload (1)
adler32 overload (2)
adler32 overload (3)
blake2b
blake2b overload (1)
desCbcEncrypt
desCbcDecrypt
aes128CbcEncrypt
aes128CbcDecrypt
aes192CbcEncrypt
aes192CbcDecrypt
aes256CbcEncrypt
aes256CbcDecrypt
aes128CcmEncrypt
aes128CcmDecrypt
aes192CcmEncrypt
aes192CcmDecrypt
aes256CcmEncrypt
aes256CcmDecrypt
desCfb1Encrypt
desCfb1Decrypt
aes128Cfb1Encrypt
aes128Cfb1Decrypt
aes192Cfb1Encrypt
aes192Cfb1Decrypt
aes256Cfb1Encrypt
aes256Cfb1Decrypt
desCfb8Encrypt
desCfb8Decrypt
aes128Cfb8Encrypt
aes128Cfb8Decrypt
aes192Cfb8Encrypt
aes192Cfb8Decrypt
aes256Cfb8Encrypt
aes256Cfb8Decrypt
desCfb128Encrypt
desCfb128Decrypt
aes128Cfb128Encrypt
aes128Cfb128Decrypt
aes192Cfb128Encrypt
aes192Cfb128Decrypt
aes256Cfb128Encrypt
aes256Cfb128Decrypt
chacha20
xchacha20
crc32
crc32 overload (1)
crc32 overload (2)
crc32 overload (3)
crc32c
crc32c overload (1)
crc32c overload (2)
aes128CtrEncrypt
aes128CtrDecrypt
aes192CtrEncrypt
aes192CtrDecrypt
aes256CtrEncrypt
aes256CtrDecrypt
desEncrypt
desDecrypt
des3Encrypt
des3Decrypt
desEcbEncrypt
desEcbDecrypt
aes128EcbEncrypt
aes128EcbDecrypt
aes192EcbEncrypt
aes192EcbDecrypt
aes256EcbEncrypt
aes256EcbDecrypt
nistp256Sign
nistp256Verify
aes128GcmEncrypt
aes128GcmDecrypt
aes192GcmEncrypt
aes192GcmDecrypt
aes256GcmEncrypt
aes256GcmDecrypt
aes128Gmac
aes192Gmac
aes256Gmac
hmac
keccak_256
md4
md5
desOfbEncrypt
desOfbDecrypt
aes128OfbEncrypt
aes128OfbDecrypt
aes192OfbEncrypt
aes192OfbDecrypt
aes256OfbEncrypt
aes256OfbDecrypt
poly1305
poly1305MultiChan
rc4
sha1
sha224
sha256
sha3_224
sha3_256
The loadCol CU reads the input dense column vector and the NNZ column pointer entries from two physically separate DDR device memories, DDR0 and DDR1, as shown in the preceding figure, and sends them to the bufTransColVec and bufTransNnzCol CUs, which buffer and select the entries for the computation path connected to each HBM channel.
The bufTransColVec CU reads the input dense vector entries that belong to each block, splits them into chunks for each HBM channel, buffers all those chunks (16 in total in this design), and transmits the data to the corresponding xBarCol CU.
The bufTransNnzCol CU reads the column pointer entries that belong to each block, splits them into chunks for each HBM channel, buffers all those chunks (16 in total in this design), and transmits the data to the corresponding xBarCol CU.
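The chunking step performed by bufTransColVec and bufTransNnzCol can be illustrated with a small host-side C++ model. This is only a sketch: the helper name splitIntoChunks is hypothetical, and the contiguous, near-even split is an assumption made for illustration; the text above states only that each block is split into one chunk per HBM channel (16 in this design).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the library): split one block of entries
// into `numChannels` contiguous chunks whose sizes differ by at most one.
// The contiguous, near-even split is an assumption made for illustration.
template <typename T>
std::vector<std::vector<T>> splitIntoChunks(const std::vector<T>& block,
                                            std::size_t numChannels = 16) {
    assert(numChannels > 0);
    std::vector<std::vector<T>> chunks(numChannels);
    const std::size_t base = block.size() / numChannels;
    const std::size_t extra = block.size() % numChannels; // first `extra` chunks get one more entry
    std::size_t pos = 0;
    for (std::size_t ch = 0; ch < numChannels; ++ch) {
        const std::size_t len = base + (ch < extra ? 1 : 0);
        chunks[ch].assign(block.begin() + pos, block.begin() + pos + len);
        pos += len;
    }
    return chunks;
}
```

In this model, chunk i would be the data forwarded to the xBarCol CU attached to HBM channel i.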
The xBarCol CUs, one for each HBM channel, select the input dense vector entries according to the NNZs’ column pointer entries and send the results to the cscRow CUs for computation.
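The selection performed by an xBarCol CU is essentially a gather. The following C++ function is a minimal software sketch of that step, not the hardware implementation; the function name selectColEntries, the float data type, and the 32-bit index type are assumptions chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical software model of the selection step: for every NNZ, pick the
// dense-vector entry addressed by that NNZ's column pointer entry.
std::vector<float> selectColEntries(const std::vector<float>& colVec,
                                    const std::vector<std::uint32_t>& nnzColPtr) {
    std::vector<float> selected;
    selected.reserve(nnzColPtr.size());
    for (std::uint32_t col : nnzColPtr) {
        assert(col < colVec.size()); // the pointer must address a buffered entry
        selected.push_back(colVec[col]);
    }
    return selected;
}
```

The output holds one column entry per NNZ, in NNZ order, which is the form a cscRow CU needs for its multiplications.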
Each cscRow CU reads the values and row indices of NNZs from one HBM channel, multiplies the values by their corresponding column entries received from the connected xBarCol CU, and accumulates the results along the row indices.
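The multiply-and-accumulate behavior of a cscRow CU can be modeled in a few lines of C++. This is a functional sketch only: the function name accumulateRows, the float data type, and the flat output vector are assumptions for illustration and say nothing about how the hardware pipelines or buffers the accumulation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical software model of one cscRow computation path:
// y[rowIdx[i]] += nnzVal[i] * colEntry[i] for every NNZ i.
std::vector<float> accumulateRows(const std::vector<float>& nnzVal,
                                  const std::vector<std::uint32_t>& rowIdx,
                                  const std::vector<float>& colEntry,
                                  std::size_t numRows) {
    assert(nnzVal.size() == rowIdx.size());
    assert(nnzVal.size() == colEntry.size());
    std::vector<float> y(numRows, 0.0f);
    for (std::size_t i = 0; i < nnzVal.size(); ++i) {
        assert(rowIdx[i] < numRows);
        y[rowIdx[i]] += nnzVal[i] * colEntry[i]; // accumulate along the row index
    }
    return y;
}
```

Here colEntry stands for the per-NNZ column entries delivered by the connected xBarCol CU, and y is the partial result vector for the rows handled by this path.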
Each readWriteHbm CU connects to eight HBM channels. It reads the NNZs’ values and row indices from those channels and sends them to the corresponding cscRow CUs. It also collects the results from the eight cscRow CUs and writes them back to the corresponding HBM channels. In total, two readWriteHbm CUs are used to connect to 16 HBM channels.
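The channel arithmetic above (two readWriteHbm CUs, eight HBM channels each, 16 channels in total) can be written down as a small C++ sketch. The names and the assignment of consecutive channels to the same CU are assumptions made for illustration; the actual channel-to-CU connectivity is defined by the design's build configuration.

```cpp
#include <cassert>

// Counts taken from the text: 16 HBM channels, 8 channels per readWriteHbm CU.
constexpr unsigned kHbmChannels = 16;
constexpr unsigned kChannelsPerCu = 8;
constexpr unsigned kNumReadWriteCus = kHbmChannels / kChannelsPerCu;
static_assert(kNumReadWriteCus == 2, "two readWriteHbm CUs cover 16 channels");

// Hypothetical mapping: which readWriteHbm CU, and which of its eight ports,
// serves a given HBM channel. Assumes consecutive channels share a CU.
struct ChannelSlot {
    unsigned cu;   // index of the readWriteHbm CU (0 or 1)
    unsigned port; // port within that CU (0..7), paired with one cscRow CU
};

inline ChannelSlot slotForChannel(unsigned channel) {
    assert(channel < kHbmChannels);
    return ChannelSlot{channel / kChannelsPerCu, channel % kChannelsPerCu};
}
```

Under this assumed mapping, channels 0 to 7 are served by the first readWriteHbm CU and channels 8 to 15 by the second.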