Design Notes - 2023.2 English
Vitis Libraries
Release Date: 2023-12-20
Version: 2023.2 English
Vitis Libraries
Release Information
Vitis Data Analytics Library
Known Issues
Vitis DSP Library
Vitis Motor Control Library
Vitis Quantitative Finance Library
Vitis Solver Library
Vitis Ultrasound Library
L1
L2
L3
Vitis Utility Library
Vitis Vision Library
New Features and Functions
PL additions/enhancements
AIE additions/enhancements
Known Issues
Developer Guide
Compilation and Execution
Set Up the Environment
HLS Cases Command Line Flow
Run a L1 Example
Vitis Cases Command Line Flow
Run a L2 Example
Run a L3 Example
Executing Vitis Library in the Vitis IDE
Download the Library Template in the IDE
Create the L2 Application Library Project in the IDE
Create the L3 Application Library Project in the IDE
Libraries
Vitis BLAS Library
Introduction
Overview
Requirements
Software Platform
PCIE Accelerator Card
License
Trademark Notice
Release Note
2020.1
2021.1
User Guide
L1 Primitives User Guide
L1 API Overview
Introduction
1. Introduction
2. L1 primitives’ usage
3. Matrix storage used in L1 primitives
L1 Compute APIs
amax
amin
asum
axpy
copy
dot
gbmv
gemv
nrm2
scal
swap
symv
trmv
L1 Data mover
1. Matrix storage format
2. Data mover APIs
sbmSuper2Stream
sbmSub2Stream
gbm2Stream
vec2GbMatStream
tbmSuper2Stream
tbmSub2Stream
vec2TbUpMatStream
vec2TbLoMatStream
gem2Stream
vec2GemStream
symUp2Stream
symLo2Stream
spmUp2Stream
spmLo2Stream
vec2SymStream
trmUp2Stream
trmLo2Stream
tpmUp2Stream
tpmLo2Stream
vec2TrmUpStream
vec2TrmLoStream
readVec2Stream
writeStream2Vec
L1 Test
Python Environment Setup Guide
L2 Kernels User Guide
L2 API Overview
Introduction
1. Introduction
2. L2 kernel usage
Blas Function Kernel
GEMM Kernel
Architecture
Systolic Array
Matrix Block Partition
Data Movers
Transpose
Double Buffers
L2 API benchmark
L2 GEMM benchmark
1. gemm_4CU
1.1 Executable Usage
1.1.1 Work Directory (Step 1)
1.1.2 Build kernel (Step 2)
1.1.3 Run kernel (Step 3)
1.1.4 Example output (Step 4)
1.2 Profiling
L2 GEMV benchmark
1. gemvStreamCh16
1.1 Executable Usage
1.1.1 Work Directory (Step 1)
1.1.2 Build kernel (Step 2)
1.1.3 Run kernel (Step 3)
1.1.4 Example output (Step 4)
1.2 Profiling for u280
1.3 Profiling for u50
L3 API User Guide
L3 API Overview
Introduction
1. Introduction
1.1 Data layout
1.2 Memory Allocation
Restricted memory version
Default memory version
Pre-allocated memory version
1.3 Supported Datatypes
2. Using the Vitis BLAS API
2.1 General description
2.1.1 Error status
2.1.2 Vitis BLAS initialization
2.2 Datatypes Reference
2.2.1 xfblasStatus_t
2.2.2 xfblasEngine_t
2.2.3 xfblasOperation_t
2.3 Vitis BLAS Helper Function Reference
2.3.1 xfblasCreate
2.3.2 xfblasFree
2.3.3 xfblasDestroy
2.3.4 xfblasMalloc
2.3.5 xfblasSetVector
2.3.6 xfblasGetVector
2.3.7 xfblasSetMatrix
2.3.8 xfblasGetMatrix
2.3.9 xfblasMallocRestricted
2.3.10 xfblasSetVectorRestricted
2.3.11 xfblasGetVectorRestricted
2.3.12 xfblasSetMatrixRestricted
2.3.13 xfblasGetMatrixRestricted
2.3.14 xfblasMallocManaged
2.3.15 xfblasExecute
2.3.16 xfblasExecuteAsync
2.3.17 xfblasGetByPointer
2.3.18 xfblasGetByAddress
2.4 Vitis BLAS Function Reference
2.4.1 xfblasGemm
3. Obtain FPGA bitstream
L3 API example
L3 API benchmark
L3 API GEMM benchmark
1. memKernel
1.1 Executable Usage
1.1.1 Work Directory (Step 1)
1.1.2 Build kernel (Step 2)
1.1.3 Run kernel (Step 3)
1.1.4 Example output (Step 4)
1.1.5 Use script to run benchmark
1.2 Profiling
2. streamingKernel
2.1 Executable Usage
2.1.1 Work Directory (Step 1)
2.1.2 Build kernel (Step 2)
2.1.3 Run kernel (Step 3)
2.1.4 Example output (Step 4)
2.1.5 Use script to run benchmark
2.2 Profiling
L3 API test
L3 Python bindings
1. Introduction
1.1 Set Python Environment
1.2 Build shared library
2. Using the Vitis BLAS L3 Python API
2.1 General description
2.1.1 Vitis BLAS initialization
2.2 Vitis BLAS Helper Function Reference
2.3 Using Python APIs
Python Environment Setup Guide
Benchmark
1. Performance
1.1 gemv
1.2 gemm
2. Benchmark Test Overview
2.1 Prerequisites
2.1.1 Vitis BLAS Library
2.2 Building
2.2.1 Download code
2.2.2 Setup environment
Vitis Codec Library
Introduction
Overview
Requirements
Software Platform
PCIE Accelerator Card
License
Trademark Notice
Release Note
2022.1
Vitis Codec Library Tutorial
Vitis Codec and Hardware Acceleration
Lab-1: How Vitis Codec Library Works
Get the Vitis Codec Library
Get the Dependencies
Setup Environment
Download the Vitis Graph Library
Command to Run L1 cases
Command to Run L2 cases
Lab-2: Using L1-level API to evaluate JPEG decoding acceleration
Lab purpose
Operation steps
(1) Learn about run_hls.tcl file
(2) CSIM:
(3) Synthesis:
(4) COSIM:
(5) Design with export
Lab summary
Lab-3: Using L2-level API to implement a single-kernel acceleration for JPEG decoding
Lab purpose
Operation steps
(1) Understand the Work Directory
(2) Build kernel for different modes
(3) Run kernel in Software-Emulation mode
(4) Run kernel in Hardware-Emulation mode
(5) Run kernel in Hardware
Lab summary
Lab-4: Using multi-kernel solution to accelerate WebP encoding based on open-source project
Lab purpose
Operation steps
(1) Open source project analysis and kernel partition
(2) Project files for multi-kernel design
(3) Software Emulation
(4) Hardware Emulation
(5) Hardware Build and Check Resource Consumption
(6) Hardware Running
Lab summary
Tutorial Summary
L1 User Guide
API Document
namespace codec
namespace details
mcu_decoder
hls_next_mcupos2
namespace internal
enum xf::codec::internal::Type
struct xf::codec::internal::HybridUint
enum xf::codec::COLOR_FORMAT
struct xf::codec::decOutput
struct xf::codec::hls_compInfo
struct xf::codec::hls_huff_DHT
struct xf::codec::hls_huff_segment
struct xf::codec::img_info
struct xf::codec::sos_data
Design Internals
kernelParserDecoderTop
JPEG Huffman Decoder
Executable Usage
Profiling
Internal Design of Order Tokenize
Overview
Implementation
Profiling
L2 User Guide
API Document
namespace codec
namespace details
struct xf::codec::details::hls_huff_DHT
struct xf::codec::details::hls_huff_segment
struct xf::codec::details::sos_data
template class xf::codec::details::BicubicInterpolator
enum xf::codec::COLOR_FORMAT
struct xf::codec::bas_info
struct xf::codec::cmp_info
struct xf::codec::img_info
Design Internals
JPEG Decoder
Overview
Algorithm
Implementation
Profiling
PIK Encoder
Internal Designs
Executable Usage
Profiling
Result
Lepton Encoder
Internal Designs
Software and system requirements
Building the accelerated Lepton encoder
Running the accelerated Lepton encoder
Performance
WebP Encoder
Implementation
Performance
Software and system requirements
Building the accelerated WebP encoder
Running the accelerated WebP encoder
Resize Down
Overview
Implementation
Interface
JXL Encoder
Overview
Executable Usage
Profiling
Result
Benchmark
JPEG Decoder
Executable Usage
Profiling
PIK Encoder
Executable Usage
Profiling
Result
Resize
Executable Usage
Profiling
WebP Encoder
Executable Usage
Profiling
JXL Encoder
Executable Usage
Profiling
Result
Vitis Database Library
Introduction
Overview
Generic Query Engine
License
Trademark Notice
Release Note
2022.2
2022.1
2021.2
2021.1
2020.2
2020.1
2019.2
Internal Release
Requirements
FPGA Accelerator Card
Software Platform
Development Tools
Dependency
Design Flows
Shell Environment
HLS Cases Command Line Flow
Vitis Cases Command Line Flow
Vitis Database Library Tutorial
Relational Database and Hardware Acceleration
How Vitis Database Library Works
L3 API – General Query Engine
Target Audience and Major Features
Example Usage
L2 API – GQE kernels
Target Audience and Major Features
Command to Run L2 cases
L1 API
Target Audience and Major Features
Command to Run L1 cases
User Guide
L1 Module User Guide
Primitive Overview
1. Stream-based Interface
2. Implementing Scan
3. Implementing Hash
4. Implementing Filter
5. Implementing Evaluation
6. Implementing Bloom Filter
7. Implementing Join
7-1. Join Implementation Summary
7-2. Hash-Join
7-3. Hash-Semi-Join
7-4. Hash-Anti-Join
7-5. Hash-Multi-Join
7-6. Merge-Join and Merge-Left-Join
7-7. Nested-Loop-Join
8. Implementing Group-by Aggregation
8-1. Sorted Rows Group-Aggregate
8-2. On-Chip Group-Aggregate
8-3. Off-chip Group-Aggregate
9. Implementing Hash Partition
10. Implementing Sort
10-1. Sort Implementation Summary
10-2. Bitonic-Sort
10-3. Insert-Sort
10-4. Merge-Sort
11. Glue Logic
11-1. Combine and Split Columns
Primitive APIs in xf::database
aggregate
aggregate overload (1)
aggregate overload (2)
aggregate overload (3)
bitonicSort
bfGen
bfGenStream
bfCheck
combineCol
combineCol overload (1)
combineCol overload (2)
combineCol overload (3)
combineCol overload (4)
splitCol
splitCol overload (1)
splitCol overload (2)
splitCol overload (3)
splitCol overload (4)
compoundSort
directGroupAggregate
directGroupAggregate overload (1)
directGroupAggregate overload (2)
duplicateCol
dynamicEval
dynamicEvalV2
dynamicFilter
dynamicFilter overload (1)
dynamicFilter overload (2)
dynamicFilter overload (3)
dynamicFilter overload (4)
groupAggregate
groupAggregate overload (1)
groupAggregate overload (2)
groupAggregate overload (3)
groupAggregate overload (4)
hashAntiJoin
hashGroupAggregate
hashJoinMPU
hashJoinMPU overload (1)
hashJoinMPU overload (2)
hashJoinV3
hashBuildProbeV3
hashJoinV4
hashBuildProbeV4
hashLookup3
hashLookup3 overload (1)
hashLookup3 overload (2)
hashLookup3 overload (3)
hashMultiJoin
hashMultiJoinBuildProbe
hashMurmur3
hashMurmur3Hive
hashPartition
hashSemiJoin
insertSort
insertSort overload (1)
insertSort overload (2)
mergeJoin
mergeLeftJoin
mergeSort
mergeSort overload (1)
mergeSort overload (2)
nestedLoopJoin
scanCmpStrCol
scanCol
scanCol overload (1)
scanCol overload (2)
scanCol overload (3)
scanCol overload (4)
scanCol overload (5)
scanCol overload (6)
scanCol overload (7)
scanCol overload (8)
scanCol overload (9)
scanCol overload (10)
scanCol overload (11)
scanCol overload (12)
scanCol overload (13)
staticEval
staticEval overload (1)
staticEval overload (2)
staticEval overload (3)
staticEval overload (4)
Primitive Design Internals
Internals of Dynamic-Filter
Internal Structure
Limitations
Generating Config Bits
Internals of Dynamic-Evaluation
Internals of Lookup3 and Murmur3 Hash
Murmur3 and Lookup3 Hash Introduction
Acceleration of Murmur3 and Lookup3 Hash
Internals of Bloom-Filter
Internals of Group-Aggregate (Using Sorted Rows)
Internals of Direct-Group-Aggregate
Internals of Hash-Group-Aggregate (Generic Version)
Internals of Hash-Join (Multi-Process-Unit Version)
Internals of Hash-Join-v3 and Hash-Build-Probe-v3
Internals of Hash-Join-v4 and Hash-Build-Probe-v4
Internals of Hash-Semi-Join (Multi-Process-Unit Version)
Principle
Structure
Internals of Hash-Anti-Join
Internals of Hash-Multi-Join
Internals of Hash-Partition
Internals of Merge-Join and Merge-Left-Join
User guide
Structure
Internals of Nested-Loop-Join
User guide
Structure
Internals of Combine-Split-Unit
Internals of Bitonic Sort
Internals of Insert Sort
Principle
Synthesis Results
Implementation Results
Internals of Merge Sort
Principle
Synthesis Results
Implementation Results
Internals of Scan
Query-Specific Acceleration Demo
TPC-H Query 5 Simplified
TPC-H Query 5
TPC-H Query 6 Modified
L2 GQE Kernel User Guide
GQE Kernel Design
3-in-1 Kernel
Meta Information
Unified Kernel Command
Join Flow
Bloom-Filter Flow
64-bit Partition Flow with Bloom Filter Build/Probe
Aggregate Kernel
GQE Kernel APIs
gqeKernel
gqeAggr
gqePart
GQE Kernel Configuration APIs
class xf::database::gqe::KernelCommand
Overview
Methods
KernelCommand
setBypassOn
setJoinOn
setJoinType
setJoinAppendMode
setBloomfilterOn
setBloomfilterSize
setPartOn
setLogPart
setAggrOn
setDualKeyOn
setJoinBuildProbe
setBloomfilterBuildProbe
setScanColEnable
setWriteColEnable
setRowIDValidEnable
setFilter
getConfigBits
class xf::database::gqe::AggrCommand
Overview
Methods
AggrCommand
Scan
setEvaluation
setEvaluation overload (1)
setEvaluation overload (2)
setFilter
setFilter overload (1)
setShuffle0
setShuffle1
setShuffle2
setShuffle3
setGroupAggr
setGroupAggrs
setMerge
columnMerge
setDirectAggrs
setWriteCol
getConfigBits
getConfigOutBits
L3 GQE Overlay User Guide
GQE L3 Design
Overview
Joiner Design
Workshop Design
Bloom-Filter Design
Class specifications
Example Usage
Group-By Aggregate Design
GQE L3 APIs
class xf::database::gqe::Table
Overview
Methods
Table
addCol
addCol overload (1)
addCol overload (2)
addCol overload (3)
addCol overload (4)
genRowIDWithValidation
genRowIDWithValidation overload (1)
genRowIDWithValidation overload (2)
genRowIDWithValidation overload (3)
setRowNum
getRowNum
getSecRowNum
getColNum
getSecNum
checkSecNum
getColTypeSize
getColPointer
getValColPointer
getValColPointer overload (1)
getColPointer
setColNames
getColNames
getRowIDColName
getValidColName
getRowIDEnableFlag
getValidEnableFlag
~Table
info
class xf::database::gqe::Joiner
Overview
Inherited Members
Methods
Joiner
run
class xf::database::gqe::BloomFilter
Overview
Methods
BloomFilter
build
merge
getHashTable
getBloomFilterSize
class xf::database::gqe::Filter
Overview
Inherited Members
Methods
Filter
~Filter
run
class xf::database::gqe::Aggregator
Overview
Methods
Aggregator
aggregate
class xf::database::gqe::TableSection
class xf::database::gqe::Workshop
Benchmark Result
Benchmark
Compound Sort
Executable Usage
Profiling
Hash Anti-join
Dataset
Executable Usage
Profiling
Hash Group Aggregate
Dataset
Executable Usage
Profiling
Hash Join V2
Dataset
Executable Usage
Profiling
Hash Join V3
Dataset
Executable Usage
Profiling
Hash Join V4
Dataset
Executable Usage
Profiling
Hash Multi-Join
Dataset
Executable Usage
Profiling
Hash Semi-Join
Dataset
Executable Usage
Profiling
TPC-H Queries with GQE
Vitis Data Analytics Library
Introduction
Overview
License
Trademark Notice
Release Note
2023.2
2023.1
2022.2
2022.1
2021.2
2021.1
2020.2
2020.1
Requirements
FPGA Accelerator Card
Software Platform
Development Tools
Design Flows
Vitis Data Analytics Library Tutorial
Data Analytics and Hardware Acceleration
How Vitis Data Analytics Library Works
L3 API – CSV Scanner Engine
Target Audience and Major Features
Command to Run L3 cases
L2 API – CSV Scanner kernels
Target Audience and Major Features
Command to Run L2 cases
L1 API
Target Audience and Major Features
Command to Run L1 cases
L1 User Guide
Hardware Classes
template class xf::data_analytics::classification::logisticRegressionPredict
Overview
Methods
pickFromK
pick
predict
setWeight
setIntercept
template class xf::data_analytics::regression::linearLeastSquareRegressionPredict
Overview
Methods
setWeight
setIntercept
predict
template class xf::data_analytics::regression::LASSORegressionPredict
Overview
Methods
setWeight
setIntercept
predict
template class xf::data_analytics::regression::ridgeRegressionPredict
Overview
Methods
setWeight
setIntercept
predict
template class xf::data_analytics::common::SGDFramework
Overview
Methods
seedInitialization
setTrainingConfigs
setTrainingDataParams
initGradientParams
calcGradient
updateParams
train
Hardware Functions in xf::data_analytics::classification
decisionTreePredict
axiVarColToStreams
naiveBayesTrain
naiveBayesPredict
svmPredict
Hardware Functions in xf::data_analytics::clustering
kMeansPredict
Hardware Functions in xf::data_analytics::regression
decisionTreePredict
Hardware Functions in xf::data_analytics::text
Hardware Functions in xf::data_analytics::dataframe
csvParser
csvParser overload (1)
csvParser overload (2)
jsonParser
readFromDataFrame
writeToDataFrame
Software C Functions
xf_re_compile
Design Internals
CSV Parser
Features
Overall Structure
JSON Parser
Features
Limitations
Overall Structure
Decision Tree (Predict)
Overview
Algorithm (predict)
Implementation (predict)
Profiling
K-Means (Predict)
Linear Regression (Predict)
Linear Least Square Regression
LASSO Regression
Ridge Regression
Implementation (inference)
Logistic Regression (Predict)
Logistic Regression Classifier
Implementation (inference)
Multinomial Naive Bayes
Overview
Implementation
Resource Utilization
Benchmark Result on Board
Internals of svm_predict
Regular Expression Virtual Machine (regex-VM)
Overview
User Guide
Regex-VM Coverage
Regex-VM Usage
Implementation
Profiling
WriteToDataFrame
Data Frame format (on DDR)
Input Data Stream
Overall Structure
ReadFromDataFrame
Input Data
Output Data Stream
Overall Structure
StringCompare
Overview
Implementation
string EQUAL
string IN
string LIKE
Performance and Resource
string IN
string LIKE
L2 User Guide
Kernel Templates in xf::data_analytics::clustering
kMeansTrain
Kernel Templates in xf::data_analytics::regression
linearLeastSquareRegressionSGDTrain
ridgeRegressionSGDTrain
LASSORegressionSGDTrain
Kernel Templates in xf::data_analytics::text
reEngine
Kernel Templates in xf::data_analytics::dataframe
csv_scanner
Kernel Templates in xf::data_analytics::geospatial
knn
strtreeTop
Design Internals
Decision Tree (training)
Overview
Basic algorithm
Implementation
Resource Utilization
Internals of kMeansTrain
Training Resources (Device: U250)
Training Performance (Device: U250)
Random Forest (training)
Overview
Basic algorithm
Implementation
Resource Utilization
Stochastic Gradient Descent Framework
Linear Least Square Regression Training
LASSO Regression Training
Ridge Regression Training
Implementation (Training)
Internals of svm_train
Overview
Basic algorithm
Implementation
Config description
Resource Utilization
Benchmark Result on Board
Regular Expression Engine (reEngine)
Overview
User Guide
reEngine Usage
Implementation
Profiling
GeoIP Engine
Overview
Implementation
Input requirements
Kernel Design
Resource Utilization
Two Gram Predicate
Overview
Implementation
Resource Utilization
STRTree Engine
Overview
Algorithm
Implementation
blockSort
mergeTreeSort
Resource Utilization
GeoSpatial K-nearest Neighbors
Overview
Kernel Implementation
End2End Performance
Resource Utilization
L3 User Guide
Software Acceleration Classes
enum xf::data_analytics::text::re::ErrCode
Overview
Detailed Documentation
Enum Values
class xf::data_analytics::text::re::RegexEngine
Overview
Methods
RegexEngine
~RegexEngine
compile
getCpgpNm
match
RegexEngine
class sssd_engine::DataEngineConfig
Overview
Methods
DataEngineConfig
genConfigBits
class sssd_engine::data_engine_sc::DataEngine
Overview
Fields
Methods
DataEngine
~DataEngine
pushRequest
release
class sssd_engine::SmartSSDCache
Overview
Methods
getCardNum
SmartSSDCache
~SmartSSDCache
addFile
scanFile
listFiles
print_input
print_output
release
Regular Expression Acceleration
Getting Started
Limitation
Example Usage
CSV Scanner
Getting Started
Limitation
Example Usage
Benchmark Result
Naive Bayes
Dataset
Executable Usage
Profiling
Support Vector Machine
Dataset
Executable Usage
Profiling
Log Analyzer
Dataset
Executable Usage
Profiling
Duplicate Record Match
Dataset
Executable Usage
Profiling
Vitis Data Compression Library
Introduction
Overview
Requirements
Software Platform
PCIE Accelerator Card
License
Trademark Notice
Release Note
2022.2
2022.1
2021.2
2021.1
2020.2
User Guide
Typical Use Cases
L1 Module User Guide
Primitive Overview
Stream-based Interface
Primitive APIs in xf::compression
blockPacker
huffmanDecoderLL
huffmanDecoder
huffmanEncoderStream
lz4Compress
lz4Decompress
lzDecompress
lzMultiByteDecompress
lzBestMatchFilter
lzBestMatchFilter overload (1)
lzBooster
lzBooster overload (1)
lzBooster overload (2)
lzFilter
snappyCompress
snappyDecompress
zstdCompressCore
zstdCompressStreaming
zstdCompressQuadCore
zstdCompressMultiCoreStreaming
zstdDecompressStream
zstdDecompressCore
L2 Kernel User Guide
L2 Kernel Demos
Kernel APIs Reference
Global Functions
xilAdler32
xilChecksum32
xilCrc32
xilGzipCompressFixedStreaming
xilGzipCompBlock
xilGzipComp
xilGzipCompressStreaming
xilLz4Compress
xilLz4CompressStream
xilLz4Decompress
xilLz4DecompressStream
xilLz4Packer
xilSnappyCompress
xilSnappyCompressStream
xilSnappyDecompress
xilSnappyDecompressStream
xilSnappyDecompressStream overload (1)
xilSnappyDecompressStream overload (2)
xilZlibCompressFull
xilHuffmanKernel
xilLz77Compress
xilTreegenKernel
xilZstdCompress
xilZstdCompress overload (1)
xilZstdCompress overload (2)
xilZstdDecompressStream
Demos
List of Demos
Xilinx GZip Compression and Decompression
Executable Usage
Results
Resource Utilization
Compression
Decompression
Performance Data
Standard GZip Support
Xilinx LZ4 Compression and Decompression
Results
Resource Utilization
Performance Data
Software & Hardware
Executable Usage
Xilinx LZ4-Streaming Compression and Decompression
Results
Resource Utilization
Performance Data
Executable Usage
Xilinx Snappy Compression and Decompression
Results
Resource Utilization
Performance Data
Software & Hardware
Usage
Build Steps
Emulation flows
Hardware
Executable Usage
Xilinx Snappy-Streaming Compression and Decompression
Results
Resource Utilization
Performance Data
Executable Usage
Tests
List of Tests
Xilinx LZ4 Streaming Compression
Executable Usage
Resource Utilization
Performance Data
Xilinx Snappy Compression
Executable Usage
Resource Utilization
Performance Data
Xilinx GZip Compression
Executable Usage
Results
Resource Utilization
Performance Data
Standard GZip Support
Xilinx GZip Streaming Compression
Executable Usage
Results
Resource Utilization
Performance Data
Standard GZip Support
Xilinx GZip Streaming 16KB Compression
Executable Usage
Results
Resource Utilization
Performance Data
Standard GZip Support
Xilinx GZip Streaming 8KB Compression
Executable Usage
Results
Resource Utilization
Performance Data
Standard GZip Support
Xilinx GZip Static Streaming Compression
Executable Usage
Results
Resource Utilization
Performance Data
Standard GZip Support
Xilinx Zlib Streaming Compression
Executable Usage
Results
Resource Utilization
Performance Data
Standard GZip Support
Xilinx Zlib Streaming 16KB Compression
Executable Usage
Results
Resource Utilization
Performance Data
Standard GZip Support
Xilinx Zlib Streaming 8KB Compression
Executable Usage
Results
Resource Utilization
Performance Data
Standard GZip Support
Xilinx Zlib Streaming Static Compression
Executable Usage
Results
Resource Utilization
Performance Data
Standard GZip Support
Xilinx ZSTD Compression
Results
Overall Resource Utilization
Performance Data
Xilinx LZ4 Streaming Decompression
Executable Usage
Resource Utilization
Performance Data
Xilinx Snappy Streaming Decompression
Executable Usage
Resource Utilization
Performance Data
Xilinx ZSTD Decompression
Results
Overall Resource Utilization
Performance Data
L3 Overlay User Guide
L3 Overlay APIs
Kernel Design
LZ Data Compression
Overview
Compression Kernel Design
Decompression Kernel Design
Implemented Algorithms
Overlay API Reference
Demos
List of Demos
LZ4 Application
Executable Usage
GZip Application
Overview
Executable Usage
Benchmark Results
Datasets
Compression Performance
De-Compression Performance
Test Overview
Vitis Data Compression Library
Compression Tutorial
Data compression and Hardware acceleration
Why acceleration is required and how it helps
How Xilinx Data Compression Library Works
L3 API
Executable Usage
L2 API
Commands to Run L2 and L3 cases
L1 API
Command to Run L1 cases
Vitis Data Mover Library
Introduction
Overview
License
Trademark Notice
Release Note
2023.1
Known issue
Requirements
Software Platform
Development Tools
Design Flows
Shell Environment
HLS Cases Command Line Flow
Data-Mover User Guide
Static Data-Mover User Guide
Table of Contents
Programmable 4D Data-Mover User Guide
Table of Contents
Vitis DSP Library
Introduction
Overview
Requirements
Software Platform
PCIE Accelerator Card
License
Trademark Notice
Release Note
2023.2
2023.1
2022.2
2022.1
2021.2
2021.1
2020.2
2020.1
L1 PL DSP Library User Guide
1-Dimensional (Line) SSR FFT L1 FPGA Module
Overview
Multi-Instance Support
Data Type Support for Synthesis
Fixed Point
Floating Point
Managing Bit Growth in SSR FFT Stages
SSR_FFT_GROW_TO_MAX_WIDTH
SSR_FFT_SCALE
SSR_FFT_NO_SCALING
Configurations for Fixed Point Implementation (Recommended Flow)
Start With Floating Point Model
Fixed Point Modeling and Implementation
Fixed Point Model
Selecting Bit Widths for Inputs
Twiddle Factor or Sine/Cosine Lookup Table Quantization
Choosing the Best Scaling Mode
SSR_FFT_NO_SCALING
SSR_FFT_GROW_TO_MAX_WIDTH
SSR_FFT_SCALE
1-D SSR FFT Library Usage
Fixed Point 1-D SSR FFT Usage
Floating Point 1-D SSR FFT Usage
1-D SSR FFT Input Stream Reading and Writing Considerations
1-D SSR FFT Usage in Dataflow Region Streaming
Streaming Connection
1-D SSR FFT Tests
Launching an Individual Test
2-Dimensional (Matrix) SSR FFT L1 FPGA Module
Overview
Block Level Interface
2-D SSR FFT Architecture
Supported Data Types
L1 API for 2-D SSR FFT
Template Parameters
Constraints on the Choice of Template Parameters
Library Usage
Fixed Point 2-D SSR FFT L1 Module Usage
Floating Point 2-D SSR FFT L1 Module Usage
2-D SSR FFT Examples
2-D Fixed Point Example
Top Level Definition and main Function
Compiling and Building Example HLS Project
2-D Floating Point Example
Top Level Definition and main Function
Compiling and Building Example HLS Project
2-D SSR FFT Tests
Launching an Individual Test
L2 AIE DSP Library User Guide
Introduction
Navigating Content by Design Process
Organization
Using Library Elements within User Defined Graphs
Known Issues
Vitis Tutorials
DSP Library Functions
DDS / Mixer
DDS Mixer
Entry Point
Supported Types
Template Parameters
Access functions
Ports
Design Notes
Scaling
Super Sample Rate Operation
Super Sample Rate Sample to Port Mapping
Implementation Notes
Code Example including constraints
DDS Mixer using lookup tables
Entry Point
Supported Types
Template Parameters
Access functions
Ports
Design Notes
Scaling
SFDR
Super Sample Rate Operation
Implementation Notes
Code Example including constraints
FFT/iFFT
Entry Point
Supported Types
Template Parameters
Access functions
Ports
Design Notes
Dynamic Point Size
Super Sample Rate Operation
Super Sample Rate Sample to Port Mapping
Scaling
Saturation
Rounding and Saturation
Cascade Feature
Constraints
Use of single_buffer
Code Example including constraints
Configuration Notes
Configuration for performance vs resource
Scenarios
Parameter Legality Notes
DFT
Entry Point
Supported Types
Template Parameters
Access functions
Ports
Design Notes
Scaling
Batch Processing
IO Buffer size
Cascaded Kernels
Point Size and Padding
Maximum Point Size
Zero padding
Code Example
Mixed Radix FFT
Entry Point
Supported Types
Template Parameters
Access functions
Ports
Design Notes
Dynamic Point Size
Super Sample Rate Operation
Scaling
Rounding and Saturation
Cascade Feature
API type
Constraints
Applying Design Constraints
Code Example
Configuration Notes
Configuration for performance vs resource
FFT Window
Entry Point
Supported Types
Template Parameters
Access functions
Ports
Design Notes
Dynamic Point Size
Super Sample Rate Operation
Super Sample Rate Sample to Port Mapping
Scaling
Saturation
Constraints
Code Example
Filters
Entry Point
Supported Types
Template Parameters
Access functions
Ports
Design Notes
Coefficient array for Filters
Static coefficients
Static Coefficients - array size
Reloadable coefficients
Reloadable Coefficients - array size for non-SSR cases
Reloadable Coefficients - array dimensions for SSR cases
Window interface for Filters
Multiple Buffer Ports
Stream Input for Asymmetric FIRs
Maximum Window size
Single buffer constraint
Streaming interface for Filters
Stream Output
Stream Input for Asymmetric FIRs
Stream Input for Symmetric FIRs
Setting FIR Frame Size
Setting FIR Length
Maximum FIR Length
Maximum Window based FIRs Length
Maximum Stream based FIRs Length
Minimum Cascade Length
Optimum Cascade Length
Super Sample Rate
Super Sample Rate - Operation Modes
Super Sample Rate - Resource Utilization
Super Sample Rate - Port Utilization & Throughput
Super Sample Rate - Coefficient & Data distribution
Super Sample Rate - Coefficient & Data distribution - Resampling Limitations
Super Sample Rate - Sample to Port Mapping
Super Sample Rate - Interpolation polyphases
Super Sample Rate - Decimation polyphases
Constraints
Code Example including constraints
Configuration Notes
Matrix Multiply
Entry Point
Supported Types
Template Parameters
Access functions
Ports
Design Notes
Tiling
Tiling Schemes and Data Type Combinations
Tiling Parameters
Cascaded kernels
Constraints
Code Example including constraints
Matrix-Vector Multiply
Entry Point
Supported Types
Template Parameters
Access functions
Ports
Design Notes
Cascaded Kernels
Constraints
Code Example
Sample Delay
Entry Point
Supported Types
Template Parameters
Access functions
Ports
Design Notes
Widget API Cast
Entry Point
Supported Types
Template Parameters
Access functions
Ports
Design Notes
Code Example including constraints
Widget Real to Complex
Entry Point
Supported Types
Template Parameters
Access functions
Ports
Design Notes
Code Example including constraints
Compiling and Simulating
Library Element Unit Test
Compiling using Makefile
Running compilation
Configuring testcase
Selecting TARGET
Troubleshooting Compilation
Compilation arguments
Stack size allocation
Invalid throughput and/or latency
Library Element Configuration Parameters
DDS/Mixer Configuration Parameters
DFT configuration parameters
FFT configuration parameters
FFT Window configuration parameters
FIR configuration parameters
Matrix Multiply Configuration Parameters
Matrix Vector Multiply Configuration Parameters
Mixed Radix FFT configuration parameters
Sample Delay Configuration Parameters
Widgets Configuration Parameters
Benchmark/QoR
Latency and Throughput
DDS/Mixer
DFT
FFT IFFT
FFT Window
Filters
Matrix Multiply
Matrix Vector Multiply
Mixed Radix FFT
Sample Delay
Widgets
API Reference
API Reference Overview
DDS Mixer
template class xf::dsp::aie::mixer::dds_mixer::dds_mixer_graph
Overview
Fields
Methods
getKernels
dds_mixer_graph
template class xf::dsp::aie::mixer::dds_mixer::dds_mixer_lut_graph
Overview
Fields
Methods
getKernels
dds_mixer_lut_graph
DFT
template class xf::dsp::aie::fft::dft::dft_graph
Overview
Fields
Methods
getKernels
dft_graph
Graph utils
class xf::dsp::aie::empty
class xf::dsp::aie::no_port
FFT IFFT
template class xf::dsp::aie::fft::dit_1ch::fft_ifft_dit_1ch_graph
Overview
Fields
Methods
fft_ifft_dit_1ch_graph
template class xf::dsp::aie::fft::dit_1ch::fft_ifft_dit_1ch_graph <TT_DATA, TT_TWIDDLE, TP_POINT_SIZE, TP_FFT_NIFFT, TP_SHIFT, TP_CASC_LEN, TP_DYN_PT_SIZE, TP_WINDOW_VSIZE, kStreamAPI, 0, TP_USE_WIDGETS, TP_RND, TP_SAT, TP_INDEX, TP_ORIG_PAR_POWER>
Overview
Fields
Methods
fft_ifft_dit_1ch_graph
template class xf::dsp::aie::fft::dit_1ch::fft_ifft_dit_1ch_graph <TT_DATA, TT_TWIDDLE, TP_POINT_SIZE, TP_FFT_NIFFT, TP_SHIFT, TP_CASC_LEN, TP_DYN_PT_SIZE, TP_WINDOW_VSIZE, kWindowAPI, 0, TP_USE_WIDGETS, TP_RND, TP_SAT, TP_INDEX, TP_ORIG_PAR_POWER>
Overview
Fields
Methods
getKernels
fft_ifft_dit_1ch_graph
template class xf::dsp::aie::fft::mixed_radix_fft::mixed_radix_fft_graph
Overview
Fields
Methods
mixed_radix_fft_graph
FFT Window
FFT Window utils
Overview
Global Functions
xf::dsp::aie::fft::windowfn::getHammingWindow
xf::dsp::aie::fft::windowfn::getHannWindow
xf::dsp::aie::fft::windowfn::getBlackmanWindow
xf::dsp::aie::fft::windowfn::getKaiserWindow
template class xf::dsp::aie::fft::windowfn::fft_window_graph
Overview
Fields
Methods
getKernels
fft_window_graph
FIRs
template class xf::dsp::aie::fir::decimate_asym::fir_decimate_asym_graph
template struct xf::dsp::aie::fir::decimate_asym::fir_decimate_asym_graph::ssr_params
template struct xf::dsp::aie::fir::decimate_asym::fir_decimate_asym_graph::tmp_ssr_params
template class xf::dsp::aie::fir::decimate_hb::fir_decimate_hb_graph
template struct xf::dsp::aie::fir::decimate_hb::fir_decimate_hb_graph::aieml_ssr_params
template struct xf::dsp::aie::fir::decimate_hb::fir_decimate_hb_graph::ct_fir_params
template struct xf::dsp::aie::fir::decimate_hb::fir_decimate_hb_graph::hb_dec_graph_params
template struct xf::dsp::aie::fir::decimate_hb::fir_decimate_hb_graph::sr_asym_graph_params
template struct xf::dsp::aie::fir::decimate_hb::fir_decimate_hb_graph::tmp_ssr_params
template class xf::dsp::aie::fir::decimate_sym::fir_decimate_sym_graph
struct xf::dsp::aie::fir::decimate_sym::fir_decimate_sym_graph::ssr_params
template class xf::dsp::aie::fir::interpolate_asym::fir_interpolate_asym_graph
struct xf::dsp::aie::fir::interpolate_asym::fir_interpolate_asym_graph::ssr_params
template struct xf::dsp::aie::fir::interpolate_asym::fir_interpolate_asym_graph::tmp_ssr_params
template class xf::dsp::aie::fir::interpolate_hb::fir_interpolate_hb_graph
template struct xf::dsp::aie::fir::interpolate_hb::fir_interpolate_hb_graph::aieml_ssr_params
template struct xf::dsp::aie::fir::interpolate_hb::fir_interpolate_hb_graph::ct_fir_params
template struct xf::dsp::aie::fir::interpolate_hb::fir_interpolate_hb_graph::sr_asym_graph_params
template struct xf::dsp::aie::fir::interpolate_hb::fir_interpolate_hb_graph::ssr_params
template class xf::dsp::aie::fir::resampler::fir_resampler_graph
template struct xf::dsp::aie::fir::resampler::fir_resampler_graph::ssr_params
template struct xf::dsp::aie::fir::resampler::fir_resampler_graph::tmp_ssr_params
template class xf::dsp::aie::fir::sr_asym::fir_sr_asym_graph
struct xf::dsp::aie::fir::sr_asym::fir_sr_asym_graph::first_casc_params
template struct xf::dsp::aie::fir::sr_asym::fir_sr_asym_graph::ssr_params
template struct xf::dsp::aie::fir::sr_asym::fir_sr_asym_graph::tmp_ssr_params
template class xf::dsp::aie::fir::sr_sym::fir_sr_sym_graph
struct xf::dsp::aie::fir::sr_sym::fir_sr_sym_graph::ssr_params
GeMM
template class xf::dsp::aie::blas::matrix_mult::ConditionalWidget
Overview
template class xf::dsp::aie::blas::matrix_mult::matrix_mult_graph
struct xf::dsp::aie::blas::matrix_mult::matrix_mult_graph::no_kernel
GeMV
template class xf::dsp::aie::blas::matrix_vector_mul::matrix_vector_mul_graph
Overview
Fields
Methods
getKernels
matrix_vector_mul_graph
Sample Delay
template class xf::dsp::aie::sample_delay::sample_delay_graph
Overview
Fields
Methods
sample_delay_graph
sample_delay_graph overload (1)
Widgets
template class xf::dsp::aie::widget::api_cast::widget_api_cast_graph
Overview
Fields
Methods
getKernels
widget_api_cast_graph
template class xf::dsp::aie::widget::real2complex::widget_real2complex_graph
Overview
Fields
Methods
getKernels
widget_real2complex_graph
Vitis Graph Library
Introduction
Overview
Requirements
Software Platform
PCIE Accelerator Card
License
Trademark Notice
Release Note
2022.2
2022.1
2021.2
Vitis Graph Library Tutorial
Get and Run the Vitis Graph Library
Get the Dependencies
Set Up the Environment
Download the Vitis Graph Library
Run a L3 Example
Run a L2 Example
Run a L1 Example
How Vitis Graph Library Works
L3 API
Target Audience
Example Usage
L2 API
Target Audience
Example Usage
L1 API
Target Audience
L1 User Guide
API Document
namespace graph
namespace enums
Design Internals
Similarity Primitives
Internal Design of General Similarity
Interface
Implementation
Profiling and Benchmarks
Internal Design of Sparse Similarity
Interface
Implementation
Profiling and Benchmarks
Internal Design of Dense Similarity
Interface
Implementation
Profiling and Benchmarks
Top K Sort
Overview
Algorithm
Implementation
Profiling
L2 User Guide
API Document
namespace graph
namespace internal
namespace bfs
namespace calc_degree
template union xf::graph::internal::calc_degree::f_cast
template union xf::graph::internal::calc_degree::f_cast <double>
template union xf::graph::internal::calc_degree::f_cast <ap_uint <64>>
template union xf::graph::internal::calc_degree::f_cast <ap_uint <32>>
template union xf::graph::internal::calc_degree::f_cast <float>
namespace connected_components
namespace convert_csr_csc
namespace diameter
struct xf::graph::internal::diameter::IdWeight
template union xf::graph::internal::diameter::f_cast
template union xf::graph::internal::diameter::f_cast <float>
namespace hash_group_aggregate
template struct xf::graph::internal::hash_group_aggregate::COLUMN_DATA
namespace label_propagation
template struct xf::graph::internal::label_propagation::COLUMN_DATA
namespace mis
namespace mssp
namespace mst
namespace pagerank
namespace pagerankMultiChannel
template class xf::graph::internal::pagerankMultiChannel::cache
namespace scc
namespace sssp
namespace nopred
namespace pred
namespace triangle_count
orderStrmInterNum
template struct xf::graph::internal::AggRAM_base
template struct xf::graph::internal::CkKins
struct xf::graph::internal::GetVout
template struct xf::graph::internal::ValAddr
template struct xf::graph::internal::unitCidGain
struct xf::graph::internal::unitCidGain_d
struct xf::graph::internal::unitCidGain_ll
struct xf::graph::internal::unitCidTarKin
struct xf::graph::internal::unitCkKin
struct xf::graph::internal::unitECW
struct xf::graph::internal::unitEW
struct xf::graph::internal::unitKiSelf
struct xf::graph::internal::unitNCkKin
struct xf::graph::internal::unitVC
struct xf::graph::internal::unitVCD
struct xf::graph::internal::unitVCDN
struct xf::graph::internal::unitVCDe
struct xf::graph::internal::unitVCN
struct xf::graph::internal::unitVCNKi
struct xf::graph::internal::unitVD
struct xf::graph::internal::unitVF
union xf::graph::internal::DoubleUnit64
union xf::graph::internal::GetAggout
union xf::graph::internal::GetCout
union xf::graph::internal::GetEout
union xf::graph::internal::StrmBus_L
union xf::graph::internal::StrmBus_M
union xf::graph::internal::StrmBus_S
union xf::graph::internal::StrmBus_XL
template union xf::graph::internal::f_cast
template union xf::graph::internal::f_cast <unsigned int>
template union xf::graph::internal::f_cast <float>
template union xf::graph::internal::f_cast <int>
template union xf::graph::internal::f_cast <ap_uint <64>>
template union xf::graph::internal::f_cast <long long>
template union xf::graph::internal::f_cast <ap_uint <32>>
template union xf::graph::internal::f_cast <double>
template union xf::graph::internal::f_cast_
template union xf::graph::internal::f_cast_ <DT, ap_uint <64>>
template union xf::graph::internal::f_cast_ <DT, ap_uint <32>>
union xf::graph::internal::uint2int32
template class xf::graph::internal::AggRAM
template class xf::graph::internal::AxiMap
template class xf::graph::internal::HashAgg
template class xf::graph::internal::ScanAgg
namespace merge
template class xf::graph::merge::AggRAM
template class xf::graph::merge::HashAgg
template class xf::graph::merge::ScanAgg
template class xf::graph::merge::ShiftUpdate
Design Internals
Internal Design of Breadth-first Search
Overview
Algorithm
Interface
Implementation
Profiling
Internal Design of Single Source Shortest Path
Overview
Algorithm
Interface
Implementation
Profiling
Internal Design of Connected Component
Overview
Algorithm
Interface
Implementation
Profiling and Benchmarks
Internal Design of Strongly Connected Component
Overview
Algorithm
Interface
Implementation
Profiling and Benchmarks
Internal Design of Triangle Counting
Overview
Algorithm
Implementation
Profiling
Benchmark
Internal Design of Label Propagation
Overview
Algorithm
Implementation
Profiling
Benchmark
Internal Design of PageRank
Overview
Algorithm
Implementation
Profiling
Internal Design of PageRankMultiChannels
Overview
Algorithm
Implementation
Profiling
Internal Design of Calculate Degree
Overview
Algorithm
Implementation
Profiling
Internal Design of Convert CSC CSR
Overview
Algorithm
Implementation
Profiling
Internal Design of Two Hop Path Count
Overview
Implementation
Interface
Internal Design of Louvain Modularity
Overview
Algorithm
Internal Design of Dense Similarity with Coefficient
Interface
Implementation
Profiling and Benchmarks
Internal Design of Renumber
Overview
Implementation
Interface
Internal Design of Minimum Spanning Tree
Overview
Algorithm
Interface
Implementation
Resource
Internal Design of Estimated Diameter
Overview
Algorithm
Interface
Implementation
Resource
Internal Design of Maximal Independent Set
Overview
Algorithm
Interface
Implementation
Resources
Internal Design of Merge
Overview
Implementation
Algorithm
Conclusion
L3 User Guide
User Guide
Getting Started
Software Requirements
Hardware Requirements
Environment Setup
Build the dynamic library
Run the testcases
Running Examples
Basic Flow
Example
Asynchronous Execution
Example of using multiple requests
Louvain Partition Demo
Linear Louvain Partition Flow
BFS Louvain Partition Flow
Louvain Modularity Launch Demo
Launch u50 Flow
Launch u55c Flow
API Document
namespace xf::graph::L3
louvainModularity
twoHop
pageRankWeight
shortestPath
cosineSimilaritySSSparse
jaccardSimilaritySSSparse
cosineSimilarityAPSparse
jaccardSimilarityAPSparse
cosineSimilaritySSDense
cosineSimilaritySSDenseMultiCardBlocking
cosineSimilaritySSDenseMultiCard
jaccardSimilaritySSDense
cosineSimilarityAPDense
jaccardSimilarityAPDense
knnSimilaritySSSparse
knnSimilarinyAPSparse
knnSimilaritySSDense
knnSimilarityAPDense
knnSimilarityAPSparse
triangleCount
labelPropagation
bfs
wcc
scc
convertCsrCsc
L3 class
class xf::graph::L3::Handle
Overview
Fields
TigerGraph Plugin
Benchmark
Connected Component
Executable Usage
Profiling
Strongly Connected Component
Executable Usage
Profiling
Triangle Counting
Executable Usage
Profiling
Label Propagation
Executable Usage
Profiling
PageRank
Executable Usage
Profiling
PageRank MultiChannels
Executable Usage
Profiling
Single Source Shortest Path
Executable Usage
Profiling
Two Hop Path Count
Executable Usage
Profiling
Louvain Modularity
Executable Usage
Profiling
Renumber
Executable Usage
Profiling
Maximal Independent Set
Executable Usage
Profiling
Benchmark
Merge
Executable Usage
Profiling
Vitis HPC Library
Introduction
Overview
Requirements
Software Platform
PCIE Accelerator Card
License
Trademark Notice
Release Note
2020.1
2021.1
User Guide
Python Environment Setup Guide
L1 Primitives User Guide
Introduction of L1 Primitives
RTM Introduction
Mathematics in RTM
1. Wave equation and the finite difference method
2. Imaging
3. Boundary saving scheme
Design information of L1 primitives
1. Stencil2D
2. RTM2D
Forward streaming module
Backward streaming module
3. Stencil3D
4. RTM3D
Conjugate Gradient Solver Introduction
Conjugate Gradient Algorithm
MLP Introduction
L1 APIs
MLP
CG Solver
Reverse Time Migration
Data Movers
L1 Test
1. Set up Python environment
2. Set up Vitis_hls environment
3. Test L1 primitives
L2 Kernels User Guide
Introduction of L2 Kernels
RTM Kernels
2D-RTM forward kernel
2D-RTM backward kernel
3D-RTM forward kernel
MLP Kernels
CG Kernels
GEMV-based Conjugate Gradient Solver with Jacobi Preconditioner
Introduction
Executable Usage
Environment Setup (Step 1)
Build Kernel (Step 2)
Prepare Data (Step 3)
Randomly-Generated Data (Optional)
Users’ data
Run on FPGA with Example Data (Step 4)
Check Device
Benchmark Random Dataset
Usage
Resource Utilization
Benchmark Results on Alveo U50 FPGA
Power Consumption on FPGA
SPMV-based Conjugate Gradient Solver with Jacobi Preconditioner
Introduction
Benchmark on Hardware
Environment Setup (Step 1)
Hardware Build (Step 2)
Prepare Data (Step 3)
Run on FPGA (Step 4)
Check Device
Benchmark
Usage
Resource Utilization on Alveo U280
Benchmark Results on Alveo U280 FPGA
Convergence
L2 Kernel APIs
L2 Kernels Tests Guide
RTM Kernel Test
Set up Python environment
Set up Vitis environment
Test RTM kernels
Test 2D RTM
Forward kernel
Backward kernel
Test 3D RTM
Forward kernel with HBC/RBC boundary condition
CG Kernel Test
Set up Python environment
Set up Vitis environment
Test CG kernels
GEMV-based CG solver
SPMV-based CG solver
FCN Kernel Test
Test FCN kernels
Benchmark
Performance
Conjugate Gradient Algorithm
GEMV-based CG
SPMV-based CG
Benchmark Test Flow
Vitis Motor Control Library
Introduction
Overview
Requirements
Software Platform
PCIE Accelerator Card
License
Trademark Notice
Release Note
2023.2
2023.1
Vitis Motor Control Library Tutorial
L1 Benchmark
Sensor_based_FOC
Executable Usage
Profiling
QEI
Executable Usage
Profiling
SVPWM_DUTY
Executable Usage
Profiling
PWM_GEN
Executable Usage
Profiling
L1 User Guide
API Document
Primitive APIs in ``xf::motorcontrol``
namespace details
struct xf::motorcontrol::details::QEI_EdgeInfo
template struct xf::motorcontrol::details::gen_sampler_pkg
struct xf::motorcontrol::details::pwmPassedArgs
template struct xf::motorcontrol::details::pwmStrmIO
enum xf::motorcontrol::FOC_Mode
enum xf::motorcontrol::MODE_PWM_DC_SRC
enum xf::motorcontrol::MODE_PWM_PHASE_SHIFT
Design Internals
Sensor_Based_FOC
Overview
Implementation
Profiling
SVPWM_DUTY
Overview
Implementation
Profiling
PWM_GEN
Overview
Implementation
Profiling
QEI
Overview
Implementation
Profiling
Vitis Quantitative Finance Library
Introduction
Overview
Requirements
Software Platform
PCIE Accelerator Card
License
Trademark Notice
Release Note
2023.2
2023.1
Version 1.0
Version 0.5
User Guide
Pricing Models and Numerical Methods
Black-Scholes Model
Overview
Black-Scholes Model
Itô's lemma and direct corollary
Corollary: lognormal property of \(S\)
Implementation of B-S model
Heston Model
Overview
Stochastic Process Equations of the Heston Model
Partial Differential Equation (PDE) of Heston Model
Implementations
References
Hull-White Model
Overview
Implementation
Black-Karasinski Model
Overview
Implementation
Cox-Ingersoll-Ross Model
Overview
Implementation
Extended Cox-Ingersoll-Ross Model
Overview
Implementation
Vasicek Model
Overview
Implementation
G2 Model
Overview
Implementation
Monte Carlo Simulation
Overview
Framework
Antithetic paths
Finite Difference Methods
Overview
Implementation
Assumptions/Limitations
Dataflow Description
Precalculate algorithm fixed matrices
Explicit estimation at timestep t
Implicit correction in s
Implicit correction in v
Extract price grid
References
Binomial Tree (Cox-Ross-Rubinstein) Method
Overview
References
Internal Design of Tree Lattice
Overview
Implementation
Heston Model Closed-Form Solution
Overview
References
Merton 76 Closed-Form Solution
Overview
References
Garman-Kohlhagen Closed-Form Solution
Overview
References
Quanto Closed-Form Solution
Overview
References
Hull White Analytic Closed-Form Solution
Overview
Portfolio Optimisation
Caveat
Overview
Global Minimum Variance Portfolio
Efficient Portfolio
Tangency Portfolio
Efficient Portfolio of Risky and Risk Free Assets
Credit Default Swap
Overview
L1 Module User Guide
Random Number Generator
Overview
Uniform Distributed Random Number Generator
Algorithm
Implementation Details
Normal Distributed Random Number Generator (NRNG)
Inverse cumulative distribution transformation based RNG
Box-Muller transformation based NRNG
Multi Variate Normal Distribution RNG
PRNG (xoshiro128)
Overview
Implementation
Singular Value Decomposition (SVD)
Overview
Theory
Jacobi Methods
Implementation
SVD workflow:
Profiling
Tridiagonal Matrix Solver
Overview
Implementation
Pentadiagonal Matrix Solver
Overview
Implementation
Sobol Sequence Generator
Overview
Algorithm
Gray Code Implementation
Sobol Workflow:
Brownian Bridge Transform
Overview
Theory
Generation Algorithm
Implementation
Profiling
Stochastic Process
Overview
Implementation
Ornstein-Uhlenbeck Process
Overview
Implementation
Meshers
Overview
Implementation
Numerical Integration Methods
Overview
Adaptive Trapezoidal Theory
Adaptive Simpson Theory
Romberg Theory
Principal Component Analysis (PCA)
Overview
Implementation
Profiling
Covariance Matrix and Regularization
Overview
Algorithm
Covariance Matrix
Covariance Regularization
Implementation
Profiling
Probability Distribution
Overview
Interpolation
Overview
Algorithm & Implementation
Linear interpolation
Cubic interpolation
Bicubic spline interpolation
RNG
RNG
Defined in <xf_fintech/rng.hpp>
XoShiRo128
Defined in <xf_fintech/xoshiro128.hpp>
SobolRsg
Defined in <xf_fintech/sobol_rsg.hpp>
BrownianBridge
Defined in <xf_fintech/brownian_bridge.hpp>
TrinomialTree
Defined in <xf_fintech/trinomial_tree.hpp>
TreeLattice
Defined in <xf_fintech/tree_lattice.hpp>
1DMesher
Defined in <xf_fintech/fdmmesher.hpp>
OrnsteinUhlenbeckProcess
Defined in <xf_fintech/ornstein_uhlenbeck_process.hpp>
StochasticProcess1D
Defined in <xf_fintech/stochastic_process.hpp>
HWModel
Defined in <xf_fintech/hw_model.hpp>
G2Model
Defined in <xf_fintech/g2_model.hpp>
ECIRModel
Defined in <xf_fintech/ecir_model.hpp>
CIRModel
Defined in <xf_fintech/cir_model.hpp>
VModel
Defined in <xf_fintech/v_model.hpp>
HestonModel
Defined in <xf_fintech/heston_model.hpp>
BKModel
Defined in <xf_fintech/bk_model.hpp>
BSModel
Defined in <xf_fintech/bs_model.hpp>
PCA
Defined in <xf_fintech/pca.hpp>
BicubicSplineInterpolation
Defined in <xf_fintech/bicubic_spline_interpolation.hpp>
CubicInterpolation
Defined in <xf_fintech/cubic_interpolation.hpp>
BinomialDistribution
Defined in <xf_fintech/binomial_distribution.hpp>
bernoulliPMF
bernoulliCDF
covCoreMatrix
covCoreStrm
covReHardThreshold
covReSoftThreshold
covReBand
covReTaper
gammaCDF
svd
linearImpl
mcSimulation
normalPDF
normalCDF
normalICDF
logNormalPDF
logNormalCDF
logNormalICDF
pentadiagCr
poissonPMF
poissonCDF
poissonICDF
polyfit
polyval
polyint
polyder
trap_integrate
simp_integrate
romberg_integrate
boxMullerTransform
inverseCumulativeNormalPPND7
inverseCumulativeNormalAcklam
trsvCore
L2 Kernel User Guide
Pricing Engine Overview
Pricing Engine Kernel Design
Internal Design of European Option Pricing Engine
Overview
Implementation
Internal Design of MCEuropeanHestonEngine
Overview
Implementation
Variations
Internal Design of Asian Option Pricing Engine
Overview
MCAsianAPEngine
MCAsianASEngine
MCAsianGPEngine
Profiling
Internal Design of Digital Option Pricing Engines
Overview
Implementation
Internal Design of Barrier Option Pricing Engine
Overview
Implementation
MCBarrierEngine
MCBarrierNoBiasEngine
Internal Design of Cliquet Option Pricing Engine
Overview
Implementation
Internal Design of American Option Pricing Engine
Overview
Theory
Implementation
Calibration Process
Pricing Process
MCAmericanEnginePricing
MCAmericanEngine APIs
Internal Design of MCMultiAssetEuropeanHestonEngine
Overview
Implementation
Variations
Internal Design of MCHullWhiteCapFloorEngine
Overview
Implementation
Internal Design of MCEuropeanHestonGreeksEngine
Overview
Implementation
Internal Design of Closed Form Black-Scholes-Merton
Overview
Design Structure
cfBSMEngine (cf_bsm.hpp)
bsm_kernel (bsm_kernel.cpp)
Theoretical throughput
Resource Utilization
Throughput
Internal Design of Closed Form Black-76
Overview
Design Structure
cfB76Engine (cf_b76.hpp)
b76_kernel (b76_kernel.cpp)
Theoretical throughput
Resource Utilization
Throughput
Internal Design of Closed Form Heston
Overview
Design Structure
The Engine (hcf_engine.hpp)
IO Wrapper (hcf_kernel.cpp)
Resource Utilization
Throughput
Internal Design of Closed Form Merton 76
Overview
Design Structure
The Engine (M76Engine.hpp)
IO Wrapper (m76_kernel.cpp)
Resource Utilization
Throughput
Internal Design of Garman Kohlhagen
Overview
Design Structure
The Engine
IO Wrapper (gk_kernel.cpp)
Resource Utilization
Throughput
Internal Design of Quanto
Overview
Design Structure
The Engine
IO Wrapper (quanto_kernel.cpp)
Resource Utilization
Throughput
Internal Design of Cox-Ross-Rubinstein Binomial Tree
Overview
Design Structure
bt_engine (bt_engine.hpp)
binomialtreekernel (binomialtreekernel.cpp)
Resource Utilization
Throughput
Internal Design of Finite-Difference Hull-White Bermudan Swaption Pricing Engine
Overview
Implementation
Mesher
Differential operator
Evolution scheme
Profiling
Internal Design of Finite-Difference G2 Bermudan Swaption Pricing Engine
Overview
Implementation
Mesher
Differential operator
Profiling
Internal Design of Tree Bermudan Swaption Engine
Overview
Implementation
Profiling
Internal Design of CPI CapFloor Engine
Overview
Implementation
Profiling
Internal Design of Inflation CapFloor Engine
Overview
Implementation
Profiling
Internal Design of Zero Coupon Bond Engine
Overview
Implementation
Profiling
Internal Design of HJM Engine
Overview
Design Structure
PCA HJM Kernel
MC HJM Kernel
Pricer Algorithms
Internal Design of LMM Engine
Overview
Design Structure
Calibration of the Model
Correlation calibration
Volatility Calibration
Pricing Algorithms
Cap Pricing
Ratchet Floater Pricing
Ratchet Cap Pricing
Internal Architecture
Internal Design of HWA Engine
Implementation
Zero Coupon Bond Price
Equity Option Pricing
Cap/Floor
Implementation
Kernel
Host
Internal Design of CDS Engine
Implementation
Host
Kernel
Internal Design of Black Scholes Local Volatility Solver
Overview
Mathematical Background
Design Details
Test Methodology
Pricing Engine Kernel APIs
CPICapFloorEngine
Defined in <xf_fintech/cpi_capfloor_engine.hpp>
DiscountingBondEngine
Defined in <xf_fintech/discounting_bond_engine.hpp>
InflationCapFloorEngine
Defined in <xf_fintech/inflation_capfloor_engine.hpp>
FdHullWhiteEngine
Defined in <xf_fintech/fd_hullwhite_engine.hpp>
FdG2SwaptionEngine
Defined in <xf_fintech/fd_g2_swaption_engine.hpp>
binomialTreeEngine
cfB76Engine
cfBSMEngine
FdBsLvSolver
FdDouglas
hcfEngine
hjmPcaEngine
hjmMcEngine
hjmEngine
lmmEngine
M76Engine
MCEuropeanEngine
MCEuropeanPriBypassEngine
MCEuropeanHestonEngine
MCMultiAssetEuropeanHestonEngine
MCAmericanEnginePreSamples
MCAmericanEngineCalibrate
MCAmericanEnginePricing
MCAmericanEngine
MCAsianGeometricAPEngine
MCAsianArithmeticAPEngine
MCAsianArithmeticASEngine
MCBarrierNoBiasEngine
MCBarrierEngine
MCCliquetEngine
MCDigitalEngine
MCEuropeanHestonGreeksEngine
MCHullWhiteCapFloorEngine
McmcCore
treeSwaptionEngine
treeSwapEngine
treeCapFloorEngine
treeCallableEngine
Other Engine Kernel Design
Internal Design of Markov Chain Monte Carlo
Overview
The Engine (pop_mcmc.h)
Resource Utilization
Benchmark
Application Scenario
Performance
Test Overview
Vitis Quantitative Finance Library
Vitis Security Library
Tutorial
Crypto Algorithm Hardware Acceleration
How Vitis Security Library Works
L1 API
Target Audience and Major Features
Input / output interface
Command to Run L1 cases
Introduction
Overview
Requirements
Software Platform
PCIE Accelerator Card
License
Trademark Notice
Release Note
Known Issue
2022.2
2021.2
2021.1
2020.2
2019.2
User Guide
Design Internals
Adler32
Overview
Implementation on FPGA
AES Encryption Algorithms
Original Implementation
Optimized Implementation on FPGA
AES-128 Encryption Performance (Device: U250)
AES-192 Encryption Performance (Device: U250)
AES-256 Encryption Performance (Device: U250)
BLAKE2 Algorithms
Overview
Implementation on FPGA
Performance
BLAKE2B
CBC Mode
Overview
Implementation on FPGA
Profiling
CBC-DES encryption
CBC-DES decryption
CBC-AES128 encryption
CBC-AES128 decryption
CBC-AES192 encryption
CBC-AES192 decryption
CBC-AES256 encryption
CBC-AES256 decryption
CCM Mode
Overview
Implementation on FPGA
Profiling
CCM-AES128 encryption
CCM-AES128 decryption
CCM-AES192 encryption
CCM-AES192 decryption
CCM-AES256 encryption
CCM-AES256 decryption
CFB Mode
Overview
Implementation on FPGA
Profiling
CFB1-DES encryption
CFB1-DES decryption
CFB1-AES128 encryption
CFB1-AES128 decryption
CFB1-AES192 encryption
CFB1-AES192 decryption
CFB1-AES256 encryption
CFB1-AES256 decryption
CFB8-DES encryption
CFB8-DES decryption
CFB8-AES128 encryption
CFB8-AES128 decryption
CFB8-AES192 encryption
CFB8-AES192 decryption
CFB8-AES256 encryption
CFB8-AES256 decryption
CFB128-DES encryption
CFB128-DES decryption
CFB128-AES128 encryption
CFB128-AES128 decryption
CFB128-AES192 encryption
CFB128-AES192 decryption
CFB128-AES256 encryption
CFB128-AES256 decryption
Chacha20 Algorithms
Implementation
Performance (Device: U250)
CRC32
Overview
Implementation on FPGA
CTR Mode
Overview
Implementation on FPGA
Profiling
CTR-AES128 encryption
CTR-AES128 decryption
CTR-AES192 encryption
CTR-AES192 decryption
CTR-AES256 encryption
CTR-AES256 decryption
DES and 3DES Algorithms
Algorithm Flow
Optimized Implementation on FPGA
Performance (Device: VU9P)
DES encryption
DES decryption
3DES encryption
3DES decryption
Digital Signature Algorithm
Implementation
Optimized Implementation on FPGA
Reference
ECB Mode
Overview
Implementation on FPGA
Profiling
ECB-DES encryption
ECB-DES decryption
ECB-AES128 encryption
ECB-AES128 decryption
ECB-AES192 encryption
ECB-AES192 decryption
ECB-AES256 encryption
ECB-AES256 decryption
Elliptic-curve Cryptography
Elliptic-curve Cryptography
Elliptic Curve Digital Signature Algorithm
Edwards-curve Digital Signature Algorithm
Reference
ECDSA nistp256
Overview
Implementation on FPGA
Signing (point multiplication kG)
Verification (point multiplication aG+bP)
ECDSA secp256k1
Overview
Implementation on FPGA
Signing (point multiplication kG)
Verification (point multiplication aG+bP)
ECDSA nistp384
Overview
Implementation on FPGA
Signing (point multiplication kG)
Verification (point multiplication aG+bP)
GCM Mode
Overview
Implementation on FPGA
Profiling
GCM-AES128 encryption
GCM-AES128 decryption
GCM-AES192 encryption
GCM-AES192 decryption
GCM-AES256 encryption
GCM-AES256 decryption
GMAC
Overview
Implementation on FPGA
Profiling
GMAC-AES128
GMAC-AES192
GMAC-AES256
HMAC Algorithms
Configuration
Implementation
Performance (Device: U250)
AES Decryption Algorithms
Original Implementation
Optimized Implementation on FPGA
AES-128 Decryption Performance (Device: U250)
AES-192 Decryption Performance (Device: U250)
AES-256 Decryption Performance (Device: U250)
Keccak-256 Algorithms
Overview
Implementation on FPGA
The MD4/MD5 Message-Digest Algorithms
Overview
Implementation on FPGA
Performance
MD4
MD5
OFB Mode
Overview
Implementation on FPGA
Profiling
OFB-DES encryption
OFB-DES decryption
OFB-AES128 encryption
OFB-AES128 decryption
OFB-AES192 encryption
OFB-AES192 decryption
OFB-AES256 encryption
OFB-AES256 decryption
Poly1305 Algorithm
Overview
Implementation on FPGA
Performance
RC4 Algorithms
Implementation
Performance (Device: U250)
RSA Cryptography
Implementation
Optimized Implementation on FPGA
Reference
SHA-1 Algorithm
Overview
Implementation on FPGA
Performance
SHA-2 Algorithms
Overview
Implementation on FPGA
Performance
SHA-224 and SHA-256
SHA-384, SHA-512, SHA-512/224, and SHA-512/256
Clustering
SHA-3 Algorithms
Overview
Implementation on FPGA
Performance
SHA3-224
SHA3-256
SHA3-384
SHA3-512
SHAKE-128
SHAKE-256
Clustering
SM2
SM2
SM3
SM4
Verifiable Delay Function
Overview
Implementation on FPGA
XTS Mode
Overview
Implementation on FPGA
Profiling
XTS-AES128 encryption
XTS-AES128 decryption
XTS-AES256 encryption
XTS-AES256 decryption
Poseidon Hash Algorithm
Overview
Implementation on FPGA
API Functions of ``xf::security``
adler32
adler32 overload (1)
adler32 overload (2)
adler32 overload (3)
blake2b
blake2b overload (1)
desCbcEncrypt
desCbcDecrypt
aes128CbcEncrypt
aes128CbcDecrypt
aes192CbcEncrypt
aes192CbcDecrypt
aes256CbcEncrypt
aes256CbcDecrypt
aes128CcmEncrypt
aes128CcmDecrypt
aes192CcmEncrypt
aes192CcmDecrypt
aes256CcmEncrypt
aes256CcmDecrypt
desCfb1Encrypt
desCfb1Decrypt
aes128Cfb1Encrypt
aes128Cfb1Decrypt
aes192Cfb1Encrypt
aes192Cfb1Decrypt
aes256Cfb1Encrypt
aes256Cfb1Decrypt
desCfb8Encrypt
desCfb8Decrypt
aes128Cfb8Encrypt
aes128Cfb8Decrypt
aes192Cfb8Encrypt
aes192Cfb8Decrypt
aes256Cfb8Encrypt
aes256Cfb8Decrypt
desCfb128Encrypt
desCfb128Decrypt
aes128Cfb128Encrypt
aes128Cfb128Decrypt
aes192Cfb128Encrypt
aes192Cfb128Decrypt
aes256Cfb128Encrypt
aes256Cfb128Decrypt
chacha20
xchacha20
crc32
crc32 overload (1)
crc32 overload (2)
crc32 overload (3)
crc32c
crc32c overload (1)
crc32c overload (2)
aes128CtrEncrypt
aes128CtrDecrypt
aes192CtrEncrypt
aes192CtrDecrypt
aes256CtrEncrypt
aes256CtrDecrypt
desEncrypt
desDecrypt
des3Encrypt
des3Decrypt
desEcbEncrypt
desEcbDecrypt
aes128EcbEncrypt
aes128EcbDecrypt
aes192EcbEncrypt
aes192EcbDecrypt
aes256EcbEncrypt
aes256EcbDecrypt
nistp256Sign
nistp256Verify
aes128GcmEncrypt
aes128GcmDecrypt
aes192GcmEncrypt
aes192GcmDecrypt
aes256GcmEncrypt
aes256GcmDecrypt
aes128Gmac
aes192Gmac
aes256Gmac
hmac
keccak_256
md4
md5
desOfbEncrypt
desOfbDecrypt
aes128OfbEncrypt
aes128OfbDecrypt
aes192OfbEncrypt
aes192OfbDecrypt
aes256OfbEncrypt
aes256OfbDecrypt
poly1305
poly1305MultiChan
rc4
sha1
sha224
sha256
sha3_224
sha3_256
sha3_384
sha3_512
shake128
shake256
sha384
sha512
sha512_t
HMAC_SHA384
HMAC_SHA384 overload (1)
HMAC_SHA384 overload (2)
sm3
evaluate
verifyWesolowski
verifyPietrzak
aes128XtsEncrypt
aes128XtsDecrypt
aes256XtsEncrypt
aes256XtsDecrypt
Inherited Members
Inherited Members
updateKey
updateKey overload (1)
updateKey overload (2)
process
Fields
updateSigningParam
updateSigningParam overload (1)
updateSigningParam overload (2)
updateVerifyingParam
updateVerifyingParam overload (1)
updateVerifyingParam overload (2)
sign
verify
Benchmark
Test Overview
Vitis Solver Library
Introduction
Overview
PL Solver library
AI Engine Solver library
Requirements
Software requirements
Hardware requirements
License
Trademark Notice
Release Note
2023.2
2023.1
2022.1
PL Solver Library User Guide
Vitis Solver Library Tutorial
How Vitis Solver Library Works
L2 API
Target Audience and Major Features
Command to Run L2 cases
L1 API
Target Audience and Major Features
Command to Run L1 cases
L1 PL User Guide
APIs
backSubstitute
cholesky
choleskyInverse
matrixMultiply
matrixMultiply overload (1)
matrixMultiply overload (2)
qrInverse
qrd_cfloat_core
qrd_float_core
qrf
svd
Core Utility
QRD (QR Decomposition)
Overview
Implementation
DataType Supported
Interfaces
Implementation Controls
Specifications
Key Factors
QRF (QR Factorization)
Overview
Implementation
DataType Supported
Interfaces
Implementation Controls
Specifications
Key Factors
QR_Inverse
Overview
Implementation
DataType Supported
Interfaces
Implementation Controls
Specifications
SVD (Singular Value Decomposition)
Overview
Implementation
DataType Supported
Interfaces
Implementation Controls
Specifications
Key Factors
Cholesky_Inverse
Overview
Implementation
DataType Supported
Interfaces
Implementation Controls
Specifications
Cholesky
Overview
Implementation
DataType Supported
Interfaces
Implementation Controls
Specifications
Key Factors
L2 PL User Guide
Supported Numerical Methods
Singular Value Decomposition for symmetric matrix (GESVDJ)
Overview
Theory
Jacobi Methods
Singular Value Decomposition for general matrix (GESVJ)
Overview
Algorithm
Architecture
General QR Decomposition (GEQRF)
Lower-Upper Decomposition (GETRF)
Lower-Upper Decomposition (GETRF_NOPIVOT)
Cholesky Decomposition for SPD matrix (POTRF)
Triangular Solver (GTSV)
Symmetric Linear Solver (POLINEARSOLVER)
Symmetric Matrix Inverse (POMATRIXINVERSE)
General Linear Solver (GELINEARSOLVER)
General Matrix Inverse (GEMATRIXINVERSE)
Triangular Solver with multiple right-hand sides (TRTRS)
Eigenvalue Solver (SYEVJ)
L2 PL APIs
Matrix Decomposition
geqrf
geqrf overload (1)
gesvdj
gesvj
getrf
getrf_nopivot
potrf
Linear Solver
gelinearsolver
gematrixinverse
gtsv
polinearsolver
pomatrixinverse
trtrs
Eigenvalue Solver
syevj
Benchmark
Datasets
Performance
Test Overview
AIE Solver Library User Guide
Introduction
Code Organization
Using Library Elements within Defined Graphs
Compiling and Simulation Using the Makefile
AIE APIs Design Information
Cholesky Decomposition
Introduction
Entry Point
Data Format
Template Parameters
Ports
AIE Kernel
Design Notes
Kernel Interfaces
Performance
Test_1
Test_2
QR Decomposition
Introduction
Entry Point
Template Parameters
Ports
AIE Kernel
Design Notes
Kernel Interfaces
Performance
Test_1
Test_2
Singular Value Decomposition
Introduction
Entry Point
Template Parameters
Ports
AIE Kernel
Design Notes
Kernel Interfaces
Pseudoinverse
Introduction
Template Parameters
Ports
AIE Kernel
Design Notes
AIE APIs
template class xf::solver::CholeskyGraph
Overview
Fields
Methods
CholeskyGraph
template class xf::solver::QRDComplexFloat
Overview
Fields
Methods
QRDComplexFloat
template class xf::solver::SVDComplexFloat
class xf::solver::PseudoInverseComplexFloat
Vitis SPARSE Library
Introduction
Overview
Requirements
Software Platform
PCIE Accelerator Card
License
Trademark Notice
Release Note
2020.1
2021.1
User Guide
L1 Primitives User Guide
Primitive Overview
1. Scatter-gather logic
2. Row-wise accumulator
3. Buffer and distribute input column vector entries and the column pointers of NNZs
Primitive Implementation Details
Scatter-Gather Logic Implementation
Row-wise Accumulator Implementation
Column Vector Buffering and Distribution Implementation
API Functions of ``xf::sparse``
L2 Kernel User Guide
CSCMV Overview
1. Matrix partitioning and device memory layout
2. The functionality of the CUs
3. Build and test the design
Double Precision SpMV Overview
1. Matrix partitioning
2. Build and test the design
CSCMV Kernel APIs
Double Precision SPMV Kernel APIs
Benchmark Result
SPMV (Double precision)
Dataset
Executable Usage
Profiling
Vitis Ultrasound Library
Introduction
Overview
Requirements
Software Platform
AIE Card
License
Trademark Notice
Release Note
2022.2
2023.1
2023.2
background
theoretical_foundation
Plane Wave and Synthetic Aperture formulation
Plane Wave Formulation
Learning the differences between Plane Wave and Synthetic Aperture
theoretical_foundation_2
Focusing Theory
Synthetic Aperture Formulation
Plane Wave Formulation
Final remarks
theoretical_foundation_3
Theory of apodization
Theory of Interpolation
Features
Features for Ultrasound Library Release
Code structures enhancement
Host code enhancement
Support for AIE full verification flow on VCK190 platform
Details for Ultrasound Library L1
Ultrasound Library - Level 1 (L1)
kernel name: kernel_imagepoints
kernel name: kernel_focusing
Kernel name: kfun_apodization_preprocess
Kernel name: kfun_apodization_main
Kernel name: kernel_delay
Kernel name: kernel_samples
Kernel name: kfun_rfbuf_wrapper
Kernel name: kfun_resamp_wrapper
Kernel name: kfun_genwin_wrapper
Kernel name: kfun_interpolation_wrapper
kernel name: kfun_mult_pre
kernel name: kfun_mult_cascade
Kernel name: absV
Kernel name: cosV
Kernel name: diffMV
Kernel name: diffSV
Kernel name: diffVS
Kernel name: divVS
Kernel name: equalS
Kernel name: lessOrEqualThanS
Kernel name: mulMM
Kernel name: mulVS
Kernel name: mulVV
Kernel name: norm_axis_1
Kernel name: ones
Kernel name: outer
Kernel name: reciprocalV
Kernel name: sqrtV
Kernel name: squareV
Kernel name: sum_axis_1
Kernel name: sumMM
Kernel name: sumVS
Kernel name: sumVV
Kernel name: tileV
Details for Ultrasound Library L2
Ultrasound Library - Level 2 (L2)
Graph name: graph_imagepoints
Graph name: graph_focusing
Graph name: graph_apodization_preprocess
Graph name: graph_apodization
Graph name: graph_delay
Graph name: graph_samples
Graph name: graph_interpolation
Graph name: graph_mult
Graph name: graph_scanline
Graph name: Image Points
Graph name: Delay
Graph name: Delay_PW
Graph name: Focusing
Graph name: Focusing_SA
Graph name: Samples
Graph name: Apodization
Graph name: Apodization_SA
Graph name: bSpline
Details for Ultrasound Library L3
Ultrasound Library - Level 3 (L3)
Scanline_AllinAIE Beamformer
ScanLine Beamformer
PW Beamformer
SA Beamformer
Tutorial
Lab-1: How does Vitis Ultrasound Library work
Setup Environment
Download the Vitis Ultrasound Library
Lab-2: L1/L2 Graph based algorithm acceleration and evaluation for ultrasound tool box case
Lab purpose
Run a L1 Example
Run a L2 Example
L2 APIs Input Arguments
Lab-3: L2 Graph based algorithm acceleration and evaluation for ultrasound All in AIE case
Run a L2 graph_scanline case
Example logs of graph_scanline
Lab-4: L3 Graph based acceleration for ultrasound All in AIE, integrated with PL and xrt case
Lab purpose
Run L3 All in AIE cases
L3 APIs Input Arguments
Example logs of scanline_AllinAIE
Example logs of plane_wave
Resources
Vitis Utilities Library
Introduction
Overview
Overview
License
Trademark Notice
Tutorial
How Vitis Utils Library Works
HLS hardware utility API
Target Audience and Major Features
Command to Run cases
Release Note
2023.2
2023.1
2022.2
2021.2
2021.1
2020.2
2020.1
2019.2
Requirements
Software Platform
Development Tools
Design Flows
Shell Environment
HLS Cases Command Line Flow
Utility User Guide
Stream-Based API Design
Stream-based Interface
API Functions of ``xf::common::utils_hw``
axiToMultiStream
axiToStream
axiToCharStream
axiToStream
makeMux
streamCombine
streamCombine overload (1)
streamCombine overload (2)
streamCombine overload (3)
streamCombine overload (4)
streamDiscard
streamDiscard overload (1)
streamDiscard overload (2)
streamDiscard overload (3)
streamDup
streamDup overload (1)
streamDup overload (2)
streamNToOne
streamNToOne overload (1)
streamNToOne overload (2)
streamNToOne overload (3)
streamNToOne overload (4)
streamNToOne overload (5)
streamNToOne overload (6)
streamOneToN
streamOneToN overload (1)
streamOneToN overload (2)
streamOneToN overload (3)
streamOneToN overload (4)
streamOneToN overload (5)
streamOneToN overload (6)
streamReorder
streamShuffle
streamSplit
streamSplit overload (1)
streamSplit overload (2)
streamSync
streamToAxi
API Class of ``xf::common::utils_hw``
API Class of xf::common::utils_hw
template class xf::common::utils_hw::UramArray
Overview
Methods
memSet
write
read
template class xf::common::utils_hw::cache
Overview
Methods
initSingleOffChip
initDualOffChip
readOnly
readOnly overload (1)
readOnly overload (2)
readOnly overload (3)
readOnly overload (4)
template class xf::common::utils_hw::Multiplexer
Overview
Methods
get
get overload (1)
get overload (2)
put
API Class of xf::common::utils_sw
class xf::common::utils_sw::ArgParser
Overview
Methods
addFlag
addOption
getAs
showUsage
Template Helpers in ``xf::common::utils_hw``
template struct xf::common::utils_hw::PowerOf2
Overview
Fields
template struct xf::common::utils_hw::GCD <_A, 0>
template struct xf::common::utils_hw::LCM
Overview
Fields
Tag Types in ``xf::common::utils_hw``
struct xf::common::utils_hw::LoadBalanceT
struct xf::common::utils_hw::RoundRobinT
struct xf::common::utils_hw::TagSelectT
struct xf::common::utils_hw::LSBSideT
struct xf::common::utils_hw::MSBSideT
Module Design Internals
Internals of axiToStream
Internals of axiToMultiStream
Internals of streamToAxi
Internals of UramArray
Work Flow
Storage Layout
Resources
Internals of streamOneToN
Round-Robin
Generic Type
Vector Input
Load-Balancing
Generic Type
Vector Input
Tag-Select
Internals of streamNToOne
Round-Robin
Generic Type
Vector Output
Load-Balancing
Generic Type
Vector Output
Tag-Select
Internals of streamDiscard
Internals of streamSplit
Internals of streamCombine
Internals of streamSync
Internals of streamReorder
Examples
Vitis Vision Library
Vitis Vision Library User Guide
Overview
Basic Features
Vitis Vision Kernel on Vitis
Vitis Vision Library Contents
Getting Started with Vitis Vision
Prerequisites
Vitis Design Methodology
Host Code with OpenCL
Wrappers around HLS Kernel(s)
Stream Based Kernels
Array2xfMat
xfMat2Array
Interface pointer widths
Kernel-to-Kernel streaming
axiStrm2xfMat
xfMat2axiStrm
Memory Mapped Kernels
Makefile
Design example Using Library on Vitis
Host code
Top level kernel
Evaluating the Functionality
Using the Vitis vision Library
Changing the Hardware Kernel Configuration
Using the Vitis vision Library Functions on Hardware
Getting Started with HLS
AXI Video Interface Functions
AXIvideo2xfMat
xfMat2AXIvideo
cvMat2AXIvideoxf
AXIvideo2cvMatxf
Migrating HLS Video Library to Vitis vision
Infrastructure Functions and Classes
Classes
Functions
xf::cv::window
Class definition
Parameter Descriptions
Member Function Description
Template Parameter Description
xf::cv::LineBuffer
Class definition
Parameter Descriptions
Member Functions Description
Template Parameter Description
Video Processing Functions
Design Examples Using Vitis Vision Library
Iterative Pyramidal Dense Optical Flow
Corner Tracking Using Optical Flow
cornerUpdate()
cornersImgToList()
Image Processing
Color Detection
Defect Detection Pipeline
pass_2()
Difference of Gaussian Filter
Stereo Vision Pipeline
Blob From Image
Letterbox
Image Sensor Processing pipeline
Image Sensor Processing pipeline with HDR
Image Sensor Processing pipeline with GTM
Mono image Sensor Processing pipeline
RGB-IR image Sensor Processing pipeline
Image Sensor Processing multistream pipeline
ISP all_in_one_adas pipeline
ISP all_in_one pipeline:
ISP 24-bit Pipeline
Vitis Vision AIE Library User Guide
Overview
Basic Features
Vitis Vision AIE Library Contents
Getting Started with Vitis Vision AIE
AIE Prerequisites
Vitis AIE Design Methodology
Prepare the Kernels
Data Flow Graph construction
Setting up platform ports
FileIO
PLIO
GMIO
Host code integration
x86Simulation / AIE simulation
HW emulation / HW run
xfcvDataMovers
Evaluating the Functionality
x86 Simulation
AIE Simulation
HW emulation
Testing on HW
Design example Using Vitis Vision AIE Library
ADF Graph
Platform Ports
Host code
Makefile
Filter2D Pipeline on Multiple AIE Cores
Executable Usage
Performance
Vitis Vision Library API Reference
Overview
xf::cv::Mat Image Container Class
Class Definition
Pixel-Level Parallelism
Macros to Work With Parallelism
Data Types
Manipulating Data Type
xf::cv::imread
xf::cv::imwrite
xf::cv::absDiff
xf::cv::convertTo
Vitis Vision Library Functions
Absolute Difference
Accumulate
Accumulate Squared
Accumulate Weighted
AddS
Add Weighted
Auto Exposure Correction
Auto White Balance
Bad Pixel Correction
Brute-force (Bf) Feature Matcher
Bilateral Filter
Bit Depth Conversion
Bitwise AND
Bitwise NOT
Bitwise OR
Bitwise XOR
Blacklevelcorrection
Box Filter
BoundingBox
Canny Edge Detection
Channel Combine
Channel Extract
Clahe
Color Conversion
RGB to YUV Conversion Matrix
YUV to RGB Conversion Matrix
RGBA/RGB to YUV4
RGBA/RGB to IYUV
RGBA to NV12
RGBA to NV21
YUYV to RGBA
YUYV to NV12
YUYV to IYUV
UYVY to IYUV
UYVY to NV12
IYUV to RGBA/RGB
IYUV to NV12
IYUV to YUV4
NV12 to IYUV
NV12 to RGBA
NV12 to YUV4
NV21 to IYUV
NV21 to RGBA
NV21 to YUV4
RGB to GRAY
BGR to GRAY
GRAY to RGB
GRAY to BGR
RGB to XYZ
BGR to XYZ
RGB/BGR to HSV
RGB/BGR to HLS
YCrCb to RGB/BGR
HSV to RGB/BGR
NV12/NV21 to RGB/ BGR
NV122RGB:
NV122BGR:
NV212RGB:
NV212BGR:
NV12 to NV21/NV21 to NV12
NV122NV21:
NV212NV12:
NV12/NV21 to UYVY/YUYV
NV122UYVY:
NV122YUYV:
NV212UYVY:
NV212YUYV:
UYVY/YUYV to RGB/BGR
YUYV2RGB:
YUYV2BGR:
UYVY2RGB
UYVY2BGR:
UYVY to YUYV/ YUYV to UYVY
UYVY2YUYV:
YUYV2UYVY:
UYVY/YUYV to NV21
UYVY2NV21:
YUYV2NV21:
RGB/ BGR to NV12/NV21
RGB2NV12
BGR2NV12
RGB2NV21
BGR2NV21
BGR to RGB / RGB to BGR
RGB/BGR to UYVY/YUYV
RGB to UYVY:
RGB to YUYV:
BGR to UYVY:
BGR to YUYV:
XYZ to RGB/BGR
Color correction matrix
Color Thresholding
Compare
CompareS
convertScaleAbs
Crop
Multiple ROI Extraction
Multiple ROI Extraction Example
CUSTOM BGR2Y8
Custom CCA
Custom Convolution
Delay
Degamma
Demosaicing
Dilate
Distance Transform Feature Matcher
Duplicate
Erode
FAST Corner Detection
Gaincontrol
Extract Exposure Frames
Flip
Gamma Correction
Global Tone Mapping
HDR Decompanding
HDR Merge
Gaussian Filter
Gradient Magnitude
Gradient Phase
Harris Corner Detection
Non-Maximum Suppression:
Threshold:
Histogram Computation
Histogram Equalization
HOG
HoughLines
Preprocessing for Deep Neural Networks
Pyramid Up
Pyramid Down
InitUndistortRectifyMapInverse
InRange
Integral Image
ISP Stats
Dense Pyramidal LK Optical Flow
Dense Non-Pyramidal LK Optical Flow
Kalman Filter
Extended Kalman Filter
Example for Extended Kalman Filter
Laplacian Operator
Lens Shading Correction
Local Tone Mapping
Look Up Table
Mean and Standard Deviation
Max
MaxS
Median Blur Filter
Min
MinS
MinMax Location
Mean Shift Tracking
Mode filter
Otsu Threshold
Paint Mask
Pixel-Wise Addition
Pixel-Wise Multiplication
Pixel-Wise Subtraction
Quantization & Dithering
Reduce
Remap
Resolution Conversion (Resize)
RGBIR to Standard Bayer Format
Rotate
BGR to HSV Conversion
Scharr Filter
Set
Sobel Filter
Semi Global Method for Stereo Disparity Estimation
Stereo Local Block Matching
SubRS
SubS
Sum
SVM
3D LUT
Thresholding
Atan2
Inverse (Reciprocal)
Square Root
TVL1 Optical flow
Warp Transform
Zero
Vitis Vision AIE Library API Reference
xfcvDataMovers
Class Definition
Vitis Vision AIE Library Functions API list with performance estimates
Benchmark
Canny Edge Detection
Executable Usage
Profiling
Harris Corner Detection
Executable Usage
Profiling
FAST Corner Detection
Executable Usage
Profiling
Kalman Filter
Executable Usage
Profiling
Dense Pyramidal LK Optical Flow
Executable Usage
Profiling
Corner Tracker
Executable Usage
Profiling
Color Detect
Executable Usage
Profiling
Gaussian Difference
Executable Usage
Profiling
Bilateral Filter
Executable Usage
Profiling
Stereo Local Block Matching
Executable Usage
Profiling
Image Sensor Processing (ISP) Pipeline
Executable Usage
Profiling
Release Notes
New features and functions
Known issues
Versions
Description:
Pseudoinverse first computes the singular value decomposition (SVD) of the input. Once the configured number of SVD iterations has been reached, it performs additional post-processing to generate the final pseudoinverse result. Refer to the SVD documentation for details of the decomposition step.
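
As an illustration of what the post-processing after the SVD can look like, the sketch below forms the Moore-Penrose pseudoinverse from already-computed SVD factors using the textbook relation A⁺ = V · diag(1/σ) · Uᴴ. It is plain host-side C++ and is not the AIE kernel code: the `Matrix` type, the `pinvFromSvd` function, and the `tol` threshold for discarding near-zero singular values are all hypothetical names and choices introduced for this example, and the actual `xf::solver::PseudoInverseComplexFloat` implementation may organize the computation differently.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

using cfloat = std::complex<float>;

// Hypothetical row-major dense matrix helper (not a Vitis library type).
struct Matrix {
    std::size_t rows;
    std::size_t cols;
    std::vector<cfloat> data;  // zero-initialized on construction

    Matrix(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c) {}

    cfloat& at(std::size_t i, std::size_t j) { return data[i * cols + j]; }
    const cfloat& at(std::size_t i, std::size_t j) const { return data[i * cols + j]; }
};

// Given A = U * diag(sigma) * V^H with U (m x k), sigma (k), V (n x k),
// return A^+ = V * diag(1/sigma) * U^H, which has shape (n x m).
// Singular values not larger than tol are treated as zero (assumed policy).
Matrix pinvFromSvd(const Matrix& U, const std::vector<float>& sigma,
                   const Matrix& V, float tol) {
    assert(U.cols == sigma.size() && V.cols == sigma.size());

    Matrix P(V.rows, U.rows);
    for (std::size_t k = 0; k < sigma.size(); ++k) {
        if (sigma[k] <= tol) continue;  // skip the null-space direction
        const float inv = 1.0f / sigma[k];
        for (std::size_t i = 0; i < V.rows; ++i) {
            for (std::size_t j = 0; j < U.rows; ++j) {
                // Accumulate the rank-1 term v_k * (1/sigma_k) * conj(u_k)^T.
                P.at(i, j) += V.at(i, k) * inv * std::conj(U.at(j, k));
            }
        }
    }
    return P;
}
```

Thresholding the singular values keeps the result finite for rank-deficient inputs. For a diagonal matrix with U and V equal to the identity, the function simply inverts each singular value that exceeds `tol` and leaves the remaining entries at zero.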