AI Engine API User Guide (AIE) 2023.2
Changelog
Vitis 2023.2
Documentation changes
- Integrate AIE-ML documentation
- Document rounding modes
- Expand accumulate documentation
- Clarify limitations on 8b parallel lookup
- Fix mmul member documentation
- Clarify requirement for linear_approx step bits
- Improve documentation of vector, accum, and mask
- Highlight architecture requirements of functions using C++ requires clauses
- Document FFT twiddle factor generation
- Clarify internal rounding mode for bfloat16 to integer conversion
- Clarify native and emulated modes for mmul
- Clarify native and emulated modes for sliding_mul
- Document sparse_vector_input_buffer_stream with memory layout and GEMM example
- Document tensor_buffer_stream with a GEMM example
Global AIE API changes
- Add cfloat support for AIE-ML
Changes to data types
- vector: Optimize grow_replicate on AIE-ML
- mmul: Support reinitialization from an accum
- DM resources: Add compound aie_dm_resource variants
- streams: Add sparse_vector_input_buffer_stream for loading sparse data on AIE-ML
- streams: Add tensor_buffer_stream to handle multi-dimensional addressing for AIE-ML
- bfloat16: Add specialization for std::numeric_limits on AIE-ML
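The new specialization can be queried like any other arithmetic type. A minimal sketch, assuming the standard std::numeric_limits interface and aie::broadcast; the vector width is arbitrary:

    #include <aie_api/aie.hpp>
    #include <limits>

    // Query bfloat16 limits through the new std::numeric_limits
    // specialization (AIE-ML) and splat the maximum across a vector.
    aie::vector<bfloat16, 16> bf16_max_vector() {
        const bfloat16 hi = std::numeric_limits<bfloat16>::max();
        return aie::broadcast<bfloat16, 16>(hi);
    }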
Changes to operations
- abs: Fix for float input
- add_reduce: Optimize for 8b and 16b types on AIE-ML
- div: Implement vector-vector and vector-scalar division
- downshift: Implement logical_downshift for AIE
- fft: Add support for 32 bit twiddles on AIE
- fft: Fix for radix-3 and radix-5 FFTs on AIE
- fft: Fix radix-5 performance for low vectorizations on AIE
- fft: Add stage-based FFT functions and deprecate iterator interface
- mul: Fix for vector * vector_elem_ref on AIE
- print_fixed: Support printing Q format data
- print_matrix: Add accumulator support
- sliding_mul: Add support for float
- sliding_mul: Add support for remaining 32b modes for AIE-ML
- sliding_mul: Add support for Points < Native Points
- sliding_mul_ch: Fix DataStepX == DataStepY requirement
- sincos: Optimize AIE implementation
- to_fixed: Fix for AIE-ML
- to_fixed/to_float: Add vectorized float conversions for AIE
- to_fixed/to_float: Add generic conversions ((int8, int16, int32) <-> (bfloat16, float)) for AIE-ML
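A minimal sketch of the generic conversions on AIE-ML; the explicit return-type template argument and the shift (scaling) semantics are assumptions to verify against the to_fixed/to_float reference:

    #include <aie_api/aie.hpp>

    // bfloat16 -> int16 with an assumed 2^4 scaling.
    aie::vector<int16, 16> quantize(aie::vector<bfloat16, 16> in) {
        return aie::to_fixed<int16>(in, 4);
    }

    // int16 -> bfloat16, undoing the same scaling.
    aie::vector<bfloat16, 16> dequantize(aie::vector<int16, 16> in) {
        return aie::to_float<bfloat16>(in, 4);
    }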
ADF integration
- Add TLAST support for stream reads on AIE-ML
- Add support for input_cascade and output_cascade types
- Deprecate accum reads from input_stream and output_stream
Vitis 2023.1
Documentation changes
- Add explanation of FFT inputs
- Use block_size in FFT docs
- Clarify matrix data layout expectations
- Clarify that downshift is arithmetic
- Correct description of bfloat16 linear_approx lookup table
Global AIE API changes
- Do not explicitly initialize inferred template arguments
- More aggressive inlining of internal functions
- Avoid using 128b vectors in stream helper functions for AIE-ML
Changes to data types
- iterator: Do not declare iterator data members as const
- mask: Optimized implementation for 64b masks on AIE-ML
- mask: New constructors added to initialize the mask from uint32 or uint64 values (see the sketch after this list)
- vector: Fix 1024b inserts
- vector: Use 128b concats in upd_all
- vector: Fix 8b unsigned to_vector for AIE-ML
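A sketch of the new mask constructors; the literal values are arbitrary, and the exact overload set should be checked against the mask reference:

    #include <aie_api/aie.hpp>

    // Build masks directly from word values (2023.1 constructors).
    void mask_ctor_example() {
        aie::mask<32> m32(0xF0F0F0F0u);            // from a uint32 value
        aie::mask<64> m64(0x0123456789abcdefull);  // from a uint64 value
        (void)m32; (void)m64;
    }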
Changes to operations
- add/sub: Support for dynamic accumulator zeroization
- begin_restrict_vector: Add implementation for io_buffer
- eq: Add support for complex numbers
- fft: Correctly set radix configuration in fft::begin_stage calls
- inv/invsqrt: Add implementation for AIE-ML
- linear_approx: Performance optimization for AIE-ML
- logical_downshift: New function that implements a logical downshift (as opposed to aie::downshift, which is arithmetic); see the sketch after this list
- max/min/maxdiff: Add support for dynamic sign
- mmul: Implement 16b 8x2x8 mode for AIE-ML
- mmul: Implement 8b 8x8x8 mode for AIE-ML
- mmul: Implement missing 16b x 8b and 8b x 4b sparse multiplication modes for AIE-ML
- neq: Add support for complex numbers
- parallel_lookup: Optimize implementation for signed truncation
- print_matrix: New function that prints vectors with the specified matrix shape
- shuffle_up/down: Minor optimization for 16b
- shuffle_up/down: Optimized implementation for AIE-ML
- sliding_mul: Support data_start/coeff_start values larger than vector size
- sliding_mul: Add support for 32b modes for AIE-ML
- sliding_mul: Add 2-point 16b 16-channel mode for AIE-ML
- sliding_mul_ch: New function for multi-channel multiplication modes for AIE-ML
- sliding_mul_sym_uct: Fix for 16b two-buffer implementation
- store_unaligned_v: Optimized implementation for AIE-ML
- transpose: Add support for 64b and 32b types
- transpose: Enable transposition of 256 element 4b vectors (scalar implementation for now)
- to_fixed: Add bfloat16 to int32 conversion on AIE-ML
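A sketch contrasting the arithmetic and logical downshifts; the lane count and shift amount are arbitrary:

    #include <aie_api/aie.hpp>

    // On signed data, aie::downshift sign-extends while the new
    // aie::logical_downshift fills the vacated bits with zeros.
    void downshift_example(aie::vector<int32, 8> v) {
        auto arithmetic = aie::downshift(v, 4);
        auto logical    = aie::logical_downshift(v, 4);
        (void)arithmetic; (void)logical;
    }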
Vitis 2022.2
Documentation changes
- Add code samples for load_v/store_v and load_unaligned_v/store_unaligned_v (see the sketch after this list)
- Enhanced documentation for parallel_lookup and linear_approx
- Clarify coeff vector size limit on AIE-ML
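A minimal sketch of the aligned and unaligned load/store pattern those samples cover; the vector alignment of 'in' and 'out' for the aligned accesses is an assumption:

    #include <aie_api/aie.hpp>

    void copy16(const int16* __restrict in, int16* __restrict out) {
        // Aligned accesses: pointers assumed to be vector-aligned.
        aie::vector<int16, 16> v = aie::load_v<16>(in);
        aie::store_v(out, v);

        // Unaligned accesses tolerate arbitrary element offsets.
        aie::vector<int16, 16> u = aie::load_unaligned_v<16>(in + 3);
        aie::store_unaligned_v(out + 3, u);
    }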
Global AIE API changes
- Remove usage of the deprecated srs intrinsic in compare functions to avoid compilation warnings
- Add support for stream ADF vector types on AIE-ML
Changes to data types
- mask: add shift operators (see the sketch after this list)
- saturation_mode: add the saturate value, replacing the previous truncate name, which was incorrect; the old name is kept until it is deprecated
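A sketch of the new mask shift operators, using a vector comparison to build the mask; the <</>> operator form is assumed from the entry above:

    #include <aie_api/aie.hpp>

    aie::mask<16> negative_lanes_shifted(aie::vector<int16, 16> v) {
        // Lanes where v < 0.
        aie::mask<16> m = aie::lt(v, aie::zeros<int16, 16>());
        // Move the lane flags up by two positions.
        return m << 2;
    }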
Changes to operations
- add: support accumulator addition on AIE-ML
- add_reduce: add optimized implementation for cfloat on AIE
- add_reduce: add optimized implementation for bfloat16 on AIE-ML (see the sketch after this list)
- eq/neq: enhanced implementation on AIE-ML
- le: enhanced implementation on AIE-ML
- load_unaligned_v: leverage the hardware's truncation of pointers to 128b boundaries on AIE
- fft: add support for radix 3/5 on AIE
- mmul: add matrix x vector multiplication modes on AIE
- mmul: add support for dynamic accumulator zeroization
- to_fixed: added implementation for AIE-ML
- to_fixed: provide a default return type
- to_float: added implementation for AIE-ML
- reverse: optimized implementation for 32b and 64b on AIE-ML
- zeros: include fixes on AIE
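A sketch of the full-vector reduction; the return type is left to auto since it is expected to follow the element type:

    #include <aie_api/aie.hpp>

    // Sum all lanes of a bfloat16 vector (the AIE-ML path noted above).
    auto sum_bf16(aie::vector<bfloat16, 32> v) {
        return aie::add_reduce(v);
    }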
Vitis 2022.1
Documentation changes
- Small documentation fixes for operators
- Fix documentation issues for msc_square and mmul
- Enhance documentation for sliding_mul operations
- Change logo in documentation
- Add documentation for ADF stream operators
Global AIE API changes
- Add support for emulated FP32 data types and operations on AIE-ML
Changes to data types
- unaligned_vector_iterator: add new type and helper functions (see the iterator sketch after this list)
- random_circular_vector_iterator: add new type and helper functions
- iterator: add linear iterator type and helper functions for scalar values
- accum: add support for dynamic sign in to/from_vector on AIE-ML
- accum: add implicit conversion to float on AIE-ML
- vector: add support for dynamic sign in pack/unpack
- vector: optimization of initialization by value on AIE-ML
- vector: add constructor from 1024b native types on AIE-ML
- vector: fixes and optimizations for unaligned_load/store
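The new iterator types follow the same usage pattern as the existing vector iterators. A sketch using aie::begin_vector, assuming the dereferenced iterator is writable; the unaligned and random-circular helpers noted above are drop-in replacements for their respective access patterns:

    #include <aie_api/aie.hpp>

    // Double every element of a buffer, one 16-element vector at a time.
    void double_buffer(int16* data, unsigned num_vectors) {
        auto it = aie::begin_vector<16>(data);
        for (unsigned i = 0; i < num_vectors; ++i, ++it)
            *it = aie::add(*it, *it);
    }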
Changes to operations
- adf::buffer_port: add many wrapper iterators
- adf::stream: annotate read/write functions with stream resource so they can be scheduled in parallel
- adf::stream: add stream operator overloading
- fft: performance fixes on AIE-ML
- max/min/maxdiff: add support for bfloat16 and float on AIE-ML (see the sketch after this list)
- mul/mmul: add support for bfloat16 and float on AIE-ML
- mul/mmul: add support for dynamic sign on AIE-ML
- parallel_lookup: expand to int16->bfloat16, with performance optimizations and a softmax kernel
- print: add support to print accumulators
- add/max/min_reduce: add support for float on AIE-ML
- reverse: add optimized implementation on AIE-ML using matrix multiplications
- shuffle_down_replicate: add new function
- sliding_mul: add 32b for 8b * 8b and 16b * 16b on AIE-ML
- transpose: add new function and implementation for AIE-ML
- upshift/downshift: add implementation for AIE-ML
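A sketch of elementwise max on bfloat16 vectors (the AIE-ML support noted above), used here as a ReLU-style clamp; the vector width is arbitrary:

    #include <aie_api/aie.hpp>

    aie::vector<bfloat16, 16> relu(aie::vector<bfloat16, 16> x) {
        // Per-lane maximum against a zero vector.
        return aie::max(x, aie::zeros<bfloat16, 16>());
    }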
Vitis 2021.2
Documentation changes
- Fix description of sliding_mul_sym_uct
- Make return types explicit for better documentation
- Fix documentation for sin/cos to state that the input must be in radians
- Add support for concepts
- Add documentation for missing arguments and fix incorrect argument names
- Fixes in documentation for int4/uint4 AIE-ML types
- Add documentation for the mmul class
- Update documentation about supported accumulator sizes
- Update the matrix multiplication example to use the new MxKxN scheme and size_A/size_B/size_C
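A sketch of a single tile in the MxKxN scheme; the 4x4x4 shape, the int16 type, and the row-major per-tile data layout are assumptions for illustration:

    #include <aie_api/aie.hpp>

    // One tile: C (4x4) = A (4x4) x B (4x4), all int16.
    void mmul_tile(const int16* __restrict a, const int16* __restrict b,
                   int16* __restrict c) {
        using MMUL = aie::mmul<4, 4, 4, int16, int16>;   // M=4, K=4, N=4

        auto va = aie::load_v<MMUL::size_A>(a);          // M*K elements of A
        auto vb = aie::load_v<MMUL::size_B>(b);          // K*N elements of B

        MMUL m;
        m.mul(va, vb);                                   // start the accumulation
        aie::store_v(c, m.to_vector<int16>(0));          // shift of 0: no scaling
    }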
Global AIE API changes
- Make all entry points always_inline
- Add declaration macros to aie_declaration.hpp so that they can be used in headers parsed by aiecompiler
Changes to data types
- Add support for bfloat16 data type on AIE-ML
- Add support for cint16/cint32 data types on AIE-ML
- Add an argument to vector::grow to specify where the input vector is placed in the output vector (see the sketch after this list)
- Remove copy constructor so that the vector type becomes trivial
- Remove copy constructor so that the mask type becomes trivial
- Make all member functions in circular_index constexpr
- Add tiled_mdspan::begin_vector_dim functions that return vector iterators
- Initial support for sparse vectors on AIE-ML, including iterators to read from memory
- Make vector methods always_inline
- Make vector::push be applied to the object it is called on and return a reference
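A sketch of the grow position argument and the new push behaviour; whether the position is expressed as an element index or a sub-vector index is an assumption to verify against the vector::grow reference:

    #include <aie_api/aie.hpp>

    void grow_push_example(aie::vector<int16, 8> v) {
        // Grow to 32 elements, placing the input at the given position.
        aie::vector<int16, 32> g = v.grow<32>(1);

        // push now modifies the vector it is called on and returns a reference.
        v.push(42);
        (void)g;
    }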
Changes to operations
- add: Implementation optimization on AIE-ML
- add_reduce: Implement on AIE-ML
- bit_and/bit_or/bit_xor: Implement scalar x vector variants of bit operations
- equal/not_equal: Fix an issue in which not all lanes were compared for certain vector sizes
- fft: Interface change to enhance portability across AIE/AIE-ML
- fft: Add initial support on AIE-ML
- fft: Add alignment checks for x86sim in FFT iterators
- fft: Make FFT output interface uniform for radix 2 cint16 upscale version on AIE
- filter_even/filter_odd: Functional fixes
- filter_even/filter_odd: Performance improvement for 4b/8b/16b implementations
- filter_even/filter_odd: Performance optimization on AIE-ML
- filter_even/filter_odd: Do not require step argument to be a compile-time constant
- interleave_zip/interleave_unzip: Improve performance when configuration is a run-time value
- interleave_*: Do not require step argument to be a compile-time constant
- load_floor_v/load_floor_bytes_v: New functions that floor the pointer to a requested boundary before performing the load.
- load_unaligned_v/store_unaligned_v: Performance optimization on AIE-ML
- lut/parallel_lookup/linear_approx: First implementation of look-up based linear functions on AIE-ML.
- max_reduce/min_reduce: Add 8b implementation
- max_reduce/min_reduce: Implement on AIE-ML
- mmul: Implement new shapes for AIE-ML
- mmul: Initial support for 4b multiplication
- mmul: Add support for 80b accumulation for 16b x 32b / 32b x 16b cases
- mmul: Change dimension names from MxNxK to MxKxN
- mmul: Add size_A/size_B/size_C data members
- mul: Optimize mul+conj operations so that they are merged into a single intrinsic call on AIE-ML
- sin/cos/sincos: Fix to avoid int -> unsigned conversions that reduce the range
- sin/cos/sincos: Use a compile-time division to compute 1/PI
- sin/cos/sincos: Fix floating-point range
- sin/cos/sincos: Optimized implementation for float vector
- shuffle_up/shuffle_down: Elements no longer wrap around; the newly inserted elements are undefined (see the sketch after this list)
- shuffle_up_rotate/shuffle_down_rotate: New variants for the cases in which elements need to wrap around
- shuffle_up_replicate: New variant that replicates the first element
- shuffle_up_fill: New variant that fills new elements with elements from another vector
- shuffle_*: Optimization in shuffle primitives on AIE, especially for 8b/16b cases
- sliding_mul: Fixes to handle larger Step values for cfloat variants
- sliding_mul: Initial implementation for 16b x 16b and cint16b x cint16b on AIE-ML
- sliding_mul: Optimize mul+conj operations so that they are merged into a single intrinsic call on AIE-ML
- sliding_mul_sym: Fixes in start computation for filters with DataStepX > 1
- sliding_mul_sym: Add missing int32 x int16 / int16 x int32 type combinations
- sliding_mul_sym: Fix two-buffer sliding_mul_sym acc80
- sliding_mul_sym: Add support for separate left/right start arguments
- store_v: Support pointers annotated with storage attributes
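A sketch of the new shuffle semantics and the rotate/fill/replicate variants; the lane count, shift amount, and the argument order of shuffle_up_fill are assumptions to check against the reference:

    #include <aie_api/aie.hpp>

    void shuffle_example(aie::vector<int16, 16> v, aie::vector<int16, 16> prev) {
        auto a = aie::shuffle_up(v, 2);            // vacated lanes are undefined
        auto b = aie::shuffle_up_rotate(v, 2);     // vacated lanes wrap around
        auto c = aie::shuffle_up_replicate(v, 2);  // vacated lanes replicate element 0
        auto d = aie::shuffle_up_fill(v, prev, 2); // vacated lanes come from 'prev'
        (void)a; (void)b; (void)c; (void)d;
    }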