AOCL User Guide (57404)

Document ID: 57404
Release Date: 2024-12-14
Version: 5.0 English

1. Introduction

AMD Optimizing CPU Libraries (AOCL) are a set of numerical libraries optimized for AMD “Zen”-based processors, including EPYC™, Ryzen™ Threadripper™, and Ryzen™. This document provides instructions on installing and using all the AMD optimized libraries.

AOCL comprises the following libraries:

  • AOCL-BLAS is a portable software framework for performing high-performance Basic Linear Algebra Subprograms (BLAS) functionality.

  • AOCL-LAPACK is a portable library for dense matrix computations that provides the functionality present in the Linear Algebra Package (LAPACK).

  • AOCL-FFTW (Fastest Fourier Transform in the West) is a comprehensive collection of fast C routines for computing the Discrete Fourier Transform (DFT) and various special cases.

  • AOCL-LibM is a software library containing elementary math functions optimized for x86-64 processor-based machines.

  • AOCL-Utils is a library that provides APIs to check the available CPU features/flags, cache topology, and other details of AMD “Zen”-based CPUs.

  • AOCL-ScaLAPACK is a library of high-performance linear algebra routines for parallel distributed memory machines. It depends on external libraries including BLAS and LAPACK for linear algebra computations.

  • AOCL-RNG is a library that provides a set of pseudo-random number generators, quasi-random number generators, and statistical distribution functions optimized for AMD “Zen”-based processors.

  • AOCL-SecureRNG is a library that provides APIs to access the cryptographically secure random numbers generated by the AMD hardware random number generator.

  • AOCL-Sparse is a library containing the basic linear algebra subroutines for sparse matrices and vectors optimized for AMD “Zen”-based CPUs.

  • AOCL-LibMem is AMD’s optimized implementation of memory manipulation functions for AMD “Zen”-based CPUs.

  • AOCL-Cryptography is AMD’s optimized implementation of cryptographic functions.

  • AOCL-Compression is a software framework of various lossless data compression and decompression methods tuned and optimized for AMD “Zen”-based CPUs.

  • AOCL-DA is a data analytics library providing optimized building blocks for data analysis and classical machine learning problems.

All the above libraries are open-source except AOCL-RNG.

1.1. Feature Support Matrix

The following tables summarize the supported features and dependencies of the AOCL libraries:

Table 1.1 AOCL Feature Support Matrix - 1

| Library/Feature | Vector | Dynamic Dispatcher | Precision | ISA Optional Selection by the User |
| --- | --- | --- | --- | --- |
| AOCL-BLAS | AVX2, AVX512 | Yes | Single, Double, Complex, Double Complex. Mixed Precision, and Low Precision (INT16, INT8, UINT8, BFLOAT16) - currently supported only for GEMM and GEMV APIs. | Yes, using AOCL_ENABLE_INSTRUCTION |
| AOCL-LAPACK | AVX2, AVX512 | Partially (requires AVX2 support) | Single, Double, Complex, Double Complex | Yes, using AOCL_ENABLE_INSTRUCTION |
| AOCL-FFTW | AVX2, AVX512 | Yes for Linux with GCC and AOCC. No for Windows with Clang. MSVC compiler has not been used on Windows. | Single, Double, Long-double, Quad | No |
| AOCL-LibM | AVX2, AVX512 | Yes | Single, Double, Complex, Double Complex | No |
| AOCL-Sparse | AVX2, AVX512 | Partial (only for selected APIs) | Single, Double, Complex, Double Complex | Partial (only for selected APIs) using AOCL_ENABLE_INSTRUCTION |
| AOCL-Cryptography | AVX2, AVX512 | Yes, GCC and AOCC on Linux; Clang on Windows. | N/A | No |
| AOCL-Compression | AVX2. AVX512 instructions have not been used. But, library can be built with -mavx512f compiler option. | Yes, GCC and AOCC on Linux; Clang on Windows. | N/A | Yes, using AOCL_ENABLE_INSTRUCTION |
| AOCL-RNG | AVX2, AVX512 | Yes | Single, Double, Integer | Yes, using AOCL_ENABLE_INSTRUCTION |
| AOCL-SecureRNG | N/A | N/A | N/A | N/A |
| AOCL-ScaLAPACK | Dependent on the underlying BLAS and LAPACK libraries | Dependent on the underlying BLAS and LAPACK libraries | Single, Double, Complex, Double Complex | Dependent on the underlying BLAS and LAPACK libraries |
| AOCL-LibMem | AVX2, AVX512 | No | N/A | No |
| AOCL-Utils | N/A | N/A | N/A | N/A |
| AOCL-DA | Dependent on the underlying BLAS library | No | Single, Double | Dependent on the underlying BLAS library |

Table 1.2 AOCL Feature Support Matrix - 2

| Library/Feature | glibc Dependency | Single-threaded | Multi-threaded | MPI |
| --- | --- | --- | --- | --- |
| AOCL-BLAS | Yes | Yes | Yes | No |
| AOCL-LAPACK | Yes | Yes | Yes | No |
| AOCL-FFTW | Yes | Yes | Yes | Yes |
| AOCL-LibM | Yes | Yes | No | No |
| AOCL-Sparse | Yes | Yes | Partial (only for selected APIs) | No |
| AOCL-Cryptography | Yes | Yes | No | No |
| AOCL-Compression | Yes | Yes | Partial (for LZ4, LZ4HC, Snappy, ZLIB, and ZSTD) | No |
| AOCL-RNG | Yes | Yes | No | No |
| AOCL-SecureRNG | No | Yes | No | No |
| AOCL-ScaLAPACK | Yes | Yes, dependent on the underlying BLAS and LAPACK libraries | Yes, dependent on the underlying BLAS and LAPACK libraries | Yes |
| AOCL-LibMem | Yes | Yes | No | No |
| AOCL-Utils | Yes | Yes | No | No |
| AOCL-DA | Yes | Yes | Yes | No |

Dynamic Dispatch facilitates building a single binary compatible with all the AMD “Zen” architectures. At runtime, this feature enables optimizations specific to the detected AMD “Zen” architecture.
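The idea behind dynamic dispatch can be illustrated with a minimal Python sketch. It is not how AOCL implements the feature: the kernel names (`dot_generic`, `dot_avx2`, `dot_avx512`) and the helpers `read_cpu_flags` and `select_kernel` are hypothetical, the "tuned" kernels are stand-ins that perform the same arithmetic, and feature detection assumes a Linux system that exposes `/proc/cpuinfo`.

```python
from typing import Callable, Sequence, Set

Kernel = Callable[[Sequence[float], Sequence[float]], float]


def dot_generic(x: Sequence[float], y: Sequence[float]) -> float:
    """Baseline kernel that runs on any CPU."""
    return sum(a * b for a, b in zip(x, y))


def dot_avx2(x: Sequence[float], y: Sequence[float]) -> float:
    """Stand-in for a kernel tuned for AVX2 (hypothetical, same arithmetic)."""
    return sum(a * b for a, b in zip(x, y))


def dot_avx512(x: Sequence[float], y: Sequence[float]) -> float:
    """Stand-in for a kernel tuned for AVX512 (hypothetical, same arithmetic)."""
    return sum(a * b for a, b in zip(x, y))


def read_cpu_flags(path: str = "/proc/cpuinfo") -> Set[str]:
    """Return the CPU feature flags reported by Linux, or an empty set."""
    try:
        with open(path) as cpuinfo:
            for line in cpuinfo:
                if line.startswith("flags"):
                    return set(line.split(":", 1)[1].split())
    except OSError:
        pass  # not Linux, or the file is unreadable
    return set()


def select_kernel(flags: Set[str]) -> Kernel:
    """Pick the most capable kernel that the detected flags allow."""
    if "avx512f" in flags:
        return dot_avx512
    if "avx2" in flags:
        return dot_avx2
    return dot_generic


# Resolve once at start-up; every later call goes through the chosen kernel.
dot = select_kernel(read_cpu_flags())
print(dot.__name__, dot([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))
```

The same program runs unchanged on every machine; only the kernel chosen at start-up differs, which is the property a single dynamically dispatched binary provides.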

The flags to enable or disable the applicable features listed in AOCL Feature Support Matrix - 1 and AOCL Feature Support Matrix - 2 are described in the individual library sections.

Additionally, AMD provides Spack (https://spack.io/) recipes for installing AOCL-BLAS, AOCL-LAPACK, AOCL-ScaLAPACK, AOCL-LibM, AOCL-FFTW, AOCL-Sparse, AOCL-Compression, AOCL-Cryptography, AOCL-DA, and AOCL-Utils libraries.

For more information on the AOCL release and installers, refer to AMD Developer Central (https://www.amd.com/en/developer/aocl.html).

For any issues or queries on the libraries, send an email to toolchainsupport@amd.com.

To determine the underlying architecture of your AMD system, refer to Check AMD Server Processor Architecture.
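As a rough illustration of such a check, the sketch below parses Linux `/proc/cpuinfo` text and maps the reported CPU family to a “Zen” generation. The referenced section may describe a different procedure. The helper names (`parse_cpuinfo`, `zen_generation`) are hypothetical, and the table is coarse: family 23 (0x17) covers Zen, Zen+, and Zen 2, family 25 (0x19) covers Zen 3 and Zen 4, and family 26 (0x1A) covers Zen 5, so telling apart the generations within one family would also require the model number.

```python
from typing import Dict, Optional

# CPU family (decimal, as printed by Linux) -> "Zen" generations in that family.
ZEN_BY_FAMILY = {
    23: "Zen / Zen+ / Zen 2",
    25: "Zen 3 / Zen 4",
    26: "Zen 5",
}


def parse_cpuinfo(text: str) -> Dict[str, str]:
    """Return the key/value pairs of the first processor block."""
    info: Dict[str, str] = {}
    for line in text.splitlines():
        if not line.strip():
            if info:
                break  # a blank line ends the first processor block
            continue
        if ":" in line:
            key, value = line.split(":", 1)
            info.setdefault(key.strip(), value.strip())
    return info


def zen_generation(text: str) -> Optional[str]:
    """Map cpuinfo text to a "Zen" generation label, or None if unknown."""
    info = parse_cpuinfo(text)
    if info.get("vendor_id") != "AuthenticAMD":
        return None
    try:
        family = int(info.get("cpu family", ""))
    except ValueError:
        return None
    return ZEN_BY_FAMILY.get(family)


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as cpuinfo:
            print(zen_generation(cpuinfo.read()))
    except OSError:
        print("cpuinfo is not available on this system")
```

A result of `None` means the processor is not an AMD part or belongs to a family that the table above does not list.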