AOCL-LAPACK is built using the CMake build system. Starting with AOCL 5.2, the Autoconf-based workflow (configure and Makefile) is no longer supported.
To compile AOCL-LAPACK from source:
Clone the Git repository (amd/libflame.git).
Compile AOCL-LAPACK source.
Using CMake
AOCL-LAPACK can be linked with any Netlib BLAS-compliant library when compiled with standard CMake options. However, AOCL-LAPACK also provides an option to link explicitly with the AOCL-BLAS library at compile time. This option couples AOCL-LAPACK more tightly with AOCL-BLAS by invoking lower-level AOCL-BLAS APIs directly, which can result in better performance for certain APIs on AMD “Zen” CPUs. It is recommended to link with the AOCL-BLAS library by providing the option ENABLE_AOCL_BLAS in the CMake configuration, as described in the steps below.
Create a new build directory, for example, newbuild:
$ mkdir newbuild
$ cd newbuild
Set AOCL_ROOT to specify the root path of the AOCL-BLAS and AOCL-Utils libraries using one of these methods:
Using environment variable: Set the AOCL_ROOT environment variable to point to a directory containing:
include directory with necessary header files
lib directory with necessary binaries
Using CMake option: Specify the path through the AOCL_ROOT CMake option when configuring the build (for example, -DAOCL_ROOT=<path to AOCL-BLAS and AOCL-Utils>)
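The directory layout described above can be checked before configuring. The following is a minimal sketch, not part of AOCL-LAPACK: check_aocl_root is a hypothetical helper name, and it only tests that the include and lib subdirectories exist, not that they contain the right files.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with AOCL-LAPACK): succeed only if the
# given directory contains both an include/ and a lib/ subdirectory.
check_aocl_root() {
    root="$1"
    if [ -z "$root" ]; then
        echo "AOCL_ROOT is not set" >&2
        return 1
    fi
    for sub in include lib; do
        if [ ! -d "$root/$sub" ]; then
            echo "missing directory: $root/$sub" >&2
            return 1
        fi
    done
    echo "AOCL_ROOT looks usable: $root"
}
```

A typical call would be check_aocl_root "$AOCL_ROOT" before running the configure command.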
Run the following command to configure the project:
With GCC (default):
Set AOCL_ROOT using one of the methods mentioned in Step 2.
Using 32-bit Integer (LP64):
$ cmake ../ -DENABLE_AMD_FLAGS=ON -DENABLE_AOCL_BLAS=ON -DCMAKE_INSTALL_PREFIX=<your-install-dir>
Using 64-bit Integer (ILP64):
$ cmake ../ -DENABLE_ILP64=ON -DENABLE_AMD_FLAGS=ON -DENABLE_AOCL_BLAS=ON -DCMAKE_INSTALL_PREFIX=<your-install-dir>
With AOCC:
$ export CC=clang
$ export FC=flang
$ export FLIBS="-lflang"
Set AOCL_ROOT using one of the methods mentioned in Step 2.
Using 32-bit Integer (LP64):
$ cmake ../ -DENABLE_AMD_AOCC_FLAGS=ON -DENABLE_AOCL_BLAS=ON -DCMAKE_INSTALL_PREFIX=<your-install-dir>
Using 64-bit Integer (ILP64):
$ cmake ../ -DENABLE_ILP64=ON -DENABLE_AMD_AOCC_FLAGS=ON -DENABLE_AOCL_BLAS=ON -DCMAKE_INSTALL_PREFIX=<your-install-dir>
If the AOCL_ROOT path contains only the AOCL-BLAS library, specify the AOCL-Utils library path separately using the ‘LIBAOCLUTILS_LIBRARY_PATH’ and ‘LIBAOCLUTILS_INCLUDE_PATH’ flags for the library and header file paths, respectively.
A shared library is built by default. To build a static library instead, provide the additional option:
-DBUILD_SHARED_LIBS=OFF
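The configure variants above differ only in a few options, so they can be assembled from parameters. The sketch below is illustrative: configure_cmd is a hypothetical helper that only prints a command line using the options named in this section, and the example install path is an assumption. For AOCC, the CC, FC, and FLIBS exports shown earlier still need to be set before running the printed command.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) a configure command.
# Arguments: compiler (gcc|aocc), ILP64 (ON|OFF), shared (ON|OFF), install prefix.
configure_cmd() {
    compiler="${1:-gcc}"
    ilp64="${2:-OFF}"
    shared="${3:-ON}"
    prefix="${4:-<your-install-dir>}"
    case "$compiler" in
        gcc)  flags="-DENABLE_AMD_FLAGS=ON" ;;
        aocc) flags="-DENABLE_AMD_AOCC_FLAGS=ON" ;;
        *)    echo "unknown compiler: $compiler" >&2; return 1 ;;
    esac
    # 64-bit integers are requested with ENABLE_ILP64.
    if [ "$ilp64" = "ON" ]; then
        flags="-DENABLE_ILP64=ON $flags"
    fi
    # Shared is the default, so only the static case adds an option.
    if [ "$shared" = "OFF" ]; then
        flags="$flags -DBUILD_SHARED_LIBS=OFF"
    fi
    echo "cmake ../ $flags -DENABLE_AOCL_BLAS=ON -DCMAKE_INSTALL_PREFIX=$prefix"
}

# Example: GCC, LP64, static library, installed under an assumed path.
configure_cmd gcc OFF OFF "$HOME/aocl-lapack"
```

Printing the command first makes it easy to review the option set before running it from the build directory.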
Compile and install the library:
$ cmake --build . -j
This generates the libflame.a or libflame.so library in the lib directory.
To install the library, run:
$ cmake --install .
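After the build or install step, the presence of the library can be confirmed with a quick check. This is a minimal sketch: find_libflame is a hypothetical helper, and the assumption that the installed files land in a lib directory under the install prefix should be verified against your setup.

```shell
#!/bin/sh
# Hypothetical helper: report libflame.a / libflame.so in a given directory.
# Returns 0 if at least one of the two files is present, 1 otherwise.
find_libflame() {
    libdir="$1"
    found=1
    for name in libflame.a libflame.so; do
        if [ -f "$libdir/$name" ]; then
            echo "found $libdir/$name"
            found=0
        fi
    done
    return "$found"
}

# Example: check the lib directory relative to the current directory.
find_libflame ./lib || echo "no libflame library found in ./lib"
```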
Note
1. AOCL-LAPACK depends on AOCL Utilities library (AOCL-Utils) for certain functions including CPU architecture detection at runtime. Applications using AOCL-LAPACK must link with the AOCL-Utils library explicitly.
For detailed steps to build and install AOCL-LAPACK, including additional configuration options for tests, documentation, and more, refer to the BUILD.md file under the source root.
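The note above about linking AOCL-Utils explicitly can be illustrated with a link line. This is a sketch under stated assumptions: link_cmd is a hypothetical helper that only prints a command, the library names blis (AOCL-BLAS) and aoclutils (AOCL-Utils) are not given in this section, and further libraries or flags (for example OpenMP or C++ runtime support) may be needed depending on how the libraries were built.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) a link command for an application
# that uses AOCL-LAPACK. Library names other than flame are assumptions.
link_cmd() {
    root="${1:-<path to AOCL install>}"
    src="${2:-app.c}"
    # AOCL-LAPACK first, then AOCL-BLAS, then AOCL-Utils, then system libraries.
    echo "gcc $src -I$root/include -L$root/lib -lflame -lblis -laoclutils -lm -lpthread -o app"
}

# Example with an assumed install location and source file name.
link_cmd /opt/aocl app.c
```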
Optimized configuration flags for AMD Zen CPUs
For optimal performance on AMD “Zen”-based CPUs, use these recommended build flags:
For GCC: Enable ENABLE_AMD_FLAGS
For AOCC: Enable ENABLE_AMD_AOCC_FLAGS
These flags automatically enable important optimizations including:
ENABLE_AMD_OPT for AMD-specific optimizations
ENABLE_MULTITHREADING for OpenMP parallelization
ENABLE_BLAS_EXT_GEMMT for extended BLAS operations
ENABLE_EXT_LAPACK_INTERFACE for standard LAPACK interfaces
On Windows, both flags additionally enable:
ENABLE_F2C_DOTC
ENABLE_VOID_RETURN_COMPLEX_FUNCTION
These Windows-specific flags handle complex return values correctly with compilers that treat them as hidden arguments rather than return values.
Additionally, setting ENABLE_AOCL_BLAS=ON is recommended to optimize
integration with the AOCL-BLAS library through tight coupling.
Additional Notes on Configuration Options
By default, the configuration options ENABLE_AMD_FLAGS and ENABLE_AMD_AOCC_FLAGS enable multi-threading using OpenMP for select APIs in AOCL-LAPACK. For maximum multi-threaded performance, set the environment variable OMP_PROC_BIND=TRUE. To disable multi-threading, use the configuration option ENABLE_MULTITHREADING=OFF.
Example:
$ cmake ../ -DENABLE_AMD_FLAGS=ON -DENABLE_MULTITHREADING=OFF -DENABLE_AOCL_BLAS=ON -DCMAKE_INSTALL_PREFIX=<your-install-dir>
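When multi-threading is left enabled, the runtime environment might be prepared as follows. This is a sketch: OMP_NUM_THREADS is the standard OpenMP thread-count variable rather than something this section prescribes, the value 8 is arbitrary, and my_app is a hypothetical application name.

```shell
#!/bin/sh
# Bind OpenMP threads to cores, as recommended above.
export OMP_PROC_BIND=TRUE
# Standard OpenMP variable for the thread count; 8 is only an example value.
export OMP_NUM_THREADS=8
# A hypothetical application linked against AOCL-LAPACK would be started here:
# ./my_app
echo "OMP_PROC_BIND=$OMP_PROC_BIND OMP_NUM_THREADS=$OMP_NUM_THREADS"
```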
To provide good default performance across different architectures, default compiler flags are set to -mtune=native -mavx2 -mfma -O3. This requires AVX2 and Fused Multiply Accumulate (FMA) support from the target CPU as mentioned in the Prerequisites section.
To enable further optimizations, such as AVX512, use the following steps:
Set the flag LF_ISA_CONFIG to the desired ISA support. The available options are Auto, AVX2 (default), AVX512, AVX2-STRICT, AVX512-STRICT and None. For example:
$ cmake .. -DLF_ISA_CONFIG=AVX512 -DENABLE_AMD_FLAGS=ON
Ensure that the compiler you use supports the ‘znver4’ flag.
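Before choosing an ISA setting, it can help to see which instruction sets the CPU reports. The sketch below assumes a Linux host, where /proc/cpuinfo lists CPU features under the names avx2, fma, and avx512f; has_cpu_flag is a hypothetical helper and is not part of AOCL-LAPACK.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the named CPU flag appears on the first
# "flags" line of a cpuinfo-style file (default: /proc/cpuinfo, Linux only).
has_cpu_flag() {
    flag="$1"
    file="${2:-/proc/cpuinfo}"
    # Split the flags line into one token per line and match the flag exactly.
    grep -m 1 '^flags' "$file" 2>/dev/null | tr ' \t' '\n\n' | grep -qx "$flag"
}

# AVX2 and FMA are required by the default flags; AVX512F matters for AVX512.
for f in avx2 fma avx512f; do
    if has_cpu_flag "$f"; then
        echo "$f: supported"
    else
        echo "$f: not reported"
    fi
done
```

This checks only what the CPU reports; whether the compiler accepts the corresponding options is a separate question.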
Note
The LF_ISA_CONFIG CMake flag includes strict ISA options that fix the ISA path of hand-written optimizations, which would otherwise follow the dynamic dispatch behavior. The strict options are:
avx2-strict: Forces all code paths including optimized kernels to use AVX2 ISA
avx512-strict: Forces all code paths including optimized kernels to use AVX512 ISA
Example to force AVX2 ISA:
$ cmake .. -DLF_ISA_CONFIG=avx2-strict -DENABLE_AMD_FLAGS=ON
This is useful in scenarios where you want to restrict the ISA to a specific level for floating point numerical consistency across different CPU models.