11.1.2. Building from Source on Linux - 5.2 English - 57404

AOCL User Guide (57404)

Document ID: 57404
Release Date: 2025-12-29
Version: 5.2 English

AOCL-LAPACK is built using the CMake build system. Starting with AOCL 5.2, the Autoconf-based workflow (configure and Makefile) is no longer supported.

To compile AOCL-LAPACK from source:

  1. Clone the Git repository (amd/libflame.git).

  2. Compile AOCL-LAPACK source.

Using CMake

When compiled with standard CMake options, AOCL-LAPACK can be linked with any Netlib BLAS-compliant library. It also provides an option to link explicitly with the AOCL-BLAS library at compile time. This option couples AOCL-LAPACK more tightly to AOCL-BLAS by invoking lower-level AOCL-BLAS APIs directly, which can improve performance for certain APIs on AMD “Zen” CPUs. Linking with AOCL-BLAS is recommended: set the ENABLE_AOCL_BLAS option in the CMake configuration as described in the steps below.

  1. Create a new build directory, for example, newbuild:

    $ mkdir newbuild
    $ cd newbuild
    
  2. Set AOCL_ROOT to specify the root path of AOCL-BLAS and AOCL-Utils libraries using one of these methods:

    • Using environment variable: Set AOCL_ROOT environment variable to point to a directory containing:

      • include directory with necessary header files

      • lib directory with necessary binaries

    • Using CMake option: Specify the path through the AOCL_ROOT CMake option when configuring the build (for example, -DAOCL_ROOT=<path to AOCL-BLAS and AOCL-Utils>)

  3. Run the following command to configure the project:

    • With GCC (default):

      Set AOCL_ROOT using one of the methods described in Step 2.
      
      **Using 32-bit Integer (LP64)**
      
      $ cmake ../ -DENABLE_AMD_FLAGS=ON -DENABLE_AOCL_BLAS=ON \
          -DCMAKE_INSTALL_PREFIX=<your-install-dir>
      
      **Using 64-bit Integer (ILP64)**
      
      $ cmake ../ -DENABLE_ILP64=ON -DENABLE_AMD_FLAGS=ON \
          -DENABLE_AOCL_BLAS=ON -DCMAKE_INSTALL_PREFIX=<your-install-dir>
      
    • With AOCC:

      $ export CC=clang
      $ export FC=flang
      $ export FLIBS="-lflang"
      
      Set AOCL_ROOT using one of the methods described in Step 2.
      
      **Using 32-bit Integer (LP64)**
      
      $ cmake ../ -DENABLE_AMD_AOCC_FLAGS=ON -DENABLE_AOCL_BLAS=ON \
          -DCMAKE_INSTALL_PREFIX=<your-install-dir>
      
      **Using 64-bit Integer (ILP64)**
      
      $ cmake ../ -DENABLE_ILP64=ON -DENABLE_AMD_AOCC_FLAGS=ON \
          -DENABLE_AOCL_BLAS=ON -DCMAKE_INSTALL_PREFIX=<your-install-dir>
      

    If the AOCL_ROOT path contains only the AOCL-BLAS library, specify the AOCL-Utils location separately: set ‘LIBAOCLUTILS_LIBRARY_PATH’ to the library path and ‘LIBAOCLUTILS_INCLUDE_PATH’ to the header file path.

    A shared library is built by default. To build a static library instead, add the option:

    -DBUILD_SHARED_LIBS=OFF
    
  4. Compile and install the library:

    $ cmake --build . -j
    

    This generates the libflame.a (static) or libflame.so (shared) library in the lib directory.

    To install the library, run:

    $ cmake --install .
    

    Note

    1. AOCL-LAPACK depends on AOCL Utilities library (AOCL-Utils) for certain functions including CPU architecture detection at runtime. Applications using AOCL-LAPACK must link with the AOCL-Utils library explicitly.
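
The layout that Step 2 expects of AOCL_ROOT can be checked before configuring. The following is a minimal sketch, not part of AOCL-LAPACK: the function name check_aocl_root is hypothetical, and it only tests that the include and lib subdirectories exist, not that they hold the right headers or binaries.

```shell
# Hypothetical helper (not shipped with AOCL): check that a directory looks
# like a usable AOCL_ROOT, i.e. it has both include/ and lib/ subdirectories.
check_aocl_root() {
    root="$1"
    if [ -z "$root" ]; then
        echo "AOCL_ROOT is not set" >&2
        return 1
    fi
    for sub in include lib; do
        if [ ! -d "$root/$sub" ]; then
            echo "missing directory: $root/$sub" >&2
            return 1
        fi
    done
    echo "AOCL_ROOT looks usable: $root"
}

# Only run the check when the variable is set in the current shell.
if [ -n "${AOCL_ROOT:-}" ]; then
    check_aocl_root "$AOCL_ROOT"
fi
```

Running the check first turns a missing include or lib directory into a short message instead of a configure-time failure.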

For detailed steps to build and install AOCL-LAPACK, including additional configuration options for tests, documentation, and more, refer to the BUILD.md file in the source root.
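
The four configure commands in Step 3 differ only in the compiler-specific flag and the optional ILP64 flag. As an illustration, the sketch below assembles the command from those two choices and prints it rather than running it. The function name aocl_lapack_configure_cmd and the install prefix /opt/aocl-lapack are hypothetical; the CMake options are the ones listed in Step 3.

```shell
# Hypothetical helper: print (not run) the Step 3 configure command.
#   $1 = compiler family: gcc or aocc
#   $2 = integer size:    lp64 or ilp64
#   $3 = install prefix
aocl_lapack_configure_cmd() {
    case "$1" in
        gcc)  amd_flag="-DENABLE_AMD_FLAGS=ON" ;;
        aocc) amd_flag="-DENABLE_AMD_AOCC_FLAGS=ON" ;;
        *)    echo "unknown compiler: $1" >&2; return 1 ;;
    esac
    case "$2" in
        lp64)  ilp64_flag="" ;;
        ilp64) ilp64_flag="-DENABLE_ILP64=ON " ;;
        *)     echo "unknown integer size: $2" >&2; return 1 ;;
    esac
    echo "cmake ../ ${ilp64_flag}${amd_flag} -DENABLE_AOCL_BLAS=ON -DCMAKE_INSTALL_PREFIX=$3"
}

# Print the GCC/LP64 and AOCC/ILP64 variants for a hypothetical prefix.
aocl_lapack_configure_cmd gcc lp64 /opt/aocl-lapack
aocl_lapack_configure_cmd aocc ilp64 /opt/aocl-lapack
```

For the AOCC variants, the CC, FC, and FLIBS exports shown in Step 3 still need to be in place before the printed command is run.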

Optimized configuration flags for AMD Zen CPUs

For optimal performance on AMD “Zen”-based CPUs, use these recommended build flags:

  • For GCC: Enable ENABLE_AMD_FLAGS

  • For AOCC: Enable ENABLE_AMD_AOCC_FLAGS

These flags automatically enable important optimizations including:

  • ENABLE_AMD_OPT for AMD-specific optimizations

  • ENABLE_MULTITHREADING for OpenMP parallelization

  • ENABLE_BLAS_EXT_GEMMT for extended BLAS operations

  • ENABLE_EXT_LAPACK_INTERFACE for standard LAPACK interfaces

On Windows, both flags additionally enable:

  • ENABLE_F2C_DOTC

  • ENABLE_VOID_RETURN_COMPLEX_FUNCTION

These Windows-specific flags handle complex return values correctly with compilers that treat them as hidden arguments rather than return values.

Additionally, setting ENABLE_AOCL_BLAS=ON is recommended to optimize integration with the AOCL-BLAS library through tight coupling.
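
One way to see which of these options a configured build ended up with is to look them up in the CMakeCache.txt file that CMake writes to the build directory. The sketch below assumes that the options are stored as cache entries, which this guide does not confirm; the helper name cmake_cache_value is hypothetical.

```shell
# Hypothetical helper: print the value of a CMake cache entry.
#   $1 = option name
#   $2 = path to CMakeCache.txt (defaults to ./CMakeCache.txt)
# Cache entries have the form NAME:TYPE=VALUE; nothing is printed if the
# entry is absent.
cmake_cache_value() {
    sed -n "s/^$1:[A-Z]*=//p" "${2:-CMakeCache.txt}"
}

# When run from a configured build directory, list the options named above.
if [ -f CMakeCache.txt ]; then
    for opt in ENABLE_AMD_OPT ENABLE_MULTITHREADING \
               ENABLE_BLAS_EXT_GEMMT ENABLE_EXT_LAPACK_INTERFACE \
               ENABLE_AOCL_BLAS; do
        echo "$opt=$(cmake_cache_value "$opt")"
    done
fi
```

An empty value means only that the entry was not found in the cache, not necessarily that the feature is disabled.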

Additional Notes on Configuration Options

  1. By default, the configuration options ENABLE_AMD_FLAGS and ENABLE_AMD_AOCC_FLAGS enable multithreading using OpenMP for select APIs in AOCL-LAPACK. For best multithreaded performance, set the environment variable OMP_PROC_BIND=TRUE. To disable multithreading, use the CMake option ENABLE_MULTITHREADING=OFF.

    Example:

    $ cmake ../ -DENABLE_AMD_FLAGS=ON -DENABLE_MULTITHREADING=OFF \
        -DENABLE_AOCL_BLAS=ON -DCMAKE_INSTALL_PREFIX=<your-install-dir>
    
  2. To provide good default performance across different architectures, the default compiler flags are set to -mtune=native -mavx2 -mfma -O3.

    This requires the target CPU to support AVX2 and fused multiply-add (FMA), as mentioned in the Prerequisites section.

    To enable further optimizations, such as AVX512, set the LF_ISA_CONFIG flag to the desired ISA support. The available options are Auto, AVX2 (default), AVX512, AVX2-STRICT, AVX512-STRICT and None. For example:

    $ cmake .. -DLF_ISA_CONFIG=AVX512 -DENABLE_AMD_FLAGS=ON
    

    Ensure that the compiler you use supports the ‘znver4’ flag.

    Note

    The LF_ISA_CONFIG CMake flag includes strict ISA options that fix the ISA path of hand-written optimizations, which would otherwise follow the dynamic dispatch behavior. The strict options are:

    • avx2-strict: Forces all code paths including optimized kernels to use AVX2 ISA

    • avx512-strict: Forces all code paths including optimized kernels to use AVX512 ISA

    Example to force AVX2 ISA:

    $ cmake .. -DLF_ISA_CONFIG=avx2-strict -DENABLE_AMD_FLAGS=ON
    

    This is useful when you want to restrict the ISA to a specific level for floating-point numerical consistency across different CPU models.
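
When choosing between the AVX2 and AVX512 settings of LF_ISA_CONFIG, it can help to check what the build machine's CPU reports. The following is a rough sketch: the helper name suggest_isa_config is hypothetical, and it treats the avx512f CPU flag as a proxy for AVX512 support, which is an assumption and not a rule stated by AOCL-LAPACK. It reads the flags line of /proc/cpuinfo, so it applies to Linux on x86 only.

```shell
# Hypothetical helper: suggest an LF_ISA_CONFIG value from a CPU flags string
# (for example, the "flags" line of /proc/cpuinfo).
suggest_isa_config() {
    case " $1 " in
        *" avx512f "*) echo "AVX512" ;;
        *" avx2 "*)    echo "AVX2" ;;
        *)
            echo "CPU does not report AVX2, which the default build flags require" >&2
            return 1
            ;;
    esac
}

# On Linux, read the first "flags" line and print a suggested configure command.
if [ -r /proc/cpuinfo ]; then
    flags=$(grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2)
    if isa=$(suggest_isa_config "$flags"); then
        echo "Suggested: cmake .. -DLF_ISA_CONFIG=$isa -DENABLE_AMD_FLAGS=ON"
    fi
fi
```

The suggestion reflects only the machine on which the script runs. If the library will run on other CPU models, the strict options described above may be the better fit.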