AOCL-LibMem - 5.0 English - 57404

AOCL User Guide (57404)

Document ID: 57404
Release Date: 2024-12-14
Version: 5.0 English

12. AOCL-LibMem

AOCL-LibMem is a Linux library of data movement and manipulation functions (such as memcpy() and strcpy()) highly optimized for the AMD “Zen” micro-architecture. It provides multiple implementations of each function, supporting the AVX2, AVX512, and ERMS CPU features. By default, it selects the best-fit implementation based on the CPU features and instructions that the underlying micro-architecture supports. It also supports a tunable build, in which a specific implementation of the mem* functions can be chosen to suit the application, with alignment, instruction choice, and threshold values as the tunable parameters.

This release of the AOCL-LibMem library supports the following functions:

  • memcpy

  • mempcpy

  • memmove

  • memset

  • memcmp

  • memchr

  • strcpy

  • strncpy

  • strcmp

  • strncmp

  • strlen

  • strcat

  • strstr

Note

  1. Behavior might be undefined if AVX512 is disabled in the BIOS configuration on the Zen5 platform.

12.1. Building AOCL-LibMem for Linux

Minimum software requirements for compilation:

  • GCC 12.2

  • AOCC 4.0

  • Python 3.10

  • CMake 3.22

Complete the following steps to build AOCL-LibMem for Linux:

  1. Download and install the AOCL master installer (aocl-linux-<compiler>-<version>.tar.gz) from:

    https://www.amd.com/en/developer/aocl.html

  2. Locate the aocl-libmem folder in the root directory.

  3. Configure for one of the following builds as required:

    • GCC

      Default Native Build

      $ cmake -D CMAKE_C_COMPILER=gcc -S <source_dir> -B <build_dir>
      

      Cross Compiling AVX2 Binary on AVX512 Machine

      $ cmake -D CMAKE_C_COMPILER=gcc -D ALMEM_ARCH=avx2 -S <source_dir> -B <build_dir>
      

      Cross Compiling AVX512 Binary on AVX2 Machine

      $ cmake -D CMAKE_C_COMPILER=gcc -D ALMEM_ARCH=avx512 -S <source_dir> -B <build_dir>
      

      Enabling Tunable Parameters

      $ cmake -D CMAKE_C_COMPILER=gcc -D ENABLE_TUNABLES=Y -S <source_dir> -B <build_dir>
      
    • AOCC (Clang)

      Default Native Build

      $ cmake -D CMAKE_C_COMPILER=clang -S <source_dir> -B <build_dir>
      

      Cross Compiling AVX2 Binary on AVX512 Machine

      $ cmake -D CMAKE_C_COMPILER=clang -D ALMEM_ARCH=avx2 -S <source_dir> -B <build_dir>
      

      Cross Compiling AVX512 Binary on AVX2 Machine

      $ cmake -D CMAKE_C_COMPILER=clang -D ALMEM_ARCH=avx512 -S <source_dir> -B <build_dir>
      

      Enabling Tunable Parameters

      $ cmake -D CMAKE_C_COMPILER=clang -D ENABLE_TUNABLES=Y -S <source_dir> -B <build_dir>
      
  4. Build:

    $ cmake --build <build_dir>
    
  5. Install:

    $ cmake --install <build_dir>
    
    ## For a custom install path, set CMAKE_INSTALL_PREFIX when running the configure step (step 3).
    

Note

Both the shared (libaocl-libmem.so) and static (libaocl-libmem.a) library files are installed under the <build_dir>/lib/ path.

The dynamic dispatcher is not supported. Do not load or run the AVX512 library on a machine without AVX512 support; it will crash on the unsupported instructions.

12.2. Running an Application

Applications can preload the AOCL-LibMem shared library to replace the standard C library memory functions for better performance on AMD “Zen” micro-architectures.

To run the application, preload the libaocl-libmem.so generated from the build procedure above:

$ LD_PRELOAD=<path to build/lib/libaocl-libmem.so> <executable> <params>