AOCL-Data Analytics - 5.0 English - 57404

AOCL User Guide (57404)

Document ID
57404
Release Date
2024-12-14
Version
5.0 English

16. AOCL-Data Analytics

AOCL-Data Analytics (AOCL-DA) provides optimized building blocks for data analysis and classical machine learning. The intended workflow for using AOCL-DA is as follows:

  1. Load data from memory by reading in CSV files or using the in-built da_datastore object.

  2. Preprocess the data by removing missing values, standardizing and selecting certain subsets of the data.

  3. Perform a data analysis computation. APIs are available for the following data processing computations:

    • linear, ridge, lasso and logistic regression

    • decision trees and random forests

    • k-means clustering

    • k-nearest neighbors classification

    • principal component analysis

    • nonlinear least-squares data fitting

    • basic statistics

AOCL-DA is written with a C-compatible API to facilitate calling the library from different programming languages.

A Python API is also provided, along with a scikit-learn patch, so that users with existing scikit-learn workflows can leverage the performance of AOCL-DA with minimal changes to their code.

AOCL-DA depends on external libraries including BLAS and LAPACK for linear algebra computations.

This chapter describes how to install AOCL-DA, how to build AOCL-DA from source, and how to compile and link programs that use AOCL-DA APIs. For full documentation, please refer to the AOCL-DA HTML pages:

https://docs.amd.com/go/en-US/63863-AOCL-data-analytics

16.1. Installation

The easiest way to access AOCL-DA is to use pre-built binaries from the packages available at the following URL:

https://www.amd.com/en/developer/aocl/data-analytics.html

You can also install the AOCL-DA binary from the AOCL master installer tar file available at the following URL:

https://www.amd.com/en/developer/aocl.html

To access the AOCL-DA Python APIs, install the Python wheel using the following command:

$ pip install <path to aocl-da wheel>/aoclda-*.whl

where * will vary depending on your platform.
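
After installing the wheel, a quick check along the following lines can confirm that the package is visible to your interpreter. This is a minimal sketch using only the Python standard library; the module name aoclda is taken from the import statements shown later in this chapter, and the helper name is_aoclda_installed is hypothetical.

```python
import importlib.util


def is_aoclda_installed(module_name="aoclda"):
    """Return True if the named top-level module can be found."""
    # find_spec returns None when a top-level module is not installed
    return importlib.util.find_spec(module_name) is not None


if is_aoclda_installed():
    print("aoclda is available in this environment")
else:
    print("aoclda was not found; check that the wheel was installed "
          "with the same interpreter")
```

If the check fails after a successful pip install, the wheel was most likely installed for a different Python interpreter than the one running the check.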

Note

The pre-built libraries are prepared on a specific platform and depend on particular versions of the operating system, other AOCL libraries, the compiler (GCC, Clang), Visual Studio, and glibc. Your platform must provide the same versions of these dependencies to use those libraries. In particular, the target CPU architecture requirements of AOCL-BLAS and AOCL-LAPACK must be met (see the AOCL-BLAS and AOCL-LAPACK chapters for more information). Additionally, Python support on Windows is currently experimental. A Fortran runtime library libifcoremd.lib is required, so you will need to install the Intel Fortran compiler and set the environment variable FORTRAN_RUNTIME to point to the directory containing the corresponding DLL. You may also need to install an OpenMP runtime and add it to your Windows environment.

16.1.1. Building from Source

To build from source, the following are required:

  • git

  • CMake, at least version 3.22

  • An installation of AOCL-Utils, AOCL-BLAS, AOCL-LAPACK and AOCL-Sparse

  • Supported C, C++ and Fortran compilers (the minimum required GCC version is 12.2; the minimum required AOCC compiler version is 4.1)

  • A Python interpreter, at least version 3.8

The following steps apply to both Windows and Linux builds.

  1. Clone the Git repository (amd/aocl-data-analytics.git).

  2. Set the environment variable AOCL_ROOT to point to your installation of the prerequisite AOCL libraries (note that CMake build options are available if your libraries are installed elsewhere).

  3. For Python builds, install the required packages using

    $ pip install -r <path to>/python_interface/requirements.txt
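
Before configuring the build, it can help to check the prerequisites listed above. The following is a minimal sketch using only the Python standard library; the function name check_prerequisites is hypothetical, and the sketch checks only that the tools and the AOCL_ROOT directory exist, not their versions.

```python
import os
import shutil
import sys


def check_prerequisites(env=None):
    """Return a list of human-readable problems; empty if none were found."""
    env = os.environ if env is None else env
    problems = []
    # git and cmake must be discoverable on PATH
    for tool in ("git", "cmake"):
        if shutil.which(tool, path=env.get("PATH")) is None:
            problems.append(f"{tool} was not found on PATH")
    # AOCL_ROOT should point to the prerequisite AOCL libraries
    aocl_root = env.get("AOCL_ROOT")
    if not aocl_root:
        problems.append("AOCL_ROOT is not set")
    elif not os.path.isdir(aocl_root):
        problems.append(f"AOCL_ROOT is not a directory: {aocl_root}")
    # The guide requires a Python interpreter of at least version 3.8
    if sys.version_info < (3, 8):
        problems.append("Python 3.8 or later is required")
    return problems


for problem in check_prerequisites():
    print("warning:", problem)
```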
    

16.1.1.1. Building from Source on Linux

On Linux, the GCC and AOCC compilers are supported. The following steps are required to build AOCL-DA from source.

  1. Run the CMake configure step with the options detailed in the CMake build options section below.

  2. Run the CMake build step (note that --target install is required in order to build the Python wheel).

  3. Optionally, navigate to the Python wheel and install it.
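
The configure and build steps above can be sketched as two CMake invocations. The helper below only assembles the command lines and prints them; it does not run them. The function name linux_build_commands and the directory names aocl-data-analytics and build are assumptions for illustration, while -S, -B, -D, --build and --target are standard CMake command-line options.

```python
import shlex


def linux_build_commands(source_dir, build_dir, options=None, install=True):
    """Return (configure, build) argument lists for a CMake build."""
    options = options or {}
    configure = ["cmake", "-S", source_dir, "-B", build_dir]
    # Each build option is passed to CMake as -D<NAME>=<VALUE>
    configure += [f"-D{name}={value}" for name, value in options.items()]
    build = ["cmake", "--build", build_dir]
    if install:
        # The install target is required in order to build the Python wheel
        build += ["--target", "install"]
    return configure, build


configure, build = linux_build_commands(
    "aocl-data-analytics",  # hypothetical path to the cloned repository
    "build",                # hypothetical build directory
    {"BUILD_SHARED_LIBS": "ON", "BUILD_PYTHON": "ON"},
)
print(shlex.join(configure))
print(shlex.join(build))
```

Returning argument lists rather than strings means the same values can be passed directly to subprocess.run without any shell quoting concerns.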

16.1.1.2. Building from Source on Windows

On Windows, the MSVC compiler is supported along with the Intel Fortran compiler. The following steps are required to build AOCL-DA from source.

  1. In an MSVC terminal, run "C:\Program Files (x86)\Intel\oneAPI\setvars.bat" so that the Intel Fortran compiler can be found.

  2. For Python builds, set the environment variable CMAKE_PREFIX_PATH to point to the folder site-packages\pybind11\share\cmake\pybind11 within your Python installation.

  3. Run the CMake configure step using one of the following commands together with any of the options in the table in the next section:

    • cmake .. -DCMAKE_Fortran_COMPILER=ifort to use default MSVC CMake options

    • cmake .. -T ClangCL -DCMAKE_Fortran_COMPILER=ifort to use the clang-cl compatibility layer

    • cmake -G Ninja -DCMAKE_C_COMPILER=clang-cl -DCMAKE_CXX_COMPILER=clang-cl -DCMAKE_Fortran_COMPILER=ifort ..

  4. Run the CMake build step using either

    • devenv .\AOCL-DA.sln /build "Debug" (or "Release") to use the MSVC build system

    • cmake --build . --target all (or --target install) when using another generator, e.g. Ninja

  5. Optionally, install the Python wheel.

16.1.1.3. CMake Build Options

Build Option

Feature

BUILD_ILP_64

  • ON: build with 64-bit integers

  • OFF (default): build with 32-bit integers

BUILD_SMP

  • ON (default): build with OpenMP and link to threaded BLAS and LAPACK libraries

  • OFF: build in serial and link to serial BLAS and LAPACK

BUILD_EXAMPLES

  • ON (default): compile the example programs

  • OFF: ignore the example programs

BUILD_GTEST

  • ON (default): compile the test programs

  • OFF: ignore the test programs

  • Note: to run the programs use the ctest command

BUILD_SHARED_LIBS

  • ON: shared library build

  • OFF (default): static library build

  • Note: for Python wheel, a shared library build is necessary

BUILD_PYTHON

  • ON: build the Python interfaces

  • OFF (default): do not build Python interfaces

  • Note: a shared library build is necessary for Python interfaces

CMAKE_AOCL_ROOT

  • Specify a location for the AOCL libraries, overriding the AOCL_ROOT environment variable

ARCH

  • Sets the -march (Linux) or /arch (Windows) flag to target a specific Zen generation (e.g. znver5)
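
The ON/OFF options and their defaults listed above can be captured in a small helper that also enforces the one dependency the table states: the Python interfaces need a shared library build. This is a minimal sketch, not part of AOCL-DA; the names DEFAULTS and resolve_options are hypothetical, and CMAKE_AOCL_ROOT and ARCH are left out because they take values other than ON or OFF.

```python
# Defaults as listed in the build options table above
DEFAULTS = {
    "BUILD_ILP_64": "OFF",
    "BUILD_SMP": "ON",
    "BUILD_EXAMPLES": "ON",
    "BUILD_GTEST": "ON",
    "BUILD_SHARED_LIBS": "OFF",
    "BUILD_PYTHON": "OFF",
}


def resolve_options(overrides=None):
    """Merge user overrides with the defaults and check known constraints."""
    options = dict(DEFAULTS)
    for name, value in (overrides or {}).items():
        if name not in DEFAULTS:
            raise KeyError(f"unknown ON/OFF build option: {name}")
        if value not in ("ON", "OFF"):
            raise ValueError(f"{name} must be ON or OFF, got {value!r}")
        options[name] = value
    # A shared library build is necessary for the Python interfaces
    if options["BUILD_PYTHON"] == "ON" and options["BUILD_SHARED_LIBS"] != "ON":
        raise ValueError("BUILD_PYTHON=ON requires BUILD_SHARED_LIBS=ON")
    return options


python_build = resolve_options({"BUILD_PYTHON": "ON", "BUILD_SHARED_LIBS": "ON"})
print(" ".join(f"-D{k}={v}" for k, v in python_build.items()))
```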

16.2. Usage

16.2.1. Calling the C or C++ APIs

Your library installation comes with several example programs demonstrating the use of the AOCL-DA C APIs. These are found in the examples folder within your AOCL-DA installation. For C++ users, an additional header file, aoclda_cpp_overloads.hpp, is available in the include folder of your installation, containing APIs which use templates and function overloading to remove the reference to explicit floating-point types.

To use AOCL-DA in your application, compile your code with a C or C++ compiler, including the AOCL-DA header files aoclda.h or aoclda_cpp_overloads.hpp (these are located in the include folder of your installation), then link the objects to the AOCL-DA library, along with its dependencies: aoclsparse, libflame, libblis, libaoclutils, a Fortran runtime and OpenMP.

The following subsections provide explicit commands to manually build and link programs calling AOCL-DA on the command line. In these commands, INT_LIB is either LP64 (32-bit integers) or ILP64 (64-bit integers). If ILP64 libraries are used, the macro AOCLDA_ILP64 must be defined in your compilation step.
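
The relationship between INT_LIB and the AOCLDA_ILP64 macro can be sketched as follows. The helper name integer_flags and the installation path are hypothetical; the include_<INT_LIB> folder layout follows the Linux commands in the next subsection, and -D is the usual GCC and Clang option for defining a macro.

```python
def integer_flags(aocl_da_root, ilp64=False):
    """Return (INT_LIB, compiler flags) for the chosen integer size."""
    int_lib = "ILP64" if ilp64 else "LP64"
    flags = ["-I", f"{aocl_da_root}/include_{int_lib}"]
    if ilp64:
        # 64-bit integer builds must define this macro when compiling
        flags.append("-DAOCLDA_ILP64")
    return int_lib, flags


# /opt/aocl-da is a hypothetical installation prefix
print(integer_flags("/opt/aocl-da"))
print(integer_flags("/opt/aocl-da", ilp64=True))
```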

If you prefer to build using CMake, you may refer to the CMakeLists.txt example in the examples folder of your installation.

16.2.1.1. Calling the Library on Linux

To compile and link to static AOCL libraries using g++, the following command can be used.

g++ <your_source_code>.cpp -I /<path to aocl-da headers>/include_<INT_LIB> \
    /<path to aocl-da>/lib_<INT_LIB>/libaocl-da.a \
    /<path to amd-sparse>/lib_<INT_LIB>/libaoclsparse.a \
    /<path to amd-libflame>/lib_<INT_LIB>/libflame.a \
    /<path to amd-blis>/lib_<INT_LIB>/libblis-mt.a \
    /<path to libaoclutils>/lib_<INT_LIB>/libaoclutils.a \
    -lgfortran -lgomp

To compile and link to static AOCL libraries using clang++, the following command can be used.

clang++ <your_source_code>.cpp -I /<path to aocl-da headers>/include_<INT_LIB> \
    /<path to aocl-da>/lib_<INT_LIB>/libaocl-da.a \
    /<path to amd-sparse>/lib_<INT_LIB>/libaoclsparse.a \
    /<path to amd-libflame>/lib_<INT_LIB>/libflame.a \
    /<path to amd-blis>/lib_<INT_LIB>/libblis-mt.a \
    /<path to libaoclutils>/lib_<INT_LIB>/libaoclutils.a \
    -lflang -lomp -lpgmath

To compile and link to dynamic AOCL libraries using g++, the following command can be used.

g++ <your_source_code>.cpp -I /<path to aocl-da headers>/include_<INT_LIB> \
    -L /<path to aocl-da>/lib_<INT_LIB> \
    -L /<path to amd-sparse>/lib_<INT_LIB> \
    -L /<path to amd-libflame>/lib_<INT_LIB> \
    -L /<path to amd-blis>/lib_<INT_LIB> \
    -L /<path to amd-utils>/lib \
    -laocl-da -laoclsparse -lflame -lblis-mt -laoclutils -lgfortran -lgomp

To compile and link to dynamic AOCL libraries using clang++, the following command can be used.

clang++ <your_source_code>.cpp -I /<path to aocl-da headers>/include_<INT_LIB> \
    -L /<path to aocl-da>/lib_<INT_LIB> \
    -L /<path to amd-sparse>/lib_<INT_LIB> \
    -L /<path to amd-libflame>/lib_<INT_LIB> \
    -L /<path to amd-blis>/lib_<INT_LIB> \
    -L /<path to amd-utils>/lib \
    -laocl-da -laoclsparse -lflame -lblis-mt -laoclutils -lflang -lomp -lpgmath

Note that for dynamic linking you will need to add the directories containing the shared libraries to your LD_LIBRARY_PATH environment variable so that they can be found at run time.
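
As an illustration of the LD_LIBRARY_PATH update, the sketch below builds a new value from a list of library directories, keeping existing entries and dropping duplicates. The function name extend_ld_library_path and the example directories are hypothetical; substitute the -L directories used in your own link command.

```python
import os


def extend_ld_library_path(lib_dirs, current=None):
    """Return an LD_LIBRARY_PATH value with lib_dirs placed first."""
    if current is None:
        current = os.environ.get("LD_LIBRARY_PATH", "")
    parts = list(lib_dirs) + [p for p in current.split(":") if p]
    # Drop duplicates while preserving the order of first appearance
    unique = []
    for part in parts:
        if part not in unique:
            unique.append(part)
    return ":".join(unique)


# Hypothetical installation directories, for illustration only
new_value = extend_ld_library_path(
    ["/opt/aocl/aocl-da/lib_LP64", "/opt/aocl/amd-blis/lib_LP64"],
    current="/usr/local/lib:/opt/aocl/aocl-da/lib_LP64",
)
print(f"export LD_LIBRARY_PATH={new_value}")
```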

If you wish to call AOCL-DA from C code, compile using your C compiler (e.g. gcc), but link separately using a C++ linker (e.g. g++).
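
The separate compile and link steps for C code might look like the following. This sketch only assembles the two command lines; the helper name c_build_commands, the source file main.c and the paths are hypothetical, and the runtime libraries are those of the g++ commands above.

```python
import shlex


def c_build_commands(source, include_dir, static_libs, output="a.out"):
    """Return (compile, link) argument lists for a C program."""
    obj = source.rsplit(".", 1)[0] + ".o"
    # Step 1: compile only (-c) with the C compiler
    compile_cmd = ["gcc", "-c", source, "-I", include_dir, "-o", obj]
    # Step 2: link the object file with the C++ driver
    link_cmd = ["g++", obj, *static_libs, "-lgfortran", "-lgomp", "-o", output]
    return compile_cmd, link_cmd


compile_cmd, link_cmd = c_build_commands(
    "main.c",
    "/opt/aocl/aocl-da/include_LP64",          # hypothetical path
    ["/opt/aocl/aocl-da/lib_LP64/libaocl-da.a"],  # add the other AOCL libraries here
    output="main",
)
print(shlex.join(compile_cmd))
print(shlex.join(link_cmd))
```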

16.2.1.2. Calling the Library on Windows

AOCL-DA requires Fortran runtime libraries for linking, so prior to compiling you will need to set up the ifort compiler environment using, e.g., "C:\Program Files (x86)\Intel\oneAPI\setvars.bat".

The following command can then be used to compile and link with the cl compiler.

cl <example_name>.cpp /I \<path to aocl-da headers>\include\<INT_LIB> /EHsc /MD ^
    \<path to>\aocl-da\lib\<INT_LIB>\aocl-da.lib ^
    \<path to>\amd-sparse\lib\<INT_LIB>\aoclsparse.lib ^
    \<path to>\amd-libflame\lib\<INT_LIB>\AOCL-LibFlame-Win-MT-dll.lib ^
    \<path to>\amd-blis\lib\<INT_LIB>\AOCL-LibBlis-Win-MT-dll.lib ^
    \<path to>\amd-utils\lib\libaoclutils.lib /openmp:llvm

The same command should work with cl replaced by clang-cl (in which case use /openmp instead of /openmp:llvm) and, when linking statically, with /MD replaced by /MT.
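
The flag substitutions described above can be summarized in a small helper. This is a sketch for illustration only; the function name windows_flags is hypothetical, and the flags themselves are the ones named in this subsection.

```python
def windows_flags(compiler="cl", static_runtime=False):
    """Return the compiler flags that vary between the described variants."""
    if compiler not in ("cl", "clang-cl"):
        raise ValueError(f"unsupported compiler: {compiler}")
    # cl uses /openmp:llvm, whereas clang-cl simply uses /openmp
    openmp = "/openmp:llvm" if compiler == "cl" else "/openmp"
    # /MD is used in the command above; /MT is used when linking statically
    runtime = "/MT" if static_runtime else "/MD"
    return ["/EHsc", runtime, openmp]


print(windows_flags("cl"))
print(windows_flags("clang-cl", static_runtime=True))
```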

16.2.2. Calling the Python APIs

The AOCL-DA Python package comes with several example scripts. To locate them, the following commands can be run in your Python interpreter.

from aoclda.examples import info

info.examples_path()
info.examples_list()

Alternatively, running python -m aoclda.examples.info from your command prompt will print the same information.

If you are an existing scikit-learn user, you can “patch” your Python code to use AOCL-DA where it is available by inserting the following lines prior to your scikit-learn import lines:

from aoclda.sklearn import skpatch, undo_skpatch

skpatch()
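
A patched workflow might then look like the sketch below. It rests on several assumptions: that skpatch() and undo_skpatch() can be called without arguments, that PCA is among the patched estimators, and that the patched estimator supports fit and transform in the same way as scikit-learn. The imports are guarded so that the script still runs, unpatched, when aoclda or scikit-learn is not installed.

```python
try:
    from aoclda.sklearn import skpatch, undo_skpatch
    HAVE_AOCLDA = True
except ImportError:
    HAVE_AOCLDA = False  # fall back to unpatched scikit-learn

if HAVE_AOCLDA:
    skpatch()  # must run before the scikit-learn imports below

try:
    import numpy as np
    from sklearn.decomposition import PCA
except ImportError:
    PCA = None  # scikit-learn is not installed in this environment

if PCA is not None:
    # Small illustrative data set: four samples, two features
    X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.5], [7.0, 8.0]])
    pca = PCA(n_components=1)
    pca.fit(X)
    print(pca.transform(X).shape)

if HAVE_AOCLDA:
    undo_skpatch()  # assumed to restore the original scikit-learn classes
```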

Note that Python support on Windows is currently experimental. A Fortran runtime library libifcore-mt.lib is required, so you will need to install the Intel Fortran compiler and set the environment variable FORTRAN_RUNTIME to point to the directory containing the corresponding DLL. You may also need to install an OpenMP runtime and add it to your Windows environment.
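
On Windows, a check such as the following can confirm that FORTRAN_RUNTIME is set and points to an existing directory before importing the package. This is a minimal sketch; the function name fortran_runtime_dir is hypothetical, and the sketch does not verify that the required DLL is actually present in that directory.

```python
import os


def fortran_runtime_dir(env=None):
    """Return the FORTRAN_RUNTIME directory, or None if unset or missing."""
    env = os.environ if env is None else env
    path = env.get("FORTRAN_RUNTIME")
    if path and os.path.isdir(path):
        return path
    return None


runtime_dir = fortran_runtime_dir()
if runtime_dir is None:
    print("FORTRAN_RUNTIME is not set or does not point to a directory")
else:
    print("Fortran runtime directory:", runtime_dir)
```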