AOCL-Sparse - 5.0 English

AOCL Performance Tuning Guide (63859)

Document ID
63859
Release Date
2024-10-10
Version
5.0 English

4. AOCL-Sparse

AOCL-Sparse provides multi-thread support for specific APIs through OpenMP by default. The total number of threads is controlled by the following environment variables:

  • AOCLSPARSE_NUM_THREADS (recommended)

  • OMP_NUM_THREADS

  • If both environment variables are set, AOCLSPARSE_NUM_THREADS takes precedence.

  • If neither variable is set, the default number of threads is equal to the number of processors identified by the OpenMP library.

  • The functions with multi-thread support include the SpMV variants, Sp2M, SpMM, SpAdd (as of version 5.0), and TRSM.
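The precedence rules above can be modeled with a short Python sketch. This is an illustration of the documented behavior only, not the library's implementation: the function names `resolve_num_threads` and `_parse_positive_int` are hypothetical, the `openmp_default` argument stands in for the processor count that the OpenMP library would report, and the treatment of unparsable values (skipping them) is an assumption not stated in this guide.

```python
import os


def _parse_positive_int(raw):
    """Return raw as a positive int, or None if it is unset or invalid."""
    try:
        value = int(raw)
    except (TypeError, ValueError):
        return None
    return value if value > 0 else None


def resolve_num_threads(env, openmp_default):
    """Model the documented thread-count precedence (hypothetical helper).

    1. AOCLSPARSE_NUM_THREADS, if set
    2. OMP_NUM_THREADS, if set
    3. otherwise the default reported by the OpenMP library

    Assumption: values that do not parse as a positive integer are
    treated as unset. The guide does not describe this case.
    """
    for name in ("AOCLSPARSE_NUM_THREADS", "OMP_NUM_THREADS"):
        value = _parse_positive_int(env.get(name))
        if value is not None:
            return value
    return openmp_default


if __name__ == "__main__":
    # os.cpu_count() is only a stand-in for the OpenMP-reported default.
    print(resolve_num_threads(os.environ, os.cpu_count() or 1))
```

For example, with AOCLSPARSE_NUM_THREADS=8 and OMP_NUM_THREADS=4 both set, the sketch returns 8. With only OMP_NUM_THREADS=4 set, it returns 4.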

Tuning and Optimizations:

  • AOCL-Sparse provides a hint-and-optimize framework that accelerates the supported functions (at present, SpMV) by analyzing the matrix in advance, guided by the user's hints.

  • AOCL-Sparse can be built with the AVX512 code path enabled for certain APIs. However, we recommend using the AVX2 code path (enabled by default) for best performance on Genoa.