17.4. Performance Tuning

AOCL User Guide (57404)

Document ID: 57404
Release Date: 2025-12-29
Version: 5.2 English

By default, AOCL-Sparse provides multi-thread support for specific APIs through OpenMP. The total number of threads is set with one of the following environment variables:

  • AOCLSPARSE_NUM_THREADS (recommended)

  • OMP_NUM_THREADS

Note the following:

  • If both environment variables are set, AOCLSPARSE_NUM_THREADS takes precedence.

  • If neither variable is set, the number of threads defaults to the number of processors identified by the OpenMP library.

  • The functions with multi-thread support include the SpMV variants, Sp2M, SpMM, SpAdd, CSRMM, and TRSM.
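The precedence between the two environment variables can be illustrated with a minimal Python sketch. It is not AOCL-Sparse source code: the function name resolve_num_threads and its parameters are hypothetical, os.cpu_count() stands in for the processor count that the OpenMP library would identify, and ignoring any value that is not a positive integer is an assumption of this sketch rather than documented library behavior.

```python
import os


def resolve_num_threads(env=None, hw_threads=None):
    """Return the thread count implied by the documented precedence.

    Hypothetical helper for illustration only; not part of AOCL-Sparse.
    """
    env = os.environ if env is None else env
    # Assumption: os.cpu_count() approximates the number of processors
    # that the OpenMP library would identify.
    if hw_threads is None:
        hw_threads = os.cpu_count() or 1

    # AOCLSPARSE_NUM_THREADS is checked first, so it wins when both are set.
    for name in ("AOCLSPARSE_NUM_THREADS", "OMP_NUM_THREADS"):
        value = env.get(name, "").strip()
        # Assumption: values that are not positive integers are skipped.
        if value.isdigit() and int(value) > 0:
            return int(value)

    # Neither variable is usable: fall back to the processor count.
    return hw_threads


if __name__ == "__main__":
    both = {"AOCLSPARSE_NUM_THREADS": "8", "OMP_NUM_THREADS": "16"}
    print(resolve_num_threads(both, hw_threads=64))                       # 8
    print(resolve_num_threads({"OMP_NUM_THREADS": "16"}, hw_threads=64))  # 16
    print(resolve_num_threads({}, hw_threads=64))                         # 64
```

In practice the variables are typically exported in the shell before the application starts, so that they are visible when the library first reads them.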

The following tuning and optimization options are available to help improve performance for supported functions:

  • AOCL-Sparse provides a hint-and-optimize framework that accelerates the supported functions (at present, SpMV and TRSV) by analyzing the matrix in advance, based on hints supplied by the user.

  • AOCL-Sparse can be built with either the AVX2 or the AVX512 code path enabled for various APIs. We recommend the AVX512 code path, which is enabled by default, for the best performance on Turin.