The AOCL Dynamic feature enables AOCL-BLAS to change the number of threads dynamically. Based on the input parameters (such as problem size, transpose setting, and storage format), AOCL-BLAS executes the optimal code path for that number of threads. This path can be single-threaded even if the number of threads set is greater than 1.
This feature is enabled by default. It can be enabled or disabled
at configuration time using the configure options
--enable-aocl-dynamic and --disable-aocl-dynamic,
or the CMake options -DENABLE_AOCL_DYNAMIC=ON and -DENABLE_AOCL_DYNAMIC=OFF, respectively.
You can also specify the preferred number of threads using the
environment variables BLIS_NUM_THREADS or OMP_NUM_THREADS.
If both are specified, BLIS_NUM_THREADS takes precedence.
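
The precedence between the two environment variables can be illustrated with a short Python sketch. It is not AOCL-BLAS code: the function name preferred_thread_count is hypothetical, and the parsing (a plain integer value) is an assumption made only for illustration. It models only the rule stated above, that BLIS_NUM_THREADS wins when both variables are set.

```python
import os
from typing import Mapping, Optional


def preferred_thread_count(env: Mapping[str, str] = os.environ) -> Optional[int]:
    """Return the preferred thread count, or None if neither variable is set.

    Hypothetical helper: it models only the documented precedence
    (BLIS_NUM_THREADS over OMP_NUM_THREADS), not the library's own parsing.
    """
    # Check the variables in precedence order; the first one found wins.
    for name in ("BLIS_NUM_THREADS", "OMP_NUM_THREADS"):
        value = env.get(name)
        if value is not None:
            # Assumption: the value is a plain integer such as "8".
            return int(value)
    return None


if __name__ == "__main__":
    both = {"BLIS_NUM_THREADS": "4", "OMP_NUM_THREADS": "16"}
    print(preferred_thread_count(both))                       # BLIS_NUM_THREADS wins
    print(preferred_thread_count({"OMP_NUM_THREADS": "16"}))  # falls back to OMP_NUM_THREADS
    print(preferred_thread_count({}))                         # neither variable is set
```

Passing the environment as a mapping keeps the sketch easy to try with different settings without modifying the real process environment.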
The following table summarizes how the number of threads is
determined based on the status of AOCL Dynamic and the user
configuration using the variable BLIS_NUM_THREADS:
| AOCL Dynamic | BLIS_NUM_THREADS | Number of Threads Used by AOCL-BLAS |
|---|---|---|
| Disabled | Unset | Number of logical cores. |
| Disabled | Set | |
| Enabled | Unset | Number of threads determined by AOCL Dynamic. |
| Enabled | Set | Minimum of |
The AOCL Dynamic feature has the following limitations:
Supported only for threading using OpenMP.
Supports only the DGEMM, ZGEMM, CGEMM, DTRSM, ZTRSM, DGEMMT, ZGEMMT, DSYRK, DTRMM, DGEMV, SGEMV, DSCAL, ZDSCAL, DDOT, DNRM2, DZNRM2, DAXPY, and DCOPY APIs.
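
To check quickly whether a given routine is covered by AOCL Dynamic, the list above can be captured in a small lookup. This is a minimal Python sketch, not part of AOCL-BLAS: the names AOCL_DYNAMIC_ROUTINES and is_covered_by_aocl_dynamic are hypothetical, and the set simply mirrors the routines listed in this section.

```python
# Routines listed in this section as supported by AOCL Dynamic.
# The constant and function names are hypothetical, chosen for illustration.
AOCL_DYNAMIC_ROUTINES = frozenset({
    "DGEMM", "ZGEMM", "CGEMM",
    "DTRSM", "ZTRSM",
    "DGEMMT", "ZGEMMT",
    "DSYRK", "DTRMM",
    "DGEMV", "SGEMV",
    "DSCAL", "ZDSCAL",
    "DDOT", "DNRM2", "DZNRM2",
    "DAXPY", "DCOPY",
})


def is_covered_by_aocl_dynamic(routine: str) -> bool:
    """Return True if the routine name appears in the list above.

    The comparison ignores case and surrounding whitespace, so "dgemm"
    and " DGEMM " are treated the same way.
    """
    return routine.strip().upper() in AOCL_DYNAMIC_ROUTINES


if __name__ == "__main__":
    for name in ("dgemm", "SGEMM", "zdscal"):
        print(name, is_covered_by_aocl_dynamic(name))
```

A routine that is not in the set, such as SGEMM, is reported as not covered according to the list in this section.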