For select functions, AOCL-LAPACK automatically dispatches to a suitable
code path based on the instruction set architecture (ISA) of the target
CPU. You can override this choice and select a different ISA code path by
setting the environment variable AOCL_ENABLE_INSTRUCTIONS. Valid values for
AOCL_ENABLE_INSTRUCTIONS are SSE2, AVX, AVX2, AVX512, and GENERIC. All
values are case-insensitive.
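As a minimal sketch of how the variable might be set from a launcher script, the Python helper below validates a value against the documented list and passes it to a child process. The helper names (env_with_isa, run_with_isa) and the application path in the usage note are hypothetical and are not part of AOCL-LAPACK. The sketch assumes the variable must be present in the environment of the process that loads the library.

```python
import os
import subprocess

# Values documented as valid for AOCL_ENABLE_INSTRUCTIONS (case-insensitive).
VALID_ISA_VALUES = {"SSE2", "AVX", "AVX2", "AVX512", "GENERIC"}


def env_with_isa(isa, base_env=None):
    """Return a copy of the environment with AOCL_ENABLE_INSTRUCTIONS set.

    Hypothetical helper: rejects values outside the documented list before
    the application is launched.
    """
    if isa.upper() not in VALID_ISA_VALUES:
        raise ValueError(f"unsupported AOCL_ENABLE_INSTRUCTIONS value: {isa!r}")
    env = dict(os.environ if base_env is None else base_env)
    env["AOCL_ENABLE_INSTRUCTIONS"] = isa
    return env


def run_with_isa(command, isa):
    """Run a command (for example, an AOCL-LAPACK application) with the override."""
    return subprocess.run(command, env=env_with_isa(isa), check=True)
```

For example, run_with_isa(["./my_lapack_app"], "avx2") would launch a hypothetical application named my_lapack_app with the AVX2 code path requested.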
When you set AOCL_ENABLE_INSTRUCTIONS to an ISA higher than the target CPU
supports, AOCL-LAPACK chooses the code path that is optimal for that CPU
among the ISAs it supports. If you choose a lower-level ISA, the code path
for that lower-level ISA is chosen.
Any ISA selection lower than AVX2 defaults to the generic reference code path.
Case 1: On an AVX2-only machine (example: AMD Zen1 / Zen2 / Zen3)
Setting AOCL_ENABLE_INSTRUCTIONS=AVX2 will take the AVX2 path.
Setting AOCL_ENABLE_INSTRUCTIONS=AVX512 will take the AVX2 path.
Setting AOCL_ENABLE_INSTRUCTIONS=GENERIC, SSE2, or AVX will take the reference path.
Case 2: On an AVX512 machine (example: Zen4 / Zen5)
Setting AOCL_ENABLE_INSTRUCTIONS=AVX512 will take the AVX512 path.
Setting AOCL_ENABLE_INSTRUCTIONS=AVX2 will take the AVX2 path.
Setting AOCL_ENABLE_INSTRUCTIONS=GENERIC, SSE2, or AVX will take the reference path.
Case 3: Setting AOCL_ENABLE_INSTRUCTIONS to any value other than
AVX512, AVX2, AVX, SSE2, or GENERIC will result in an error.
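The selection rules and the three cases can be summarized in a small Python model. This is only an illustrative sketch of the documented behavior, not the library's dispatch code. The function name resolve_code_path, the ISA_RANK ordering (with GENERIC placed lowest), and the returned labels are assumptions made for the example.

```python
# Assumed ordering of ISA levels, lowest to highest (GENERIC placed lowest).
ISA_RANK = {"GENERIC": 0, "SSE2": 1, "AVX": 2, "AVX2": 3, "AVX512": 4}


def resolve_code_path(requested, cpu_max_isa):
    """Model which code path is taken for a requested ISA on a given CPU.

    requested   -- value of AOCL_ENABLE_INSTRUCTIONS (case-insensitive)
    cpu_max_isa -- highest ISA the CPU supports, e.g. "AVX2" or "AVX512"
    """
    requested_key = requested.upper()
    cpu_key = cpu_max_isa.upper()
    if requested_key not in ISA_RANK:
        # Case 3: unknown values are reported as an error.
        raise ValueError(f"invalid AOCL_ENABLE_INSTRUCTIONS value: {requested!r}")
    if cpu_key not in ISA_RANK:
        raise ValueError(f"unknown CPU ISA level: {cpu_max_isa!r}")

    # A request above what the CPU supports is capped at the CPU's level.
    effective = min(requested_key, cpu_key, key=lambda name: ISA_RANK[name])

    # Any selection lower than AVX2 falls back to the reference path.
    if ISA_RANK[effective] < ISA_RANK["AVX2"]:
        return "reference"
    return effective.lower()
```

Under these assumptions, resolve_code_path("AVX512", "AVX2") models the second line of Case 1, and resolve_code_path("sse2", "AVX512") models the last line of Case 2.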
Note that enabling AMD flags for compilation sets the minimum ISA
requirement to AVX2 (see previous section). In such builds, the SSE2
and AVX options for AOCL_ENABLE_INSTRUCTIONS are not meaningful.
They are applicable if the library was built without the -mavx2 option.