On Non-AMD “Zen” Architectures

AOCL User Guide (57404)

Document ID: 57404
Release Date: 2025-12-29
Version: 5.2 English

The Dynamic Dispatch feature supports the AMD “Zen”, AMD “Zen2”, AMD “Zen3”, AMD “Zen4”, and AMD “Zen5” architectures in a single binary. It also includes a generic architecture to support older x86-64 processors. The generic architecture uses a pure C implementation of the APIs and does not use any architecture-specific features.

The compiler flags used to build the library with the generic configuration are:

-O2 -funsafe-math-optimizations -ffp-contract=fast -Wall \
-Wno-unused-function -Wfatal-errors

Note

Because no architecture-specific optimizations or vectorized kernels are enabled, performance with the generic architecture may be significantly lower than with the architecture-specific implementations.

Previous AOCL-BLAS releases identified the processor from its Family, Model, and other cpuid features, and selected the appropriate code path from a set of preprogrammed choices. With Dynamic Dispatch, an unknown processor would fall through to the slow generic code path, although users could override this by setting the environment variable BLIS_ARCH_TYPE to a suitable value.
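As an illustration of the override, the following minimal C sketch sets BLIS_ARCH_TYPE from within a program, leaving any value the user has already exported untouched. The helper name `request_code_path` is hypothetical and not part of AOCL-BLAS. The sketch assumes a POSIX system (for `setenv`) and assumes that the variable must be set before the library is first used; check the AOCL-BLAS documentation for the accepted values and the exact point at which the variable is read.

```c
#define _POSIX_C_SOURCE 200112L  /* expose setenv() under strict ISO C modes */
#include <stdlib.h>

/*
 * Hypothetical helper (not an AOCL-BLAS API): request a code path by
 * setting BLIS_ARCH_TYPE. The third argument to setenv() is 0, so a
 * value already present in the environment is NOT overwritten.
 * Returns 0 on success and -1 on failure, as setenv() does.
 */
static int request_code_path(const char *arch_name)
{
    return setenv("BLIS_ARCH_TYPE", arch_name, 0);
}
```

A program might call, for example, `request_code_path("zen3")` at the top of `main`, before any BLAS routine runs. The string `"zen3"` is an assumed example value here. Setting the variable in the shell before launching the application achieves the same effect without code changes.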

From AOCL-BLAS 4.2 onward, additional cpuid tests based on AVX2 and AVX512 instruction support allow the AMD “Zen3”, AMD “Zen4”, or AMD “Zen5” code paths to be selected by default on suitable x86-64 processors (that is, future AMD processors and current or future Intel processors). These AMD “Zen” code paths are not (re-)optimized specifically for those other architectures but should perform better than the slow generic code path.

To be more specific:

  • AVX2 support requires AVX2 and FMA3.

  • AVX512 support requires AVX512 F, DQ, CD, BW, and VL.
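The two feature requirements above can be sketched in C as follows. This is an illustration only, not the AOCL-BLAS implementation: the `cpu_features` struct, the `isa_tier` enum, and both function names are hypothetical, and the sketch deliberately stops at "AVX2-capable" versus "AVX512-capable" rather than guessing which AMD “Zen” code path the library would pick. Detection relies on the GCC/Clang builtins `__builtin_cpu_init` and `__builtin_cpu_supports`, so it assumes one of those compilers. On non-x86 targets every feature is reported as absent.

```c
#include <stdbool.h>

/* Hypothetical feature record; the field names are illustrative only. */
typedef struct {
    bool avx2, fma3;
    bool avx512f, avx512dq, avx512cd, avx512bw, avx512vl;
} cpu_features;

/* Hypothetical tiers; these are not AOCL-BLAS identifiers. */
typedef enum { TIER_GENERIC, TIER_AVX2, TIER_AVX512 } isa_tier;

/* Query the running CPU with GCC/Clang builtins (x86 targets only). */
static cpu_features detect_features(void)
{
    cpu_features f = {0};
#if defined(__x86_64__) || defined(__i386__)
    __builtin_cpu_init();
    f.avx2     = __builtin_cpu_supports("avx2");
    f.fma3     = __builtin_cpu_supports("fma");
    f.avx512f  = __builtin_cpu_supports("avx512f");
    f.avx512dq = __builtin_cpu_supports("avx512dq");
    f.avx512cd = __builtin_cpu_supports("avx512cd");
    f.avx512bw = __builtin_cpu_supports("avx512bw");
    f.avx512vl = __builtin_cpu_supports("avx512vl");
#endif
    return f;
}

/* Apply the rules listed above: AVX512 needs F, DQ, CD, BW, and VL;
 * AVX2 needs AVX2 and FMA3; anything else falls back to generic. */
static isa_tier classify(const cpu_features *f)
{
    if (f->avx512f && f->avx512dq && f->avx512cd &&
        f->avx512bw && f->avx512vl)
        return TIER_AVX512;
    if (f->avx2 && f->fma3)
        return TIER_AVX2;
    return TIER_GENERIC;
}
```

A caller would typically run `cpu_features f = detect_features();` once at startup and then branch on `classify(&f)`. Keeping detection separate from classification means the rules can be exercised with hand-built feature sets, independent of the machine the code runs on.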