8.5.1. Dynamic Dispatch

AOCL User Guide (57404)

Document ID: 57404
Release Date: 2025-12-29
Version: 5.2 English

AOCL-DLP automatically selects the best kernel for your CPU based on available instruction sets. However, you can override this behavior for testing or specific optimization scenarios.

Architecture Control:

The AOCL_ENABLE_INSTRUCTIONS environment variable forces a specific instruction set or architecture target, overriding auto-detection:

# Force AVX-512 instructions on Zen 4 processors
export AOCL_ENABLE_INSTRUCTIONS=avx512
./your_application

# Use Zen 3-optimized kernels
export AOCL_ENABLE_INSTRUCTIONS=zen3
./your_application
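
An exported variable stays set for every later command in the same shell. To limit an override to one run, the variable can be set for a single process instead. The following is a minimal sketch assuming a POSIX shell; run_with_isa is a hypothetical helper name and is not part of AOCL.

```shell
# Hypothetical helper: run a command with AOCL_ENABLE_INSTRUCTIONS set
# only for that command. The calling shell's environment is not modified.
# Usage: run_with_isa <value> <command> [args...]
run_with_isa() {
    _isa=$1
    shift
    # env applies the assignment to the child process only.
    env "AOCL_ENABLE_INSTRUCTIONS=$_isa" "$@"
}
```

For example, `run_with_isa avx2 ./your_application` runs the application once with the override, and a following plain `./your_application` falls back to auto-detection.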

Supported Values:

  • zen5: Zen 5 architecture optimizations

  • zen4: Zen 4 architecture optimizations

  • zen3: Zen 3 architecture optimizations

  • zen2: Zen 2 architecture optimizations

  • avx512: AVX-512 instruction set

  • avx2: AVX2 instruction set
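
Because a misspelled value is easy to overlook, a wrapper script can check the value against the list above before launching the application. This is a minimal sketch under stated assumptions: is_supported_isa and cpu_has_isa are hypothetical helpers, the /proc/cpuinfo check is Linux-specific, and how AOCL-DLP itself handles an unknown or unsupported value is not covered here.

```shell
# Hypothetical helper: succeed only if $1 is one of the values listed above
# (lowercase, spelled exactly as in the list).
is_supported_isa() {
    case "$1" in
        zen5|zen4|zen3|zen2|avx512|avx2) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper (Linux only): check that the CPU reports the feature
# flag for an instruction-set value. Architecture names such as zen4 are
# not verified here and are accepted as-is.
cpu_has_isa() {
    case "$1" in
        avx512) grep -qw avx512f /proc/cpuinfo ;;
        avx2)   grep -qw avx2 /proc/cpuinfo ;;
        *)      return 0 ;;
    esac
}

# Example: validate a candidate value before using it.
isa=avx2
if is_supported_isa "$isa"; then
    echo "ok: $isa"
else
    echo "unknown value: $isa" >&2
fi
```

A script could call cpu_has_isa after is_supported_isa and skip the override when the CPU does not report the matching flag.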

Optimization Strategies:

  1. Choose Appropriate Data Types: Use lower precision (bf16, int8) when accuracy permits

  2. Enable Matrix Reordering: Reorder frequently used matrices for better cache performance

  3. Utilize Post-Operations: Fuse operations to reduce memory bandwidth

  4. Minimize Operations: Reorder matrices once, ahead of time, so that AOCL-DLP has less work to do on each call

  5. Align Memory: Ensure proper memory alignment for vector instructions