8.7.4. Common Scenarios

AOCL User Guide (57404)

Document ID: 57404
Release Date: 2025-12-29
Version: 5.2 English

Q: What precision formats are supported?

A: AOCL-DLP supports float32, bfloat16, int8, uint8, and int32 formats with mixed-precision capabilities.

Q: How do I optimize performance for my specific workload?

A: Consider matrix reordering for reused matrices, choose appropriate precision formats, tune thread counts, and utilize post-operations to fuse computations.

Q: Can I use AOCL-DLP with other BLAS libraries?

A: Yes, AOCL-DLP can complement other BLAS libraries, particularly for specialized low-precision and fused operations.

Q: What AMD processors are supported?

A: AOCL-DLP is optimized for AMD processors with AVX2, AVX512, AVX512_VNNI, and AVX512_BF16 instruction sets.
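
As a quick way to see which of these instruction sets a given machine exposes, the following sketch inspects a CPU flags string. It assumes a Linux system and the flag spellings used in /proc/cpuinfo (avx512f is the AVX-512 foundation flag); the function name report_isa_flags is illustrative and not part of AOCL-DLP.

```shell
#!/bin/sh
# Report which of the instruction-set extensions named above appear in a
# space-separated CPU flags string (spellings as in Linux /proc/cpuinfo).
report_isa_flags() {
    flags=" $1 "
    for f in avx2 avx512f avx512_vnni avx512_bf16; do
        case "$flags" in
            *" $f "*) echo "$f: yes" ;;
            *)        echo "$f: no" ;;
        esac
    done
}

# On Linux, check the flags line of the first logical CPU.
if [ -r /proc/cpuinfo ]; then
    report_isa_flags "$(grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2)"
fi
```

A "no" for one of the AVX512 entries only means that the corresponding extension is not reported by the kernel on this machine.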

Performance Issues:

Poor performance despite setting threading variables:

  • Verify thread affinity with OMP_PROC_BIND=true and OMP_PLACES=cores

  • Check for thread over-subscription: the number of threads should not exceed the number of physical cores

  • Monitor CPU utilization to ensure threads are active
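
The affinity and over-subscription checks above can be sketched as follows. This is a minimal example that assumes a Linux system with lscpu from util-linux; the helper names are illustrative, and falling back from DLP_NUM_THREADS to OMP_NUM_THREADS is an assumption made for this sketch, not documented library behavior.

```shell
#!/bin/sh
# Pin threads as suggested above (standard OpenMP environment variables).
export OMP_PROC_BIND=true
export OMP_PLACES=cores

# Count physical cores from `lscpu -p=Core,Socket` output on stdin:
# each unique "core,socket" pair is one physical core; '#' lines are comments.
count_physical_cores() {
    grep -v '^#' | sort -u | wc -l | tr -d ' '
}

# Succeed when the thread count ($1) does not exceed the core count ($2).
threads_fit() {
    [ "$1" -le "$2" ]
}

if command -v lscpu >/dev/null 2>&1; then
    cores=$(lscpu -p=Core,Socket | count_physical_cores)
    # Assumption: use DLP_NUM_THREADS if set, else OMP_NUM_THREADS, else cores.
    threads=${DLP_NUM_THREADS:-${OMP_NUM_THREADS:-$cores}}
    if threads_fit "$threads" "$cores"; then
        echo "OK: $threads threads on $cores physical cores"
    else
        echo "Over-subscribed: $threads threads on $cores physical cores"
    fi
fi
```

While the application runs, a per-thread view such as top -H -p <pid> shows whether all threads are actually busy.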

Inconsistent performance across runs:

  • Set OMP_WAIT_POLICY=active to keep threads spinning

  • Ensure consistent thread affinity with OMP_PROC_BIND=close

  • Disable CPU frequency scaling during benchmarks
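
A benchmark shell could be prepared along these lines. The two OpenMP variables come from the list above; reading the cpufreq governor from sysfs is a Linux-specific assumption, and the helper name governors is illustrative.

```shell
#!/bin/sh
# Keep worker threads spinning and bound close to the primary thread.
export OMP_WAIT_POLICY=active
export OMP_PROC_BIND=close

# Print each distinct cpufreq governor found under a sysfs CPU directory.
# Prints nothing when the system exposes no cpufreq interface.
governors() {
    cat "$1"/cpu*/cpufreq/scaling_governor 2>/dev/null | sort -u
}

governors /sys/devices/system/cpu
```

If the output is anything other than a single governor such as performance, clock frequency may vary between runs. On many Linux systems, sudo cpupower frequency-set -g performance is one way to fix the governor for the duration of a benchmark; whether that tool is installed is system-dependent.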

Library not respecting DLP_NUM_THREADS:

  • Ensure there are no conflicting DLP_IC_NT/DLP_JC_NT settings

  • Check that OpenMP is enabled in the library build

  • Verify the variable is set in the correct shell environment
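
The first and last checks above can be sketched with printenv, which only sees exported variables, so a value that was set but never exported is reported as missing. The function name check_dlp_threads is illustrative; only the three variable names come from this guide.

```shell
#!/bin/sh
# Report whether DLP_NUM_THREADS is exported and whether DLP_IC_NT or
# DLP_JC_NT is exported alongside it.
check_dlp_threads() {
    if ! n=$(printenv DLP_NUM_THREADS); then
        echo "DLP_NUM_THREADS is not exported"
    elif printenv DLP_IC_NT >/dev/null || printenv DLP_JC_NT >/dev/null; then
        echo "possible conflict: DLP_IC_NT or DLP_JC_NT is also exported"
    else
        echo "DLP_NUM_THREADS=$n"
    fi
}

check_dlp_threads
```

Run the check in the same shell (or job script) that launches the application, since a variable exported in a different terminal or login session is not inherited.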

Integration Issues:

Poor performance with static library:

  • Ensure the --whole-archive flag is used during linking

  • Verify JIT kernels are registered: nm my_app | grep -i "jit.*register"

  • Run with logging enabled: AOCL_ENABLE_LPGEMM_LOGGER=1 ./my_app
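
With GNU ld, a static archive normally contributes only the object files that resolve an undefined symbol, so objects that exist purely to register kernels at start-up can be dropped; this is the usual reason for the --whole-archive advice. The sketch below prints a link line of that shape. The archive path and the file names my_app and main.o are placeholders, and the helper name is illustrative.

```shell
#!/bin/sh
# Print a GNU-toolchain link line that wraps one static archive in
# --whole-archive so that none of its object files are discarded.
#   $1 = output binary, $2 = object file, $3 = static archive (placeholder)
whole_archive_link_cmd() {
    echo "g++ -o $1 $2 -Wl,--whole-archive $3 -Wl,--no-whole-archive"
}

whole_archive_link_cmd my_app main.o /path/to/libaocl-dlp.a
```

Closing the region with --no-whole-archive keeps the flag from applying to libraries that follow on the command line. Additional flags (for example for OpenMP) may be needed depending on how the library was built.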

find_package(AoclDlp) fails (package not found):

  • Specify installation directory: cmake -DCMAKE_PREFIX_PATH=/path/to/aocl-dlp/install ..

  • Check CMake config files exist: ls /usr/local/lib/cmake/AoclDlp/
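
When the install prefix is not /usr/local, the config files can be located with a search like the one below. In config mode, CMake looks for <PackageName>Config.cmake or <lowercase-name>-config.cmake; which of the two spellings this package installs is not stated here, so the sketch matches both case-insensitively. The helper name is illustrative.

```shell
#!/bin/sh
# List CMake package config files for a package under an install prefix.
#   $1 = install prefix, $2 = package name as passed to find_package()
find_cmake_config() {
    find "$1" -type f \
        \( -iname "${2}Config.cmake" -o -iname "${2}-config.cmake" \) \
        2>/dev/null
}

# Example: find_cmake_config /path/to/aocl-dlp/install AoclDlp
```

If the search prints a path, pass the install prefix that contains it through -DCMAKE_PREFIX_PATH as shown above.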

Undefined reference errors:

  • Ensure library is properly linked: target_link_libraries(my_app PRIVATE AoclDlp::aocl-dlp)

  • For static linking, add C++ standard library: -lstdc++

Runtime library not found:

  • Add to LD_LIBRARY_PATH: export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH

  • Or use static linking to avoid runtime dependency
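
To see which shared libraries the loader cannot resolve before and after changing LD_LIBRARY_PATH, the output of ldd can be filtered as in this sketch. The binary name my_app follows the examples above; the helper name is illustrative.

```shell
#!/bin/sh
# Reduce `ldd` output (read from stdin) to the names of libraries that
# the dynamic loader reports as "not found".
missing_libs() {
    grep 'not found' | awk '{print $1}'
}

# Example: ldd ./my_app | missing_libs
```

An empty result means every dependency was resolved with the current library search path.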

For comprehensive troubleshooting information and advanced tuning strategies, refer to the Environment Configuration Guide at amd/aocl-dlp.