AMD EPYC 9xx5-Series Processors Compiler Options Quick Reference - 63857

AOCC Quick Reference Guide

Document ID
63857
Release Date
2024-10-10
Revision
5.0 English

AOCC compiler (C/C++/Fortran)

Latest release: 5.0, October 2024

https://www.amd.com/en/developer/aocc.html

Action Command
Architecture
Generate instructions that run on AMD 5th Gen EPYC™ and AMD 5th Gen Ryzen™
-march=znver5
Generate instructions supported by the machine doing the compilation
-march=native
Optimization Levels
Disables all optimizations
-O0
Enables minimal level optimizations
-O1
Enables moderate level optimizations (Default from AOCC 4.1)
-O2/-O
Enables all optimizations that attempt to make programs run faster
-O3
Enables -O3 plus other aggressive optimizations that may violate strict standards compliance and reduce floating-point precision
-Ofast
Enables link time optimization
-flto
Enables advanced optimizations - improved variants of various scalar, vector, and loop transformations
-zopt
Enables advanced vector transformations
-fvector-transform
-mllvm -enable-strided-vectorization
Enables loop transformations
-floop-transform
Enables advanced loop transformations
-faggressive-loop-transform
Enables memory layout optimizations
-flto -fremap-arrays
-mllvm -reduce-array-computations=3
Enables function level optimizations
-flto -fitodcalls 
-mllvm -function-specialize
-flto -finline-recursion={1..4}
Profile guided optimizations
-fprofile-instr-generate (1st invocation)
-fprofile-instr-use (2nd invocation)
Enables use of OpenMP® directives
-fopenmp
Enables streaming stores to optimize memory bandwidth usage
-fnt-store
Other Options
Enables faster, less precise math operations (part of Ofast)
-ffast-math
-freciprocal-math
OpenMP® threads and affinity (N = number of cores)
export OMP_NUM_THREADS=N
export GOMP_CPU_AFFINITY="0-{N-1}"
Link to the AMD math library (AMD LibM)
-L/libm-install-dir/lib -lamdlibm -lm
Enables vector library
-fveclib=AMDLIBM -lamdlibm -lm
Enables faster library
-Ofast -ffastlib=AMDLIBM -lamdlibmfast -lamdlibm -lm
For Fortran Workloads
Compiles Fortran free form layout
-ffree-form
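
The two profile-guided optimization invocations listed above bracket a training run and a profile-merge step. The following is a minimal sketch of that sequence, not an official AMD recipe. The names `app.c`, `app`, and `app.profdata` and the training command `./app` are placeholders, and it assumes the `llvm-profdata` tool from the same toolchain is on `PATH`. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch of a two-pass PGO build with AOCC (clang driver).
# Assumptions, not from the guide: app.c, app, app.profdata, the training
# run "./app", and llvm-profdata being available on PATH.
CC=clang
OPT_FLAGS="-O3 -march=znver5"

# DRY_RUN=1 only prints each command; set DRY_RUN=0 to execute them.
DRY_RUN=${DRY_RUN:-1}
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

# 1st invocation: build an instrumented binary.
run $CC $OPT_FLAGS -fprofile-instr-generate app.c -o app
# Training run: the instrumented binary writes default.profraw on exit.
run ./app
# Merge the raw profile into the indexed format the compiler reads.
run llvm-profdata merge -output=app.profdata default.profraw
# 2nd invocation: rebuild, letting the profile guide optimization.
run $CC $OPT_FLAGS -fprofile-instr-use=app.profdata app.c -o app
```

The training input should resemble production workloads, since the second build optimizes for the paths exercised during that run.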

AMD Optimized Libraries

Latest release: 5.0, October 2024

https://www.amd.com/en/developer/aocl.html

AMD uProf (Performance & Power Profiler)

Latest release: 5.0, October 2024

https://www.amd.com/en/developer/uprof.html

GNU Compiler Collection

Latest release: GCC 14.2, July 2024

Recommended version: GCC 14.1 or later

http://gcc.gnu.org

Action Command
Architecture
Generate instructions that run on AMD 5th Gen EPYC™ and AMD 5th Gen Ryzen™
-march=znver5
Generate instructions supported by the machine doing the compilation
-march=native
Optimization Levels
Disables all optimizations (default)
-O0
Enables minimal level optimizations
-O1/-O
Enables moderate level optimizations
-O2
Enables all optimizations that attempt to make programs run faster
-O3
Enables -O3 plus other aggressive optimizations that may violate strict standards compliance and reduce floating-point precision
-Ofast
Additional Optimizations
Enables link time optimizations
-flto
Enables unrolling
-funroll-all-loops
Generates memory preload instructions
-fprefetch-loop-arrays
Enables profile-guided optimizations
-fprofile-generate (1st invocation)
-fprofile-use (2nd invocation)
Enables use of OpenMP® directives
-fopenmp
Other Options
Enables compiler to use IEEE FP comparisons
-mieee-fp
Enables faster, less precise math operations
-ffast-math
Compiles Fortran free form layout
-ffree-form
OpenMP® threads and affinity (N = number of cores)
export OMP_NUM_THREADS=N
export GOMP_CPU_AFFINITY="0-{N-1}"
Link to the AMD math library (AMD LibM)
-L/libm-install-dir/lib -lamdlibm -lm
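
The `0-{N-1}` affinity notation above is a template, not literal shell syntax. The sketch below expands it for one concrete core count and assembles a matching build line. `N=8`, `app.c`, and `app` are illustrative assumptions. `/libm-install-dir` is the guide's own placeholder and must be replaced with the real AMD LibM install prefix. The script prints the build line instead of running it, so it works on machines without GCC 14 or AMD LibM.

```shell
#!/bin/sh
# Sketch: expand the guide's "0-{N-1}" affinity template for a concrete N.
# Assumptions, not from the guide: N=8 and the file names app.c / app.
N=8
export OMP_NUM_THREADS="$N"
export GOMP_CPU_AFFINITY="0-$((N - 1))"   # N=8 gives "0-7"

LIBM_DIR=/libm-install-dir                # placeholder path from the guide
CFLAGS="-O3 -march=znver5 -fopenmp"
LDFLAGS="-L$LIBM_DIR/lib -lamdlibm -lm"

# Print the build line and the runtime settings rather than executing.
echo "gcc $CFLAGS app.c -o app $LDFLAGS"
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS GOMP_CPU_AFFINITY=$GOMP_CPU_AFFINITY"
```

Pinning threads 0 through N-1 one per core is only one possible layout. On multi-socket or SMT-enabled systems a different core list may be preferable.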

Microsoft® Visual Studio 2022

Latest release: 17.11.4, September 2024

https://visualstudio.microsoft.com/

Action Command
Architecture  
Generate instructions that run on AMD 5th Gen EPYC™ and AMD 5th Gen Ryzen™
/arch:[AVX|AVX2|AVX512]
Optimize for 64-bit AMD processors
/favor:AMD64
Optimization Levels
Disable optimizations
/Od
Maximum optimizations (favor space)
/O1 (includes /Ob2)
Maximum optimizations (favor speed)
/O2 (includes /Ob2)
Enables inline expansion
/Ob[0|1|2|3]
[link.exe] Eliminates unreferenced function and/or data
/OPT:REF
[link.exe] Performs identical COMDAT folding
/OPT:ICF
Output an informational message for loops that are auto-vectorized
/Qvec-report:[1|2]
Enables automatic parallelization of loops, used with #pragma loop() directive
/Qpar
Output an informational message for loops that are auto-parallelized
/Qpar-report:[1|2]
Additional Optimizations
Maintain the precision for floating-point operations through proper rounding
/fp:precise
Optimize floating-point code for speed at the expense of floating-point accuracy and correctness
/fp:fast
Whole Program Optimization (link-time code generation)
/GL
Enables Profile-guided optimizations
/LTCG /GENPROFILE (1st invocation)
/LTCG /USEPROFILE (2nd invocation)
Enables OpenMP® Support
/openmp:experimental
/openmp:llvm

GlibC

Latest release: 2.40, July 2024

Recommendation: 2.38 or later

https://www.gnu.org/software/libc/

Binutils

Latest release: 2.43, August 2024

Recommendation: 2.42 or later

https://www.gnu.org/software/binutils/

Intel® oneAPI DPC++/C++ Compiler

Latest release: 2024.2.1

http://software.intel.com

Action Command
Architecture
Generate instructions that run on AMD 5th Gen EPYC™ and AMD 5th Gen Ryzen™
-axCORE-AVX512
Optimization Levels
Disable all optimizations
-O0
Speed optimization without code growth
-O1
Enables optimization for speed including vectorization
-O2
Enables O2 and aggressive loop transformations
-O3
Enables a set of aggressive options to improve speed
-Ofast
Additional Optimizations
Sets function inline level
-inline-level=<value>
Sets maximum number of times to unroll loops
-unroll[=n]
Disable improved precision floating divides
-no-prec-div
Enables vectorization
-vec
Enables interprocedural optimizations (alias for -flto)
-ipo
Enables whole program link time optimization (LTO)
-flto[=arg], where arg is full (default) or thin
Enables use of OpenMP® directives
-qopenmp
Enables profile-guided optimization
-prof-gen and -prof-use
Other Options
Enables floating point accuracy tunings
-fp-model
Compiles Fortran free form layout
-free
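
The free-form Fortran flag differs between the toolchains in this guide: `-ffree-form` for AOCC and GCC, `-free` for Intel. A build script that supports several compilers can select the flag by driver name, as in the sketch below. The driver names `flang`, `gfortran`, `ifx`, and `ifort`, the helper `free_form_flag`, and the source file `solver.f90` are assumptions introduced for illustration and do not come from the guide.

```shell
#!/bin/sh
# Sketch: map a Fortran driver name to its free-form source flag.
# The driver names and the helper name are illustrative assumptions.
free_form_flag() {
    case "$1" in
        flang|gfortran) echo "-ffree-form" ;;   # AOCC and GCC tables
        ifx|ifort)      echo "-free" ;;         # Intel table
        *)              echo "unknown Fortran driver: $1" >&2; return 1 ;;
    esac
}

FC=${FC:-gfortran}
# Print the compile line instead of running it, so no compiler is needed.
echo "$FC $(free_form_flag "$FC") -O2 -c solver.f90"
```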