The following environment variable settings are common to both frameworks.
- ZENDNN_LOG_OPTS=ALL:0
- OMP_NUM_THREADS=128 # For a system with 128 cores per socket
- OMP_WAIT_POLICY=ACTIVE
- OMP_DYNAMIC=FALSE
- ZENDNN_MATMUL_ALGO=FP32:0,BF16:0
- ZENDNN_PRIMITIVE_CACHE_CAPACITY=1024
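In a POSIX shell, these settings can be applied with export before launching the workload. The following is a minimal sketch; the value 128 assumes a system with 128 cores per socket and should be changed to match the actual core count.

```shell
# Settings common to both frameworks, taken from the list above.
export ZENDNN_LOG_OPTS=ALL:0
export OMP_NUM_THREADS=128   # assumes 128 cores per socket
export OMP_WAIT_POLICY=ACTIVE
export OMP_DYNAMIC=FALSE
export ZENDNN_MATMUL_ALGO=FP32:0,BF16:0
export ZENDNN_PRIMITIVE_CACHE_CAPACITY=1024
```

Variables exported this way apply to every process started afterwards from the same shell session.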
The OpenMP environment variables OMP_NUM_THREADS, OMP_WAIT_POLICY, and OMP_PROC_BIND can be used to tune the performance of both frameworks. Refer to the OpenMP documentation for details.
For the best performance in zentorch, use the KMP variables. Refer to Table 2 for details.
For the best performance in zentf, use GOMP_CPU_AFFINITY=0-127 (for a system with 128 cores per socket).
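On systems with a different core count, the affinity range can be derived from the number of cores instead of being typed by hand. The sketch below is illustrative only: CORES_PER_SOCKET is a hypothetical helper variable, not a ZenDNN or OpenMP setting, and the range assumes the cores of the target socket are numbered from 0.

```shell
# CORES_PER_SOCKET is a hypothetical helper variable, not a ZenDNN setting.
CORES_PER_SOCKET=128

# One OpenMP thread per core, pinned to cores 0 .. CORES_PER_SOCKET-1.
export OMP_NUM_THREADS="$CORES_PER_SOCKET"
export GOMP_CPU_AFFINITY="0-$((CORES_PER_SOCKET - 1))"

echo "$GOMP_CPU_AFFINITY"   # prints 0-127
```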
For optimal performance, the batch size must be a multiple of the total number of cores used by the threads.
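As an illustration of the batch size rule, a requested batch size can be rounded up to the next multiple of the core count with integer arithmetic. This is a sketch under assumed values: CORES, REQUESTED, and BATCH_SIZE are hypothetical names, and 300 is an arbitrary example value.

```shell
CORES=128        # cores used by the threads (assumed value)
REQUESTED=300    # hypothetical desired batch size

# Round up to the nearest multiple of CORES using integer division.
BATCH_SIZE=$(( (REQUESTED + CORES - 1) / CORES * CORES ))

echo "$BATCH_SIZE"   # prints 384, which is 3 x 128
```

Rounding up rather than down keeps the batch at least as large as requested. Rounding down to 256 would satisfy the rule equally well.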
Thread Wait Policy
The OMP_WAIT_POLICY environment variable tells the OpenMP runtime library how waiting threads are expected to behave. It accepts the abstract values PASSIVE and ACTIVE. The default value is PASSIVE. When OMP_WAIT_POLICY is set to PASSIVE, waiting threads do not consume processor cycles. When it is set to ACTIVE, waiting threads consume processor cycles while they wait.
Setting OMP_WAIT_POLICY to ACTIVE may give better performance.
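To try the ACTIVE policy without changing the whole shell session, the variable can be set for a single command. In the sketch below, the sh -c command is only a stand-in for the real workload launch command, and POLICY_SEEN is a hypothetical name used to show what the child process receives.

```shell
# Set the policy for one command only; the surrounding shell is unchanged.
# The sh -c command is a stand-in for the real workload launch command.
POLICY_SEEN=$(OMP_WAIT_POLICY=ACTIVE sh -c 'echo "$OMP_WAIT_POLICY"')

echo "$POLICY_SEEN"   # prints ACTIVE
```

Scoping the variable this way makes it straightforward to compare the same workload under PASSIVE and ACTIVE in back-to-back runs.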