The following environment variable settings are optimal for zentorch and should be used in addition to the other environment variable settings.
- CNN-based models
  - FP32 models
    - ZENDNN_MATMUL_ALGO=FP32:4
  - BF16 (AMP) models
    - ZENDNN_MATMUL_ALGO=BF16:4
- NLP-based models
  - FP32 models
    - ZENDNN_MATMUL_ALGO=FP32:2
  - BF16 (AMP) models
    - ZENDNN_MATMUL_ALGO=BF16:4
- LLM-based models
  - BF16 and WOQ (per-channel and per-group) models
    - ZENDNN_MATMUL_ALGO=BF16:0
- RecSys models
  - FP32, INT8 and BF16 models
    - ZENDNN_MATMUL_ALGO=FP32:2,INT8:2,BF16:2
  - BF16 (AMP) models
    - ZENDNN_MATMUL_ALGO=BF16:4
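
The settings above can be applied from a launcher script. The following is a minimal sketch, not part of zentorch: the `MATMUL_ALGO` table, its key names, and the `set_matmul_algo` helper are hypothetical, and only the `ZENDNN_MATMUL_ALGO` values are taken from the list. It assumes the variable is read when the library initializes, so it should be set before `torch` or `zentorch` is imported.

```python
import os

# Values copied from the list above. The dictionary layout and the
# (family, precision) key names are illustrative, not a zentorch API.
MATMUL_ALGO = {
    ("cnn", "fp32"): "FP32:4",
    ("cnn", "bf16"): "BF16:4",
    ("nlp", "fp32"): "FP32:2",
    ("nlp", "bf16"): "BF16:4",
    ("llm", "bf16_woq"): "BF16:0",
    ("recsys", "fp32_int8_bf16"): "FP32:2,INT8:2,BF16:2",
    ("recsys", "bf16"): "BF16:4",
}


def set_matmul_algo(family, precision, env=os.environ):
    """Write the listed ZENDNN_MATMUL_ALGO value into env and return it.

    Raises KeyError if the (family, precision) pair is not in the table.
    """
    value = MATMUL_ALGO[(family.lower(), precision.lower())]
    env["ZENDNN_MATMUL_ALGO"] = value
    return value


if __name__ == "__main__":
    # Assumption: set the variable before importing torch / zentorch.
    set_matmul_algo("cnn", "fp32")
    print(os.environ["ZENDNN_MATMUL_ALGO"])
```

The same effect can be achieved by exporting the variable in the shell before launching the workload; the helper only keeps the per-model choices in one place.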