4.4.5. Performance Suggestions for Skinny Matrices - 5.2 English - 57404

AOCL User Guide (57404)

Document ID
57404
Release Date
2025-12-29
Version
5.2 English

AOCL-BLAS provides selective packing for GEMM when one or two dimensions of a matrix are exceedingly small. Selective packing applies only when sup is enabled. For optimal performance:

# C = beta*C + alpha*A*B
# Dimension (Dim) of A - m x k
# Dimension (Dim) of B - k x n
# Dimension (Dim) of C - m x n
# Assume all are stored in row-major format.

# If m >> n, packing A gives better performance:
$ BLIS_PACK_A=1 ./test_gemm_blis.x

# If m << n, packing B gives better performance:
$ BLIS_PACK_B=1 ./test_gemm_blis.x
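
The rule above can be scripted when the matrix dimensions are known before launch. The following is a minimal sketch, not part of AOCL-BLAS: the helper name `pick_pack_var` and the `RATIO` cutoff are hypothetical. The guide does not quantify "m >> n", so the factor of 8 is only a placeholder that should be tuned by measurement.

```shell
#!/bin/sh
# Hypothetical helper (not part of AOCL-BLAS): prints the packing variable
# suggested by the guidance above for C = beta*C + alpha*A*B,
# where A is m x k and B is k x n.
# RATIO is an assumed cutoff for "much larger"; the guide does not define one.
RATIO=${RATIO:-8}

pick_pack_var() {
    m=$1
    n=$2
    if [ "$m" -ge $((n * RATIO)) ]; then
        echo "BLIS_PACK_A=1"      # m >> n: pack A
    elif [ "$n" -ge $((m * RATIO)) ]; then
        echo "BLIS_PACK_B=1"      # m << n: pack B
    else
        echo ""                   # neither dimension dominates: keep defaults
    fi
}

# Example: a tall, skinny problem (m = 10000, n = 16).
setting=$(pick_pack_var 10000 16)
if [ -n "$setting" ]; then
    echo "suggested: $setting ./test_gemm_blis.x"
else
    echo "suggested: ./test_gemm_blis.x"
fi
```

The script only prints the suggested command line rather than running it, so it can be tried without the test binary present. Because the benefit of packing depends on the hardware and on k as well, treat the printed suggestion as a starting point and compare timings with and without the variable set.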