xfblasStatus_t xfblasGemm(xfblasOperation_t transa, xfblasOperation_t transb, int m, int n, int k, int alpha, void* A, int lda, void* B, int ldb, int beta, void* C, int ldc, unsigned int kernelIndex = 0, unsigned int deviceIndex = 0)
This function performs the matrix-matrix multiplication C = alpha*op(A)op(B) + beta*C. For detailed usage, see the L3 examples.
Parameters:
transa | Operation op(A): non-transpose, transpose, or conjugate transpose. |
transb | Operation op(B): non-transpose, transpose, or conjugate transpose. |
m | Number of rows in matrix A and matrix C. |
n | Number of columns in matrix B and matrix C. |
k | Number of columns in matrix A and number of rows in matrix B. |
alpha | Scalar that multiplies the product op(A)op(B). |
A | Pointer to matrix A in the host memory. |
lda | Leading dimension of matrix A. |
B | Pointer to matrix B in the host memory. |
ldb | Leading dimension of matrix B. |
beta | Scalar that multiplies the input matrix C. |
C | Pointer to matrix C in the host memory. |
ldc | Leading dimension of matrix C. |
kernelIndex | Index of the kernel that is being used; default is 0. |
deviceIndex | Index of the device that is being used; default is 0. |
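The arithmetic described by these parameters can be sketched on the CPU as follows. This is a minimal reference, not the library's implementation: it assumes row-major storage and 16-bit integer elements, and `RefOp` and `refGemm` are hypothetical names introduced only for illustration. It can be useful for checking results returned from the device.

```cpp
#include <cassert>

// Hypothetical stand-in for xfblasOperation_t; not the library's definition.
enum class RefOp { NoTrans, Trans };

// CPU reference for C = alpha * op(A) * op(B) + beta * C.
// Assumption: row-major storage, so element (r, c) of a matrix with leading
// dimension ld is stored at index r * ld + c.
// Assumption: matrix elements are 16-bit integers.
void refGemm(RefOp transa, RefOp transb, int m, int n, int k, int alpha,
             const short* A, int lda, const short* B, int ldb, int beta,
             short* C, int ldc) {
    assert(A != nullptr && B != nullptr && C != nullptr);
    assert(ldc >= n);
    for (int i = 0; i < m; ++i) {
        for (int j = 0; j < n; ++j) {
            int acc = 0;
            for (int p = 0; p < k; ++p) {
                // op(A) is m x k and op(B) is k x n.
                int a = (transa == RefOp::NoTrans) ? A[i * lda + p] : A[p * lda + i];
                int b = (transb == RefOp::NoTrans) ? B[p * ldb + j] : B[j * ldb + p];
                acc += a * b;
            }
            // Scale the product by alpha and the previous value of C by beta.
            C[i * ldc + j] = static_cast<short>(alpha * acc + beta * C[i * ldc + j]);
        }
    }
}
```

For a 2x3 matrix A and a 3x2 matrix B with no transposition, a call would look like `refGemm(RefOp::NoTrans, RefOp::NoTrans, 2, 2, 3, 1, A, 3, B, 2, 1, C, 2)`.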
Return:
xfblasStatus_t | 0 if the operation completed successfully. |
xfblasStatus_t | 1 if the library was not initialized. |
xfblasStatus_t | 3 if not all the matrices have FPGA device memory allocated. |
xfblasStatus_t | 4 if the engine is not currently supported. |
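A caller may want to turn these return values into readable messages. The sketch below is one possible way to do that; `describeGemmStatus` is a hypothetical helper, not part of the library, and it takes a plain `int` so that it compiles without the library headers. Only the numeric values listed above are handled; any other value is reported as unknown.

```cpp
#include <string>

// Hypothetical helper: maps the numeric return values documented above to text.
// The status is passed as an int (an assumption made so the sketch stands alone).
std::string describeGemmStatus(int status) {
    switch (status) {
        case 0:
            return "success";
        case 1:
            return "library not initialized";
        case 3:
            return "FPGA device memory not allocated for all matrices";
        case 4:
            return "engine not supported";
        default:
            // Values not listed in the table above.
            return "unknown status " + std::to_string(status);
    }
}
```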