xfblasStatus_t xfblasGemm(xfblasOperation_t transa, xfblasOperation_t transb, int m, int n, int k, int alpha, void* A, int lda, void* B, int ldb, int beta, void* C, int ldc, unsigned int kernelIndex = 0, unsigned int deviceIndex = 0)
This function performs the matrix-matrix multiplication C = alpha*op(A)*op(B) + beta*C. See the L3 examples for detailed usage.
Parameters:
transa | operation op(A): no transpose, transpose, or conjugate transpose |
transb | operation op(B): no transpose, transpose, or conjugate transpose |
m | number of rows in matrix A and matrix C |
n | number of columns in matrix B and matrix C |
k | number of columns in matrix A and number of rows in matrix B |
alpha | scalar that multiplies the product op(A)*op(B) |
A | pointer to matrix A in host memory |
lda | leading dimension of matrix A |
B | pointer to matrix B in host memory |
ldb | leading dimension of matrix B |
beta | scalar that multiplies the input matrix C |
C | pointer to matrix C in host memory |
ldc | leading dimension of matrix C |
kernelIndex | index of the kernel to use; default is 0 |
deviceIndex | index of the device to use; default is 0 |
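To make the roles of m, n, k, the leading dimensions, and the two scalars concrete, the following is a minimal CPU sketch of the same arithmetic. It is not the library's implementation and does not call xfblasGemm. The names `referenceGemm` and `exampleProduct` are hypothetical, and the sketch assumes row-major storage, 16-bit integer elements, leading dimensions counted in elements, and no transposition, none of which is stated in this entry.

```cpp
#include <cassert>
#include <vector>

// CPU reference for C = alpha * A * B + beta * C (op(A) = A, op(B) = B).
// Assumptions for illustration only: row-major storage, short elements,
// and leading dimensions expressed in elements rather than bytes.
void referenceGemm(int m, int n, int k, int alpha,
                   const short* A, int lda,
                   const short* B, int ldb,
                   int beta, short* C, int ldc) {
    // In row-major layout each leading dimension must cover one full row.
    assert(lda >= k && ldb >= n && ldc >= n);
    for (int i = 0; i < m; ++i) {
        for (int j = 0; j < n; ++j) {
            int acc = 0;  // accumulate in int to limit overflow of short products
            for (int p = 0; p < k; ++p) {
                acc += A[i * lda + p] * B[p * ldb + j];
            }
            C[i * ldc + j] =
                static_cast<short>(alpha * acc + beta * C[i * ldc + j]);
        }
    }
}

// Small worked case: A is 2x3, B is 3x2, C is 2x2, alpha = 1, beta = 1.
std::vector<short> exampleProduct() {
    std::vector<short> A = {1, 2, 3,
                            4, 5, 6};
    std::vector<short> B = {7, 8,
                            9, 10,
                            11, 12};
    std::vector<short> C = {1, 1,
                            1, 1};
    referenceGemm(2, 2, 3, 1, A.data(), 3, B.data(), 2, 1, C.data(), 2);
    return C;
}
```

Here m = 2, n = 2, k = 3, and each leading dimension equals the row length of its matrix (lda = 3, ldb = 2, ldc = 2). A reference like this can be useful for checking results copied back from the device.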
Return:
xfblasStatus_t | 0 if the operation completed successfully |
xfblasStatus_t | 1 if the library was not initialized |
xfblasStatus_t | 3 if not all the matrices have FPGA device memory allocated |
xfblasStatus_t | 4 if the engine is not supported for now |
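The return values above can be turned into readable messages with a small lookup. This is a sketch only: `describeGemmStatus` is a hypothetical helper, not part of the library, and it takes the status as a plain integer on the assumption that an xfblasStatus_t value can be converted to one.

```cpp
#include <string>

// Hypothetical helper (not a library function): maps the numeric value of
// the status returned by xfblasGemm to the description listed in this entry.
std::string describeGemmStatus(int status) {
    switch (status) {
        case 0: return "operation completed successfully";
        case 1: return "library was not initialized";
        case 3: return "not all matrices have FPGA device memory allocated";
        case 4: return "engine is not supported for now";
        // Other values are not documented for xfblasGemm in this entry.
        default: return "status code not listed for xfblasGemm";
    }
}
```

A caller might log `describeGemmStatus(static_cast<int>(status))` whenever the returned status is nonzero.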