BLAS Function Kernels
BLAS kernels in this library have a uniform top function. This top function has only two memory interfaces for communicating with external memories via the AXI memory controller. The external memory interface can be DDR, HBM or PLRAM. As shown in the figure below, the top function blasKernel is composed of an instruction process unit, a timer and an operation functional unit, e.g. GEMM. The functional unit can implement a single BLAS function or several.
GEMM Kernels
General matrix multiply (GEMM) is a very common and important function in the BLAS library. It is a core operation in many applications such as machine learning algorithms. The GEMM operation C = A * B + X is implemented as a kernel in this library.
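To make the operation concrete, here is a minimal software reference for C = A * B + X. It is only a sketch for checking results on the host, not the kernel implementation. The function name gemmRef, the float element type and the row-major layout are assumptions made for illustration and are not taken from the library.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical host-side reference for C = A * B + X (not the library's kernel).
// Assumed layout: A is m x k, B is k x n, X and C are m x n, all row-major.
std::vector<float> gemmRef(const std::vector<float>& a,
                           const std::vector<float>& b,
                           const std::vector<float>& x,
                           std::size_t m, std::size_t k, std::size_t n) {
    std::vector<float> c(m * n);
    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            // Start the accumulator from X so the addition costs no extra pass.
            float acc = x[i * n + j];
            for (std::size_t p = 0; p < k; ++p) {
                acc += a[i * k + p] * b[p * n + j];
            }
            c[i * n + j] = acc;
        }
    }
    return c;
}
```

A reference like this can be used to compare against the output buffer read back from the device, typically with a small tolerance because floating-point accumulation order may differ between the hardware and software versions.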