Added GEMM 64x64x64
In kernel 0
Added instruction GEMM (64x64 * 64x64)
Added GEMM 64x64x64
In kernel 1
Added instruction GEMM (64x64 * 64x64)
Added GEMM 64x64x64
In kernel 2
Added instruction GEMM (64x64 * 64x64)
Added GEMM 64x64x64
In kernel 3
Added instruction GEMM (64x64 * 64x64)
Added GEMM 64x64x64
Found Platform
Platform Name: Xilinx
INFO: device name is: xilinx_u250_xdma_201830_2
INFO: Importing build_dir.hw.xilinx_u250_xdma_201830_2/blas.xclbin
Loading: 'build_dir.hw.xilinx_u250_xdma_201830_2/blas.xclbin'
INFO: created kernels
loadXclbin 6960.979134 msec
create kernels 13.595438 msec
create buffers 0.176534 msec
INFO: transferred data to kernel 0
INFO: transferred data to kernel 1
INFO: transferred data to kernel 2
INFO: transferred data to kernel 3
copy to kernels 0.884381 msec
INFO: Executed kernel 0
INFO: Executed kernel 1
INFO: Executed kernel 2
INFO: Executed kernel 3
call kernels 0.398135 msec
INFO: Transferred data from kernel0
INFO: Transferred data from kernel1
INFO: Transferred data from kernel2
INFO: Transferred data from kernel3
copyFromFpga 0.260636 msec
total 6976.308826 msec
subtotalFpga 1.750123 msec
DATA_CSV:,DdrWidth,Freq,M,K,N,Ops,KernelCycles,TimeKernelMs,TimeApiMs,EffKernelPct,EffApiPct,PerfKernelTops,PerfApiTops
DATA_CSV:,16,242.000000,64,64,64,2146304,2639,0.010905,1.750123,38.802577,0.241778,0.199516,0.001226
########### Op Gemm ###########
C = postScale(A * B + X) 64x64 = 64x64 * 64x64 + 64 x 64
Comparing ...
Compared 4096 values: exact match 1281 within tolerance 2815 mismatch 0
Gemm C Matches
pass