This benchmark performs matrix-vector multiplication (GEMV). M is the number of rows of the matrix and N is the number of columns.
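
As a plain reference for what is being measured, here is a minimal sketch of the operation in pure Python. It assumes the benchmark computes `y = A * x` with no scaling or accumulation terms; the function name `gemv` and the sample values are illustrative and are not taken from the benchmark sources.

```python
def gemv(matrix, vector):
    """Return y = A * x, where A is given as a list of M rows of length N."""
    n = len(vector)
    for row in matrix:
        if len(row) != n:
            raise ValueError("every matrix row must have N = len(vector) entries")
    # One dot product per row: the result has M entries.
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


# Illustrative input with M = 2 rows and N = 3 columns.
A = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]
x = [1.0, 0.0, -1.0]
y = gemv(A, x)
print(len(y), y)
```

The output vector has M entries, and each entry costs N multiply-add steps, so the total work grows with M * N.
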
GEMV with OpenCL on the U280

M | N | Kernel execution time [s] | API execution time [s] | Efficiency [%] |
---|---|---|---|---|
512 | 256 | 1.4316e-05 | 0.00330468 | 42.9173 |
512 | 512 | 1.9998e-05 | 0.00337302 | 61.4461 |
1024 | 1024 | 6.5904e-05 | 0.0035207 | 74.5812 |
2048 | 2048 | 0.000235251 | 0.00365028 | 83.5737 |
4096 | 4096 | 0.000939699 | 0.00452506 | 83.6898 |
8192 | 8192 | 0.00332612 | 0.0105467 | 94.5764 |
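
The timings above can be turned into two derived figures: arithmetic throughput and the share of the API time spent in the kernel. The sketch below makes two assumptions that the table does not state: GEMV is counted as `2 * M * N` floating-point operations (one multiply and one add per matrix element), and the API time is taken to include the kernel time. The helper name `derived_metrics` is hypothetical.

```python
# (M, N, kernel time [s], API time [s]) copied from the table above.
ROWS = [
    (512, 256, 1.4316e-05, 0.00330468),
    (512, 512, 1.9998e-05, 0.00337302),
    (1024, 1024, 6.5904e-05, 0.0035207),
    (2048, 2048, 0.000235251, 0.00365028),
    (4096, 4096, 0.000939699, 0.00452506),
    (8192, 8192, 0.00332612, 0.0105467),
]


def derived_metrics(m, n, kernel_s, api_s):
    """Return (GFLOP/s based on kernel time, kernel time / API time)."""
    flops = 2 * m * n  # assumption: one multiply and one add per element
    return flops / kernel_s / 1e9, kernel_s / api_s


for m, n, kernel_s, api_s in ROWS:
    gflops, share = derived_metrics(m, n, kernel_s, api_s)
    print(f"{m:5d} x {n:5d}: {gflops:6.2f} GFLOP/s, kernel share {share:6.2%}")
```

For the small sizes the kernel accounts for well under 1% of the API time, which suggests that fixed host-side overhead dominates those measurements.
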
For more details on this benchmark, see: