The following tables list the resource utilization and benchmark results for the GEMV-based CG kernel on the U50, with 16 HBM channels storing the matrix.
Table 143 Resource Utilization on U50
| Name | LUT | LUTAsMem | REG | BRAM | URAM | DSP |
|---|---|---|---|---|---|---|
| User Budget | 699619 [100.00%] | 369603 [100.00%] | 1447189 [100.00%] | 1112 [100.00%] | 640 [100.00%] | 5936 [100.00%] |
| Used Resources | 186448 [26.65%] | 17334 [4.69%] | 325149 [22.47%] | 128 [11.51%] | 0 [0.00%] | 1262 [21.26%] |
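The bracketed percentages in the resource table are consistent with each "Used Resources" count divided by the matching "User Budget" count. The short Python sketch below reproduces that arithmetic from the numbers in the table; the dictionary and function names are illustrative only and are not part of any library or tool.

```python
# Counts copied from the resource utilization table for the U50.
BUDGET = {"LUT": 699619, "LUTAsMem": 369603, "REG": 1447189,
          "BRAM": 1112, "URAM": 640, "DSP": 5936}
USED = {"LUT": 186448, "LUTAsMem": 17334, "REG": 325149,
        "BRAM": 128, "URAM": 0, "DSP": 1262}


def utilization_percent(used, budget):
    """Return used/budget as a percentage, rounded to two decimals (hypothetical helper)."""
    return {name: round(100.0 * used[name] / budget[name], 2) for name in budget}


percent = utilization_percent(USED, BUDGET)
for name, value in percent.items():
    print(f"{name:9s} {value:6.2f}%")
```

Under this reading, every resource class stays below 27% of the user budget, and URAM is not used at all.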
Table 144 Benchmark Results on U50
| Vector Size | Time per Iteration [ms] | U50 Performance [GFLOPS] | U50 Energy Efficiency [GFLOPS/W] | CPU Performance [GFLOPS] | Acceleration Ratio |
|---|---|---|---|---|---|
| 1024 | 0.073 | 26.938 | 0.723 | 12.996 | 2.073 |
| 2048 | 0.2557 | 30.658 | 0.766 | 27.469 | 1.116 |
| 4096 | 0.9202 | 34.018 | 0.812 | 7.776 | 4.375 |
| 8192 | 3.405 | 36.742 | 0.839 | 8.226 | 4.467 |
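The "Acceleration Ratio" column matches the U50 performance divided by the CPU performance for each vector size. The Python sketch below checks that relationship against the benchmark figures. It also derives an implied average board power, assuming that energy efficiency is simply performance divided by average power. That power figure is an inference from the table, not a value reported in the source, and the helper names are hypothetical.

```python
# Rows copied from the benchmark table:
# (vector size, U50 GFLOPS, U50 GFLOPS/W, CPU GFLOPS, reported acceleration ratio)
ROWS = [
    (1024, 26.938, 0.723, 12.996, 2.073),
    (2048, 30.658, 0.766, 27.469, 1.116),
    (4096, 34.018, 0.812, 7.776, 4.375),
    (8192, 36.742, 0.839, 8.226, 4.467),
]


def acceleration_ratio(u50_gflops, cpu_gflops):
    """Speedup of the U50 kernel over the CPU baseline."""
    return u50_gflops / cpu_gflops


def implied_power_watts(gflops, gflops_per_watt):
    """Average power implied by the table, ASSUMING efficiency = performance / power."""
    return gflops / gflops_per_watt


for size, u50, eff, cpu, reported in ROWS:
    ratio = acceleration_ratio(u50, cpu)
    power = implied_power_watts(u50, eff)
    print(f"n={size:5d}  ratio={ratio:.3f} (reported {reported:.3f})  "
          f"implied power={power:.1f} W")
```

The computed ratios agree with the reported column to three decimal places, which suggests the column was derived this way rather than measured separately.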