The hardware resource utilization is listed in the following table. Different tool versions may produce slightly different resource usage.
Kernel | BRAM | URAM | DSP | FF | LUT | Frequency (MHz)
kernel1 | 72 | 10 | 410 | 56498 | 48301 | 250 |
kernel2 | 11 | 0 | 5 | 23073 | 16375 | 250 |
- One kernel instance achieves roughly 6x to 14x acceleration. Some example measurements:
Kernel | Width (pix) | Height (pix) | -q | Latency (ms) | Throughput FPGA B (MB/s) | Throughput FPGA P (Mp/s) | FPS (fps)
Kernel1 | 1920 | 1080 | 80 | 21.18 | 146.83 | 97.88 | 47.20 |
Kernel2 | 1920 | 1080 | 80 | 14.57 | 213.54 | 142.36 | 68.65 |
Kernel1 | 512 | 512 | 80 | 3.22 | 122.03 | 81.35 | 310.33 |
Kernel2 | 512 | 512 | 80 | 2.92 | 134.65 | 89.77 | 342.43 |
Kernel1 | 1920 | 1080 | 90 | 21.03 | 147.87 | 98.58 | 47.54 |
Kernel2 | 1920 | 1080 | 90 | 15.92 | 195.43 | 130.29 | 62.83 |
Kernel1 | 512 | 512 | 90 | 4.73 | 83.12 | 55.41 | 211.39 |
Kernel2 | 512 | 512 | 90 | 4.93 | 79.73 | 53.16 | 202.78 |
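
The derived columns in the table above appear to follow directly from the frame size and the latency. The sketch below recomputes them for one row; it is an illustration, not the benchmark's own code. The function name `derived_metrics` is hypothetical, and the factor of 1.5 bytes per pixel is an assumption inferred from the constant ratio between the two throughput columns (it would match a YUV 4:2:0 input).

```python
def derived_metrics(width, height, latency_ms, bytes_per_pixel=1.5):
    """Recompute throughput and frame rate from frame size and latency.

    bytes_per_pixel=1.5 is an assumption (inferred from the table,
    consistent with YUV 4:2:0), not a value stated by the source.
    """
    pixels = width * height
    seconds = latency_ms / 1000.0

    mpixels_per_s = pixels / seconds / 1e6           # megapixels per second
    mbytes_per_s = mpixels_per_s * bytes_per_pixel   # megabytes per second
    fps = 1.0 / seconds                              # frames per second
    return mbytes_per_s, mpixels_per_s, fps


# Kernel1, 1920x1080, -q 80, latency 21.18 ms (values from the table)
mb_s, mp_s, fps = derived_metrics(1920, 1080, 21.18)
print(f"{mb_s:.2f} MB/s, {mp_s:.2f} Mp/s, {fps:.2f} fps")
```

The recomputed values agree with the table only up to the rounding of the reported latency, so small differences in the last digits are expected.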
- Platform: FPGA U200; CPU details are listed below (single thread).
Note
1. Kernels were run on a platform with an Intel(R) Xeon(R) CPU E5-2603 v3 @ 1.60GHz, 48 threads.
2. Time unit: ms.
3. “-” indicates that the result could not be obtained due to insufficient memory.
4. FPGA time is the kernel runtime plus data transfer time, measured while running the WebP encoder.