Review Profile Summary Report - 2022.2 English

Vitis Tutorials: Hardware Acceleration (XD099)

Document ID
XD099
Release Date
2022-12-01
Version
2022.2 English
  • Kernels & Compute Unit: Kernel Execution indicates that the Total Time by kernel enqueue is about 292 ms.

    missing image

    • 4 words in parallel are computed. The accelerator is architected at 300 MHz. In total, you are computing 350,000,000 words (3,500 words/document * 100,000 documents).

    • Number of words/(Clock Freq * Parallelization factor in kernel) = 350M/(300M*4) = 291.6 ms. The actual FPGA compute time is almost same as your theoretical calculations.

  • Host Data Transfer: Host Transfer shows that the Host Write Transfer to DDR is 145 ms and the Host Read Transfer to DDR is 36 ms. missing image

    • Host Write transfer using a theoretical PCIe bandwidth of 9GB should be 1399 MB/9GBps = 154 ms

    • Host Read transfer using a theoretical PCIe bandwidth of 12GB should be 350 MB/12 GBps = 30 ms

    • Reported number indicates that the PCIe transfers are occurring at the maximum bandwidth

  • Kernels & Compute Unit: Compute Unit Stalls confirms that there are almost no “External Memory Stalls” missing image