Estimate Hardware Throughput without Parallelization

Vitis Unified Software Platform Documentation: Application Acceleration Development (UG1393)

Document ID: UG1393

Release Date: 2022-03-29

Version: 2021.2 English

The throughput of the kernel without any parallelization can be approximated as:

T_HW = Frequency / Computational Intensity = Frequency * max(V_INPUT, V_OUTPUT) / V_OPS

Frequency is the clock frequency of the kernel. This value is determined by the targeted acceleration platform, or target platform. For instance, the maximum kernel clock on an Alveo U200 Data Center accelerator card is 300 MHz.

As previously mentioned, the Computational Intensity of a function is the ratio of the total number of operations (V_OPS) to the total amount of input and output data. In the formula above, the data volume is taken as the larger of the input and output volumes, so Computational Intensity = V_OPS / max(V_INPUT, V_OUTPUT). The formula shows that functions with a high volume of operations and a low volume of data are better candidates for acceleration.
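The formula can be checked with a short calculation. The following Python sketch is illustrative only: the function names and the workload figures (operation count and data volumes) are hypothetical and are not taken from this guide. Only the 300 MHz clock comes from the Alveo U200 example above. The result is in the same data units as V_INPUT and V_OUTPUT, per second.

```python
def computational_intensity(v_ops, v_input, v_output):
    """Ratio of operations to data volume, using max(V_INPUT, V_OUTPUT)."""
    v_data = max(v_input, v_output)
    if v_ops <= 0 or v_data <= 0:
        raise ValueError("operation count and data volume must be positive")
    return v_ops / v_data


def hw_throughput_no_parallelization(frequency_hz, v_ops, v_input, v_output):
    """T_HW = Frequency / Computational Intensity."""
    return frequency_hz / computational_intensity(v_ops, v_input, v_output)


# Hypothetical workload (assumed values, for illustration only).
FREQUENCY_HZ = 300e6   # 300 MHz kernel clock, as in the Alveo U200 example
V_OPS = 4_000_000      # total number of operations
V_INPUT = 1_000_000    # input data volume, in samples
V_OUTPUT = 500_000     # output data volume, in samples

intensity = computational_intensity(V_OPS, V_INPUT, V_OUTPUT)
t_hw = hw_throughput_no_parallelization(FREQUENCY_HZ, V_OPS, V_INPUT, V_OUTPUT)

print(f"Computational intensity: {intensity:.1f} operations per sample")
print(f"T_HW without parallelization: {t_hw / 1e6:.1f} Msamples/s")
```

With these assumed figures, the computational intensity is 4 operations per sample, and the estimated throughput without parallelization is 75 Msamples/s.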