Vitis Libraries

Release Date
2023.2 English

This section provides the L2 performance benchmarks and QoR (Quality of Results) for AIE DSP library elements with various configurations. The results are extracted from hardware-emulation-based simulations. The device used for AIE benchmarking is the xcvc1902-vsva2197-1MP-e-S, which has a 1 GHz clock, and the device used for AIE-ML is the xcve2802-vsvh1760-1MP-e-S-es1, which has a 1.15 GHz clock. Other devices and speed grades clock the array at different frequencies (e.g. 1.25 GHz for the xcvc1902-vsva2197-2MP-e-S). Since the library elements measured here are contained entirely within the AIE array and are subject to a single clock, it is fair to scale the throughput figures shown here by the ratio of clock speeds to estimate throughput on devices where the AIE array is clocked at a different frequency.
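The clock-ratio scaling described above can be sketched as follows. This is a minimal illustration, not part of the library: the function name `scale_throughput` and the sample figure of 1000.0 are hypothetical, and the result is only an estimate under the stated assumption that the element runs entirely within the AIE array on one clock.

```python
def scale_throughput(measured, benchmark_clock_ghz, target_clock_ghz):
    """Estimate throughput on a device with a different AIE array clock.

    `measured` is a throughput figure taken at `benchmark_clock_ghz`;
    the estimate is the figure multiplied by the ratio of clock speeds.
    """
    if benchmark_clock_ghz <= 0 or target_clock_ghz <= 0:
        raise ValueError("clock frequencies must be positive")
    return measured * target_clock_ghz / benchmark_clock_ghz


# Hypothetical figure of 1000.0 (in the table's own unit), measured on the
# 1 GHz benchmark device and rescaled for a 1.25 GHz speed grade.
scaled = scale_throughput(1000.0, benchmark_clock_ghz=1.0, target_clock_ghz=1.25)
print(scaled)  # 1250.0
```

The same function applies to AIE-ML figures by passing 1.15 as the benchmark clock.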

The metrics reported for each case are:

  • Latency - the time delay between the first input sample and the first output sample. If there are multiple ports, the latency is measured between the first input port and the first output port
  • Throughput - input throughput, calculated as the number of samples per iteration divided by the time between consecutive iterations
  • NUM_BANKS - number of memory banks used by the design
  • NUM_AIE - number of AIE tiles used by the design
  • DATA_MEMORY - total data memory in Bytes used by the design
  • PROGRAM_MEMORY - program memory in Bytes used by each kernel
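As a rough illustration of how the Latency and Throughput metrics above relate to raw timestamps, consider the sketch below. The function names, the nanosecond timestamps, and the choice of mega-samples per second as the output unit are assumptions made for this example; they are not taken from the benchmark scripts.

```python
def latency_ns(first_input_time_ns, first_output_time_ns):
    """Delay between the first input sample and the first output sample."""
    return first_output_time_ns - first_input_time_ns


def input_throughput_msps(samples_per_iteration, iteration_period_ns):
    """Samples per iteration divided by the time between consecutive
    iterations, expressed in mega-samples per second (assumed unit)."""
    if iteration_period_ns <= 0:
        raise ValueError("iteration period must be positive")
    # samples per nanosecond * 1000 = mega-samples per second
    return samples_per_iteration * 1e3 / iteration_period_ns


# Hypothetical numbers: 256 samples per iteration, one iteration every 512 ns.
print(input_throughput_msps(256, 512.0))  # 500.0
print(latency_ns(100.0, 350.0))           # 250.0
```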

The AIE_VARIANT parameter refers to the type of AI Engine used for each case in the benchmark results. A value of 1 denotes AIE, and a value of 2 denotes AIE-ML.

The PROGRAM_MEMORY metrics are harvested for each kernel in the design. For example, a FIR configured to be implemented on two tiles (CASC_LEN=2) will have two sets of figures displayed in the table below (space delimited).
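When post-processing the results, a space-delimited PROGRAM_MEMORY cell can be split into one value per kernel. The snippet below is a minimal sketch: the helper name `parse_program_memory` and the byte counts in the example cell are hypothetical and do not come from the benchmark tables.

```python
def parse_program_memory(cell):
    """Split a space-delimited PROGRAM_MEMORY cell into per-kernel byte counts."""
    return [int(value) for value in cell.split()]


# Hypothetical cell for a FIR with CASC_LEN=2: one figure per kernel.
per_kernel = parse_program_memory("1234 1180")
print(per_kernel)       # [1234, 1180]
print(len(per_kernel))  # 2
```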