Resource Utilization - 2024.2 English - XD100

Vitis Tutorials: AI Engine Development (XD100)

Document ID: XD100
Release Date: 2024-12-06
Version: 2024.2 English

In a design with a lot of vectorizable compute, the AI Engine reduces the overall demand on the programmable logic (PL) and DSPs. For example, the following table shows the resources required for the same FIR filters (64 and 240 taps, in single-filter and ten-filter configurations) implemented in the AI Engine and in the PL with DSPs:

Impl  Filters  Taps  Param         Throughput (MSPS)  LUTs   Flops  DSP  AIE
AIE   1        64    win=2048      512.480            189    568    0    2
HLS   1        64    ck_per_sam=1  497.364            1888   5634   64   0
AIE   10       64    win=2048      5124.80            189    568    0    20
HLS   10       64    ck_per_sam=1  4781.55            10532  45009  640  0
AIE   1        240   win=2048      116.92             190    572    0    1
HLS   1        240   ck_per_sam=4  124.845            2528   7217   60   0
AIE   10       240   win=2048      1169.28            190    572    0    10
HLS   10       240   ck_per_sam=4  1235.07            16906  60872  600  0
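
To make the comparison concrete, the following Python sketch pairs each AIE row with the matching HLS row and reports how many times more LUTs and flops the HLS version uses, along with the DSPs it avoids. The resource figures are copied from the table above. The names `ROWS`, `pl_savings`, and `SAVINGS` are illustrative only and are not part of the tutorial.

```python
# Rows copied from the table above:
# (impl, filters, taps, luts, flops, dsps, aie)
ROWS = [
    ("AIE", 1, 64, 189, 568, 0, 2),
    ("HLS", 1, 64, 1888, 5634, 64, 0),
    ("AIE", 10, 64, 189, 568, 0, 20),
    ("HLS", 10, 64, 10532, 45009, 640, 0),
    ("AIE", 1, 240, 190, 572, 0, 1),
    ("HLS", 1, 240, 2528, 7217, 60, 0),
    ("AIE", 10, 240, 190, 572, 0, 10),
    ("HLS", 10, 240, 16906, 60872, 600, 0),
]


def pl_savings(rows):
    """Return {(filters, taps): (lut_ratio, flop_ratio, dsps_saved)}.

    Ratios are HLS / AIE; dsps_saved is HLS DSPs minus AIE DSPs.
    """
    by_key = {}
    for impl, filters, taps, luts, flops, dsps, _aie in rows:
        by_key.setdefault((filters, taps), {})[impl] = (luts, flops, dsps)

    savings = {}
    for key, impls in by_key.items():
        aie, hls = impls["AIE"], impls["HLS"]
        savings[key] = (hls[0] / aie[0], hls[1] / aie[1], hls[2] - aie[2])
    return savings


SAVINGS = pl_savings(ROWS)
for (filters, taps), (lut_x, flop_x, dsps) in sorted(SAVINGS.items()):
    print(f"{filters:>2} filter(s), {taps:>3} taps: "
          f"{lut_x:.1f}x LUTs, {flop_x:.1f}x flops, {dsps} DSPs saved")
```

The sketch compares only PL resources. The AI Engines consumed by each AIE row (the last column of the table) are a separate cost that it does not weigh against the PL savings.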

The AI Engine implementation offers significant savings of PL resources, especially as the design size increases. For ten 64-tap filters, for example, the HLS implementation uses 10532 LUTs, 45009 flops, and 640 DSPs, while the AI Engine implementation uses 189 LUTs, 568 flops, no DSPs, and 20 AI Engines.

Note: For the 240-tap FIR filter, the DSP version processes one sample every four clock cycles. This reduces the throughput, but also proportionately reduces the logic and power. If ck_per_sam is set to one, the design provides four times the throughput, but also uses four times the resources and power, leading to an infeasible design from a resources point of view. In any design, targeting any architecture or technology, trade-offs exist and must be understood to arrive at the most efficient solution for your requirements.
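
The ck_per_sam trade-off can be sketched with some simple arithmetic. In the table rows, the HLS DSP count equals filters × taps / ck_per_sam. That is an observation about these four rows, not a formula the tutorial states. The helper names `dsp_estimate` and `rescale` below are hypothetical, and the linear scaling of throughput and DSPs is a first-order assumption that ignores timing closure and routing effects.

```python
def dsp_estimate(taps, ck_per_sam, filters=1):
    """First-order DSP count, assuming each DSP handles ck_per_sam taps per sample.

    Assumption: this matches the HLS rows in the table; it is not an official formula.
    """
    return filters * taps // ck_per_sam


def rescale(throughput_msps, dsps, old_ck, new_ck):
    """Linearly scale throughput and DSP count when ck_per_sam changes.

    Assumption: both scale by old_ck / new_ck, as the note in the text describes.
    """
    factor = old_ck / new_ck
    return throughput_msps * factor, int(dsps * factor)


# The estimate matches the DSP column for all four HLS rows in the table.
assert dsp_estimate(64, 1, filters=1) == 64
assert dsp_estimate(64, 1, filters=10) == 640
assert dsp_estimate(240, 4, filters=1) == 60
assert dsp_estimate(240, 4, filters=10) == 600

# Hypothetical: ten 240-tap filters moved from ck_per_sam=4 to ck_per_sam=1.
est_throughput, est_dsps = rescale(1235.07, 600, old_ck=4, new_ck=1)
print(f"estimated throughput: {est_throughput:.2f} MSPS, estimated DSPs: {est_dsps}")
```

Under these assumptions, the ten-filter 240-tap design would need about four times the 600 DSPs it uses at ck_per_sam=4. This is the resource growth that the note describes as infeasible.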

Power Utilization