The figure below summarizes the final AI Engine graph and floorplan for the PFA-1008 design in the VE2802 device on the VEK280 board. The design uses 9 AI Engine tiles for compute kernels, 14 local tiles for buffering, and 2 Memory Tiles for shared buffers.
The figure below summarizes the PL resources required to implement the design. The PL uses 50 BRAMs and 4 URAMs to implement the I/O permutes and the DMA source and sink buffers that support the data flow between DDR4 and the AI Engine array. Timing closure is achieved with sufficient margin. This design illustrates how Versal AIE-ML technology can be combined with PL to create a high-performance, tightly coupled custom data path tailored directly to the algorithm of interest.