The figure below summarizes the AI Engine and PL resources required to implement the design in the VC1902 device on the VCK190 eval board. The design is using 12 AI Engine tiles for compute and 19 tiles for buffering. The PL portion requires ~100 BRAMs to provide the I/O permutes and matrix transpose operations. This design illustrates how Versal AI Engine and PL may be crafted to create a high performance tightly coupled custom data path tailored directly to the algorithm of interest.