During the application mapping phase, you map the core application and each algorithm to the most appropriate architectural area (e.g., PS, PL, NoC, DDRMC, AI Engine) in the Versal ACAP. This involves mapping every major block in the application and considering each block's bandwidth and availability requirements. The application mapping and design partitioning step is manual.
You must consider which architecture best suits each task:
- Scalar processing elements (e.g., CPUs) are very efficient at complex algorithms with diverse decision trees and a broad set of libraries, but they are limited in performance scaling. Application control code is well suited to run on the scalar processing elements.
- Vector processing elements (e.g., DSPs) are more efficient at a narrower set of parallelizable compute functions, but they incur latency and efficiency penalties because of their inflexible memory hierarchy.
- Programmable logic (e.g., FPGAs) can be precisely customized to a particular compute function, which makes it best suited to latency-critical real-time applications (e.g., automotive driver assist) and irregular data structures (e.g., genomic sequencing). However, algorithmic changes have traditionally been more time-consuming than on other processing elements.
- The AI Engine cores reduce the power consumption of compute-intensive functions by 50% compared with the same functions implemented in PL, and they also provide deterministic, high-performance, real-time DSP capabilities. Because AI Engine kernels can be written in C/C++, this approach also delivers greater designer productivity.
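The trade-offs listed above can be recorded in a first-pass worksheet before any detailed analysis. The following Python sketch is illustrative only and is not part of any Versal tool flow. The `Block` fields, the `Engine` names, and the rule order in `suggest_engine` are all assumptions made for this example; a real partition depends on measured bandwidth, latency, and power data.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Engine(Enum):
    SCALAR = "Scalar Engines (PS)"
    AI_ENGINE = "AI Engines"
    PL = "Adaptable Engines (PL)"


@dataclass
class Block:
    # Hypothetical traits for one major application block.
    name: str
    control_heavy: bool      # diverse decision trees, heavy library use
    vectorizable: bool       # regular, parallelizable compute
    latency_critical: bool   # hard real-time response required
    irregular_data: bool     # irregular data structures


def suggest_engine(block: Block) -> Optional[Engine]:
    """Return a first-pass candidate engine, or None if no rule matches.

    The rule order is an assumption for illustration, not a Versal guideline.
    """
    if block.control_heavy:
        return Engine.SCALAR
    if block.latency_critical or block.irregular_data:
        return Engine.PL
    if block.vectorizable:
        return Engine.AI_ENGINE
    return None  # needs manual review


if __name__ == "__main__":
    # Example blocks are made up for demonstration.
    blocks = [
        Block("app_control", True, False, False, False),
        Block("fir_filter", False, True, False, False),
        Block("packet_parser", False, False, True, True),
    ]
    for b in blocks:
        choice = suggest_engine(b)
        print(f"{b.name}: {choice.value if choice else 'manual review'}")
```

A worksheet like this only proposes candidates. Because the mapping step is manual, each suggestion still has to be checked against the block's bandwidth and availability requirements.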
Partitioning a large, complex design onto a Versal ACAP requires a top-down, system-level approach. In particular, you must determine how best to use the Intelligent Engines (both DSP and AI Engines), Adaptable Engines (programmable logic), and Scalar Engines (Arm® Cortex®-A72 and Cortex-R5F processors).
Consider the overall system requirements when deciding how to partition your design. The following are key factors to consider:
- Overall system compute needs and data types in the dataflow
- Memory and data movement into and around the Versal device
- Throughput and latency for the system
- Power requirements for the complete system
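One way to keep these system-level factors visible while partitioning is to total per-block estimates against system budgets. The sketch below is a minimal illustration under stated assumptions: `BlockEstimate`, `check_budgets`, and every field name are hypothetical, and summing latency assumes the blocks run one after another in a single pipeline, which may not hold for your design.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BlockEstimate:
    # Hypothetical per-block estimates collected during partitioning.
    name: str
    bandwidth_gbps: float  # data moved to and from external memory
    latency_us: float      # contribution to end-to-end latency
    power_w: float         # estimated power


def check_budgets(blocks: List[BlockEstimate],
                  max_bandwidth_gbps: float,
                  max_latency_us: float,
                  max_power_w: float) -> List[str]:
    """Return the names of the budgets that the summed estimates exceed."""
    totals = {
        "bandwidth": sum(b.bandwidth_gbps for b in blocks),
        # Assumption: blocks execute serially, so latencies add up.
        "latency": sum(b.latency_us for b in blocks),
        "power": sum(b.power_w for b in blocks),
    }
    limits = {
        "bandwidth": max_bandwidth_gbps,
        "latency": max_latency_us,
        "power": max_power_w,
    }
    return [key for key in totals if totals[key] > limits[key]]
```

An empty result means the estimates fit every budget. Any name in the returned list marks a requirement to revisit, for example by moving a block to a different engine or reducing data movement.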