The embedded AI Engine system comprises the embedded processors in the Versal adaptive SoC and the acceleration logic built from two key categories of acceleration components: the traditional PL (LUTs, BRAMs, URAMs, and DSPs) and the AI Engines. In the Versal adaptive SoC, the embedded compute system comprises the Arm Cortex-A72 and Cortex-R5F processors. For this design type, the use models range from a sophisticated embedded software stack to a simple bare-metal stack that is only required to support programming of the acceleration units.
An embedded AI Engine system design runs a software stack on the built-in embedded processor that serves as the overall control plane for the kernels running on the acceleration units. Data transfer between the embedded processors and the acceleration units is managed through the Xilinx Runtime (XRT) application programming interfaces (APIs), which also provide function calls for managing the acceleration units.
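As a rough illustration of this control-plane role, the sketch below uses the XRT native C++ API to open a device, load a compiled binary, and launch a PL kernel. The xclbin filename, the kernel name `my_kernel`, and the argument layout are placeholders for illustration, not names from this design; a real system would substitute its own, and AI Engine graphs have additional graph-control APIs not shown here.

```cpp
// Minimal XRT host-code sketch: the embedded processor acts as the control
// plane, managing buffers and kernel execution through the XRT native C++ API.
#include <xrt/xrt_device.h>
#include <xrt/xrt_kernel.h>
#include <xrt/xrt_bo.h>
#include <vector>

int main() {
    // Open the device and program it with the compiled binary (xclbin)
    xrt::device device(0);
    auto uuid = device.load_xclbin("binary_container.xclbin");

    // Look up a PL kernel by name (placeholder name for illustration)
    xrt::kernel kernel(device, uuid, "my_kernel");

    // Allocate a device buffer associated with the kernel's first argument
    const size_t num_words = 1024;
    xrt::bo input(device, num_words * sizeof(int), kernel.group_id(0));

    // Stage host data and sync it to device memory
    std::vector<int> host_data(num_words, 1);
    input.write(host_data.data());
    input.sync(XCL_BO_SYNC_BO_TO_DEVICE);

    // Launch the kernel and block until it completes
    auto run = kernel(input, num_words);
    run.wait();
    return 0;
}
```

This same host API is used whether the software stack is a full embedded Linux system or a bare-metal environment layered under XRT.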
While the embedded AI Engine can interface with the PL within a Versal device entirely through hardware streaming interfaces, it can also be targeted by systems that leverage the embedded Arm processing subsystem together with the AI Engine and the PL, as shown in the following figure.