The DPUCVDX8G is a high-performance, general-purpose convolutional neural network (CNN) processing engine optimized for Versal ACAP devices. The IP is user-configurable and exposes several parameters that specify the number of AI Engine cores and the amount of PL resources used, and that enable or disable optional features. The DPUCVDX8G is implemented with both AI Engines and PL logic.
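As a rough illustration of this configurability, the following Python sketch models a user-specified parameter set that sizes the AI Engine and PL footprint and selects optional features. The class, field names, and default values are hypothetical stand-ins chosen for exposition; they are not the IP's actual configuration parameters.

```python
from dataclasses import dataclass

# A minimal sketch of what a user-specified configuration might capture.
# Field names and defaults are hypothetical, not the IP's real parameters.
@dataclass(frozen=True)
class DpucvdxConfigSketch:
    aie_cores: int = 32           # number of AI Engine cores allocated
    pl_ram_banks: int = 16        # PL on-chip storage banks (illustrative)
    enable_depthwise: bool = True # optional Depth-Wise support
    enable_pool: bool = True      # optional Pooling support
    enable_eltwise: bool = True   # optional Elt-wise support

    def describe(self) -> str:
        features = [
            name for name, on in (
                ("Depth-Wise", self.enable_depthwise),
                ("Pooling", self.enable_pool),
                ("Elt-wise", self.enable_eltwise),
            ) if on
        ]
        return (f"{self.aie_cores} AI Engine cores, "
                f"{self.pl_ram_banks} PL RAM banks, "
                f"features: {', '.join(features) or 'none'}")

# Example: a larger AI Engine allocation with the Pooling feature disabled.
print(DpucvdxConfigSketch(aie_cores=64, enable_pool=False).describe())
```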
The AI Engines in the DPUCVDX8G perform the convolution operations. The AI Engine interface tiles transfer data between the AI Engine array and the PL. To achieve high computational performance on some Versal devices, the AI Engines are organized into groups of multiple adjacent cores. In a multi-batch DPUCVDX8G architecture, each batch handler has a private AI Engine group.
The PL portion includes a high-level scheduler module, a global memory buffer for shared weights, and batch handlers that implement the Load, Save, Depth-Wise, Pooling, and Elt-wise functions. The scheduler and the weights buffer are shared by all DPUCVDX8G batch handlers, whereas the Load and Save modules, the Depth-Wise, Pooling, and Elt-wise modules, and the local feature map storage are private to each batch handler.
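The shared-versus-private split can be summarized with a short sketch: shared logic is instantiated once regardless of the batch count, while private logic is replicated per batch handler. The module names and the replication model below are expository assumptions, not the IP's actual parameterization.

```python
from dataclasses import dataclass

# Expository grouping of PL logic into shared and per-batch (private) modules.
SHARED_PL_MODULES = ("high-level scheduler", "weights buffer")
PRIVATE_PL_MODULES = (
    "Load module",
    "Save module",
    "Depth-Wise/Pooling/Elt-wise module",
    "local feature map storage",
)

@dataclass(frozen=True)
class PlFootprintSketch:
    batch_handlers: int

    def instance_counts(self) -> dict:
        # Shared logic is instantiated once; private logic is replicated
        # once per batch handler.
        counts = {name: 1 for name in SHARED_PL_MODULES}
        counts.update({name: self.batch_handlers for name in PRIVATE_PL_MODULES})
        return counts

# Example: a three-batch configuration replicates only the private modules.
for module, count in PlFootprintSketch(batch_handlers=3).instance_counts().items():
    print(f"{module}: x{count}")
```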
The top-level block diagram of the DPUCVDX8G is shown in the following figure.