DPU-V1 for Cloud - 1.1 English

Vitis AI User Guide (UG1414)

DPU-V1 IP cores (previously known as xDNN) are high-performance, general-purpose CNN processing engines (PEs).

Figure 1. DPU-V1 Architecture

The key features of this engine are:

  • 96×16 DSP systolic array operating at 700 MHz
  • Instruction-based programming model for the simplicity and flexibility to represent a variety of custom neural network graphs
  • 9 MB on-chip Tensor Memory composed of UltraRAM
  • Distributed on-chip filter cache
  • External DDR memory used to store filters and tensor data
  • Pipelined Scale, ReLU, and Pooling blocks for maximum efficiency
  • Standalone Pooling/Eltwise execution block that runs in parallel with convolution layers
  • Hardware-assisted tiling engine that subdivides tensors to fit in on-chip Tensor Memory, with pipelined instruction scheduling
  • Standard AXI-MM and AXI4-Lite top-level interfaces for simplified system-level integration
  • Optional pipelined RGB tensor convolution engine for an efficiency boost
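
Two of the figures above can be made concrete with a little arithmetic: the 96×16 array at 700 MHz sets the peak MAC throughput, and the 9 MB Tensor Memory sets how large a feature-map tile can be before the tiling engine must split it. The sketch below is a hypothetical illustration, not the Vitis AI API; the `tiles` helper and its row-band strategy are assumptions for demonstration only, while the actual Hardware-Assisted Tiling Engine performs this subdivision in hardware.

```python
# Hypothetical sketch (not the Vitis AI API): estimate the peak MAC
# throughput of the 96x16 systolic array, and split a feature map into
# row bands that each fit in the 9 MB on-chip Tensor Memory.
ROWS, COLS, CLOCK_HZ = 96, 16, 700_000_000
peak_macs_per_s = ROWS * COLS * CLOCK_HZ       # one MAC per DSP per cycle
peak_tops = peak_macs_per_s * 2 / 1e12         # 2 ops (multiply + add) per MAC

TENSOR_MEM_BYTES = 9 * 1024 * 1024             # 9 MB of UltraRAM

def tile_height(h, w, c, bytes_per_elem=1):
    """Largest number of rows of an HxWxC tensor that fits on chip."""
    row_bytes = w * c * bytes_per_elem
    return max(1, min(h, TENSOR_MEM_BYTES // row_bytes))

def tiles(h, w, c, bytes_per_elem=1):
    """Subdivide the tensor into (start_row, end_row) bands."""
    th = tile_height(h, w, c, bytes_per_elem)
    return [(r, min(r + th, h)) for r in range(0, h, th)]
```

For example, a 224×224×256 INT8 feature map needs 224 × 256 = 57,344 bytes per row, so only 164 rows fit on chip at once and the tensor is processed as two bands: `tiles(224, 224, 256)` returns `[(0, 164), (164, 224)]`.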