The following table shows the resource utilization of a sample DPUCZDX8G single-core project. The data is based on the ZCU102 platform with low RAM usage, depthwise convolution, average pooling, channel augmentation, leaky ReLU + ReLU6 features, and low DSP usage.
In the following tables, the triplet (PPxICPxOCP) after each architecture name gives the pixel parallelism, input channel parallelism, and output channel parallelism.
DPUCZDX8G Architecture | LUT | Register | Block RAM | DSP |
---|---|---|---|---|
B512 (4x8x8) | 27893 | 35435 | 73.5 | 78 |
B800 (4x10x10) | 30468 | 42773 | 91.5 | 117 |
B1024 (8x8x8) | 34471 | 50763 | 105.5 | 154 |
B1152 (4x12x12) | 33238 | 49040 | 123 | 164 |
B1600 (8x10x10) | 38716 | 63033 | 127.5 | 232 |
B2304 (8x12x12) | 42842 | 73326 | 167 | 326 |
B3136 (8x14x14) | 47667 | 85778 | 210 | 436 |
B4096 (8x16x16) | 53540 | 105008 | 257 | 562 |
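The number in each architecture name appears to equal twice the product of its triplet, which is consistent with counting each multiply-accumulate as two operations per cycle. The short Python sketch below checks that relationship against the triplets listed in the table. The function name `peak_ops_per_cycle` and the dictionary `ARCHS` are illustrative names, not part of any DPU tooling, and the factor of two is an assumption inferred from the table values.

```python
# Triplets (PP, ICP, OCP) copied from the architecture column of the table.
ARCHS = {
    "B512": (4, 8, 8),
    "B800": (4, 10, 10),
    "B1024": (8, 8, 8),
    "B1152": (4, 12, 12),
    "B1600": (8, 10, 10),
    "B2304": (8, 12, 12),
    "B3136": (8, 14, 14),
    "B4096": (8, 16, 16),
}


def peak_ops_per_cycle(pp: int, icp: int, ocp: int) -> int:
    """Return 2 * PP * ICP * OCP.

    Assumption: each multiply-accumulate counts as two operations
    (one multiply and one add).
    """
    return 2 * pp * icp * ocp


for name, (pp, icp, ocp) in ARCHS.items():
    ops = peak_ops_per_cycle(pp, icp, ocp)
    # Compare the computed value with the number embedded in the name.
    matches = ops == int(name[1:])
    print(f"{name}: {pp}x{icp}x{ocp} -> {ops} ops/cycle (matches name: {matches})")
```

Under this assumption, every architecture in the table matches its name; for example, B1152 is 2 x 4 x 12 x 12 = 1152.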
Another sample DPUCZDX8G single-core project is based on the ZCU104 platform. In this project, the image and weights buffers use UltraRAM. The project is configured with low RAM usage, depthwise convolution, average pooling, channel augmentation, leaky ReLU + ReLU6 features, and low DSP usage. The resource utilization of this project is as follows.
DPUCZDX8G Architecture | LUT | Register | Block RAM | UltraRAM | DSP |
---|---|---|---|---|---|
B512 (4x8x8) | 27396 | 35251 | 1.5 | 18 | 78 |
B800 (4x10x10) | 30356 | 42463 | 1.5 | 40 | 117 |
B1024 (8x8x8) | 34134 | 50820 | 1.5 | 26 | 154 |
B1152 (4x12x12) | 33103 | 49502 | 2 | 44 | 164 |
B1600 (8x10x10) | 38526 | 63294 | 1.5 | 56 | 232 |
B2304 (8x12x12) | 42538 | 74000 | 2 | 60 | 326 |
B3136 (8x14x14) | 47270 | 85782 | 2 | 64 | 436 |
B4096 (8x16x16) | 52681 | 104562 | 2 | 68 | 562 |
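To see how the Block RAM figures differ between the two sample projects, the following Python sketch subtracts the ZCU104 Block RAM count from the ZCU102 count for each architecture and lists the UltraRAM used in the ZCU104 project. The numbers are copied from the two tables; the names `BRAM_ZCU102`, `ZCU104`, and `bram_difference` are illustrative only. Because the two projects target different platforms, treat the result as a rough indication rather than an exact Block RAM to UltraRAM conversion.

```python
# Block RAM counts from the ZCU102 table (buffers in Block RAM).
BRAM_ZCU102 = {
    "B512": 73.5, "B800": 91.5, "B1024": 105.5, "B1152": 123,
    "B1600": 127.5, "B2304": 167, "B3136": 210, "B4096": 257,
}

# (Block RAM, UltraRAM) pairs from the ZCU104 table (buffers in UltraRAM).
ZCU104 = {
    "B512": (1.5, 18), "B800": (1.5, 40), "B1024": (1.5, 26), "B1152": (2, 44),
    "B1600": (1.5, 56), "B2304": (2, 60), "B3136": (2, 64), "B4096": (2, 68),
}


def bram_difference(arch: str) -> float:
    """Block RAM in the ZCU102 project minus Block RAM in the ZCU104 project."""
    return BRAM_ZCU102[arch] - ZCU104[arch][0]


for arch, (bram, uram) in ZCU104.items():
    diff = bram_difference(arch)
    print(f"{arch}: {diff} fewer Block RAMs, {uram} UltraRAMs used")
```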