The ZCU104 evaluation board uses the mid-range ZU7EV UltraScale+ device. Dual B4096F DPU cores are implemented in programmable logic and deliver 2.4 TOPS INT8 peak performance for deep learning inference acceleration.
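
The peak figure can be sanity-checked with a little arithmetic. The sketch below assumes that "B4096" denotes a peak of 4096 INT8 operations per clock cycle per DPU core, and that the cores run at the 300 MHz DPU clock used for the benchmarks in this section. The constant and function names are illustrative only and do not come from any Xilinx tool.

```python
# Assumption: a B4096 DPU core peaks at 4096 INT8 operations per clock cycle.
OPS_PER_CYCLE_PER_CORE = 4096
NUM_CORES = 2            # dual-core configuration described in the text
DPU_CLOCK_HZ = 300e6     # 300 MHz DPU clock used for the benchmarks


def peak_tops(cores: int, ops_per_cycle: int, clock_hz: float) -> float:
    """Theoretical peak throughput in tera-operations per second."""
    return cores * ops_per_cycle * clock_hz / 1e12


if __name__ == "__main__":
    tops = peak_tops(NUM_CORES, OPS_PER_CYCLE_PER_CORE, DPU_CLOCK_HZ)
    print(f"Theoretical peak: {tops:.4f} TOPS")
```

Under these assumptions the product comes to roughly 2.46 TOPS, in line with the 2.4 TOPS quoted above.
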
The following table lists the throughput (in frames per second, fps) of various neural network samples on the ZCU104 with the DPU running at 300 MHz.
No. | Neural Network | Input Size | GOPS | Performance (fps), single thread | Performance (fps), multiple threads |
---|---|---|---|---|---|
1 | inception_resnet_v2_tf | 299x299 | 26.4 | 25.4 | 46.1 |
2 | inception_v1_tf | 224x224 | 3.0 | 196.1 | 401.7 |
3 | inception_v3_tf | 299x299 | 11.5 | 60.3 | 116.4 |
4 | inception_v4_2016_09_09_tf | 299x299 | 24.6 | 30.2 | 58.6 |
5 | mobilenet_v1_0_25_128_tf | 128x128 | 0.027 | 1263.6 | 3957.7 |
6 | mobilenet_v1_0_5_160_tf | 160x160 | 0.15 | 763.1 | 2038.1 |
7 | mobilenet_v1_1_0_224_tf | 224x224 | 1.1 | 311.8 | 731.1 |
8 | mobilenet_v2_1_0_224_tf | 224x224 | 0.60 | 250.9 | 546.6 |
9 | mobilenet_v2_1_4_224_tf | 224x224 | 1.2 | 185.5 | 381.5 |
10 | resnet_v1_101_tf | 224x224 | 14.4 | 47.3 | 87.2 |
11 | resnet_v1_152_tf | 224x224 | 21.8 | 32.5 | 60.1 |
12 | resnet_v1_50_tf | 224x224 | 7.0 | 86.3 | 157.5 |
13 | vgg_16_tf | 224x224 | 31.0 | 21.3 | 38.3 |
14 | vgg_19_tf | 224x224 | 39.3 | 18.4 | 33.8 |
15 | ssd_mobilenet_v1_coco_tf | 300x300 | 2.5 | 92.4 | 330 |
16 | ssd_mobilenet_v2_coco_tf | 300x300 | 3.8 | 66.7 | 185 |
17 | ssd_resnet_50_fpn_coco_tf | 640x640 | 178.4 | 1.3 | 5.1 |
18 | yolov3_voc_tf | 416x416 | 65.6 | 14.1 | 29.3 |
19 | mlperf_ssd_resnet34_tf | 1200x1200 | 433 | 1.6 | 5.3 |
20 | resnet50 | 224x224 | 7.7 | 80.2 | 146.8 |
21 | resnet18 | 224x224 | 3.7 | 196.7 | 403.7 |
22 | inception_v1 | 224x224 | 3.2 | 189.8 | 387 |
23 | inception_v2 | 224x224 | 4.0 | 152.7 | 298.2 |
24 | inception_v3 | 299x299 | 11.4 | 60.5 | 117.3 |
25 | inception_v4 | 299x299 | 24.5 | 30.2 | 58.6 |
26 | mobilenet_v2 | 224x224 | 0.6 | 249.3 | 536.6 |
27 | squeezenet | 227x227 | 0.76 | 274.4 | 941.8 |
28 | ssd_pedestrain_pruned_0_97 | 360x360 | 5.9 | 78.1 | 221.5 |
29 | ssd_traffic_pruned_0_9 | 360x480 | 11.6 | 56.1 | 153.2 |
30 | ssd_adas_pruned_0_95 | 360x480 | 6.3 | 84.7 | 231.9 |
31 | ssd_mobilenet_v2 | 360x480 | 6.6 | 25.4 | 101.3 |
32 | refinedet_pruned_0_8 | 360x480 | 25 | 32.6 | 76.1 |
33 | refinedet_pruned_0_92 | 360x480 | 10.1 | 61.3 | 154.2 |
34 | refinedet_pruned_0_96 | 360x480 | 5.1 | 83.6 | 228.7 |
35 | vpgnet_pruned_0_99 | 480x640 | 2.5 | 105.5 | 364.9 |
36 | fpn | 256x512 | 8.9 | 62 | 169.9 |
37 | sp_net | 128x224 | 0.55 | 552.4 | 1245.6 |
38 | openpose_pruned_0_3 | 368x368 | 49.9 | 3.6 | 11 |
39 | densebox_320_320 | 320x320 | 0.49 | 397.4 | 1250.3 |
40 | densebox_640_360 | 360x640 | 1.1 | 198.7 | 606.6 |
41 | face_landmark | 96x72 | 0.14 | 890.1 | 1363.2 |
42 | reid | 80x160 | 0.95 | 385.6 | 668.8 |
43 | multi_task | 288x512 | 14.8 | 36 | 108.4 |
44 | yolov3_adas_pruned_0_9 | 256x512 | 5.5 | 84.5 | 218.5 |
45 | yolov3_voc | 416x416 | 65.4 | 14.2 | 29.5 |
46 | yolov3_bdd | 288x512 | 53.7 | 13.6 | 28.7 |
47 | yolov2_voc | 448x448 | 34 | 26.2 | 59.1 |
48 | yolov2_voc_pruned_0_66 | 448x448 | 11.6 | 55.6 | 153.2 |
49 | yolov2_voc_pruned_0_71 | 448x448 | 9.9 | 62.4 | 180.2 |
50 | yolov2_voc_pruned_0_77 | 448x448 | 7.8 | 70.5 | 217.4 |
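
To compare rows against the 2.4 TOPS peak, each entry can be converted into sustained compute and a utilization ratio. The following is a minimal sketch, assuming the GOPS column gives the workload of one inference in giga-operations (the table itself does not define the column). The three rows are copied from the table; `Row` and its property names are hypothetical helpers introduced for illustration.

```python
from dataclasses import dataclass

PEAK_GOPS = 2400.0  # 2.4 TOPS INT8 peak, as quoted in the text


@dataclass(frozen=True)
class Row:
    name: str
    gops_per_frame: float  # assumed meaning of the "GOPS" column
    fps_single: float      # single-thread throughput
    fps_multi: float       # multi-thread throughput

    @property
    def sustained_gops(self) -> float:
        """Compute actually delivered per second in the multi-thread run."""
        return self.gops_per_frame * self.fps_multi

    @property
    def utilization(self) -> float:
        """Fraction of the quoted peak reached in the multi-thread run."""
        return self.sustained_gops / PEAK_GOPS

    @property
    def thread_speedup(self) -> float:
        """Multi-thread throughput relative to single-thread throughput."""
        return self.fps_multi / self.fps_single


# Values copied from rows 12, 13 and 5 of the table.
ROWS = [
    Row("resnet_v1_50_tf", 7.0, 86.3, 157.5),
    Row("vgg_16_tf", 31.0, 21.3, 38.3),
    Row("mobilenet_v1_0_25_128_tf", 0.027, 1263.6, 3957.7),
]

if __name__ == "__main__":
    for row in ROWS:
        print(
            f"{row.name:28s} "
            f"sustained={row.sustained_gops:8.1f} GOPS  "
            f"utilization={row.utilization:6.1%}  "
            f"speedup={row.thread_speedup:4.2f}x"
        )
```

Under this reading, large convolutional networks such as ResNet-50 and VGG-16 sustain a much larger share of the peak than very small models such as mobilenet_v1_0_25_128_tf, whose frame rate is high but whose per-frame workload is tiny.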