The VCK5000 Versal development card is built on the Xilinx 7nm Versal ACAP architecture and targets applications that require high-throughput AI inference and signal-processing compute performance. In this release, DPU cores with batch=6 and batch=8 are implemented using AI Engines.
VCK5000 Performance with 6PE@350 MHz DPUCVDX8H-DWC
The following table lists the throughput performance (in frames/sec or fps) for various neural network samples on the Versal ACAP VCK5000 Gen3x16 with DPUCVDX8H-DWC running at 6PE@350 MHz.
No | Neural Network | Input Size | GOPS | DPU Frequency (MHz) | Performance (fps, multi-threaded) |
---|---|---|---|---|---|
1 | densebox_320_320 | 320x320 | 0.49 | 350 | 4581.78 |
2 | densebox_640_360 | 360x640 | 1.1 | 350 | 2339.43 |
3 | drunet_pt | 528x608 | 2.59 | 350 | 153.403 |
4 | efficientNet-edgetpu-L_tf | 300x300 | 19.36 | 350 | 427.568 |
5 | efficientNet-edgetpu-M_tf | 240x240 | 7.34 | 350 | 1100.45 |
6 | efficientNet-edgetpu-S_tf | 224x224 | 4.72 | 350 | 1858.11 |
7 | ENet_cityscapes_pt | 512x1024 | 8.6 | 350 | 141.943 |
8 | face_landmark | 96x72 | 0.14 | 350 | 22446.8 |
9 | face-quality | 80x60 | 0.06 | 350 | 31512.7 |
10 | face-quality_pt | 80x60 | 0.06 | 350 | 32034.8 |
11 | facerec_resnet20 | 112x96 | 3.5 | 350 | 4498.68 |
12 | facerec-resnet20_mixed_pt | 112x96 | 3.5 | 350 | 4496.6 |
13 | facerec_resnet64 | 112x96 | 11 | 350 | 2280.95 |
14 | facereid-large_pt | 96x96 | 0.5 | 350 | 21110 |
15 | facereid-small_pt | 80x80 | 0.09 | 350 | 33214.2 |
16 | FairMot_pt | 640x480 | 36 | 350 | 427.839 |
17 | fpn | 256x512 | 8.9 | 350 | 933.435 |
18 | FPN_Res18_Medical_segmentation | 320x320 | 45.3 | 350 | 426.927 |
19 | FPN-resnet18_covid19-seg_pt | 352x352 | 22.7 | 350 | 960.401 |
20 | inception_resnet_v2_tf | 299x299 | 26.4 | 350 | 492.177 |
21 | inception_v1 | 224x224 | 3.2 | 350 | 3293.69 |
22 | inception_v1_tf | 224x224 | 3 | 350 | 3532.76 |
23 | inception_v2 | 224x224 | 4 | 350 | 2613.47 |
24 | inception_v2_tf | 224x224 | 3.88 | 350 | 483.27 |
25 | inception_v3 | 299x299 | 11.4 | 350 | 912.40 |
26 | inception_v3_pt | 299x299 | 5.7 | 350 | 910.16 |
27 | inception_v3_tf | 299x299 | 11.5 | 350 | 913.67 |
28 | inception_v3_tf2 | 299x299 | 11.5 | 350 | 962.69 |
29 | inception_v4 | 299x299 | 24.5 | 350 | 506.92 |
30 | inception_v4_2016_09_09_tf | 299x299 | 24.6 | 350 | 506.77 |
31 | medical_seg_cell_tf2 | 128x128 | 5.3 | 350 | 1358.26 |
32 | MLPerf_resnet50_v1.5_tf | 224x224 | 8.19 | 350 | 3406.26 |
33 | mlperf_ssd_resnet34_tf | 1200x1200 | 433 | 350 | 72.2052 |
34 | mobilenet_1_0_224_tf2 | 224x224 | 1.1 | 350 | 5222.53 |
35 | mobilenet_edge_0_75_tf | 224x224 | 0.62 | 350 | 4813.85 |
36 | mobilenet_edge_1_0_tf | 224x224 | 0.99 | 350 | 4298.57 |
37 | mobilenet_v1_0_25_128_tf | 128x128 | 0.027 | 350 | 22781.30 |
38 | mobilenet_v1_0_5_160_tf | 160x160 | 0.15 | 350 | 11945.00 |
39 | mobilenet_v1_1_0_224_tf | 224x224 | 1.1 | 350 | 5224.42 |
40 | mobilenet_v2 | 224x224 | 0.6 | 350 | 3752.71 |
41 | mobilenet_v2_1_0_224_tf | 224x224 | 0.6 | 350 | 3638.07 |
42 | mobilenet_v2_1_4_224_tf | 224x224 | 1.2 | 350 | 2842.98 |
43 | multi_task | 288x512 | 14.8 | 350 | 660.98 |
44 | ofa_depthwise_res50_pt | 160x160 | 0.9 | 350 | 3378.06 |
45 | ofa_resnet50_0_9B_pt | 176x176 | 1.246 | 350 | 4050.59 |
46 | openpose_pruned_0_3 | 368x368 | 49.9 | 350 | 132.81 |
47 | person-orientation_pruned_558m_pt | 176x80 | 0.558 | 350 | 9541.32 |
48 | personreid-res18_pt | 176x80 | 1.1 | 350 | 8172.30 |
49 | personreid-res50_pt | 256x128 | 5.4 | 350 | 3750.77 |
50 | plate_detection | 320x320 | 0.49 | 350 | 6612.75 |
51 | plate_num | 96x288 | 1.75 | 350 | 2759.17 |
52 | pmg_pt | 224x224 | 2.28 | 350 | 3477.35 |
53 | refinedet_baseline | 480x360 | 123 | 350 | 234.10 |
54 | RefineDet-Medical_EDD_tf | 320x320 | 9.8 | 350 | 1000.00 |
55 | refinedet_pruned_0_8 | 360x480 | 25 | 350 | 513.16 |
56 | refinedet_pruned_0_92 | 360x480 | 10.1 | 350 | 649.21 |
57 | refinedet_pruned_0_96 | 360x480 | 5.1 | 350 | 687.19 |
58 | refinedet_VOC_tf | 320x320 | 81.9 | 350 | 307.83 |
59 | reid | 80x160 | 0.95 | 350 | 8422.35 |
60 | resnet18 | 224x224 | 3.7 | 350 | 5185.26 |
61 | resnet50 | 224x224 | 7.7 | 350 | 3738.35 |
62 | resnet50_pt | 224x224 | 4.1 | 350 | 3429.55 |
63 | resnet50_tf2 | 224x224 | 7.7 | 350 | 3737.98 |
64 | resnet_v1_101_tf | 224x224 | 14.4 | 350 | 2244.88 |
65 | resnet_v1_152_tf | 224x224 | 21.8 | 350 | 1596.44 |
66 | resnet_v1_50_tf | 224x224 | 7 | 350 | 3739.72 |
67 | retinaface | 360x640 | 1.11 | 350 | 1627.01 |
68 | salsanext_pt | 64x2048 | 20.4 | 350 | 154.52 |
69 | salsanext_v2_pt | 64x2048 | 32 | 350 | 90.88 |
70 | SemanticFPN_cityscapes_pt | 256x512 | 10 | 350 | 1094.60 |
71 | SemanticFPN_Mobilenetv2_pt | 512x1024 | 5.4 | 350 | 216.55 |
72 | semantic_seg_citys_tf2 | 512x1024 | 54 | 350 | 115.01 |
73 | SESR_S_pt | 360x640 | 7.48 | 350 | 185.58 |
74 | sp_net | 128x224 | 0.55 | 350 | 6753.90 |
75 | squeezenet | 227x227 | 0.76 | 350 | 7672.00 |
76 | squeezenet_pt | 224x224 | 0.82 | 350 | 7132.38 |
77 | ssd_adas_pruned_0_95 | 360x480 | 6.3 | 350 | 714.24 |
78 | ssd_inception_v2_coco_tf | 300x300 | 9.6 | 350 | 240.00 |
79 | ssdlite_mobilenet_v2_coco_tf | 300x300 | 1.5 | 350 | 1482.77 |
80 | ssd_mobilenet_v1_coco_tf | 300x300 | 2.5 | 350 | 2338.31 |
81 | ssd_mobilenet_v2 | 360x480 | 6.6 | 350 | 590.97 |
82 | ssd_mobilenet_v2_coco_tf | 300x300 | 3.8 | 350 | 1287.66 |
83 | ssd_pedestrian_pruned_0_97 | 360x360 | 5.9 | 350 | 605.79 |
84 | ssd_resnet_50_fpn_coco_tf | 640x640 | 178.4 | 350 | 108.19 |
85 | ssd_traffic_pruned_0_9 | 360x480 | 11.6 | 350 | 697.63 |
86 | tiny_yolov3_vmss | 416x416 | 5.46 | 350 | 1971.40 |
87 | tsd_yolox_pt | 640x640 | 73 | 350 | 263.49 |
88 | ultrafast_pt | 288x800 | 8.4 | 350 | 1061.32 |
89 | unet_chaos-CT_pt | 512x512 | 23.3 | 350 | 201.87 |
90 | vgg_16_tf | 224x224 | 31 | 350 | 505.57 |
91 | vgg_19_tf | 224x224 | 39.3 | 350 | 450.82 |
92 | vpgnet_pruned_0_99 | 480x640 | 2.5 | 350 | 443.39 |
93 | yolov2_voc | 448x448 | 34 | 350 | 825.49 |
94 | yolov2_voc_pruned_0_66 | 448x448 | 11.6 | 350 | 1208.89 |
95 | yolov2_voc_pruned_0_71 | 448x448 | 9.9 | 350 | 1351.28 |
96 | yolov2_voc_pruned_0_77 | 448x448 | 7.8 | 350 | 1327.28 |
97 | yolov3_adas_pruned_0_9 | 256x512 | 5.5 | 350 | 1120.72 |
98 | yolov3_bdd | 288x512 | 53.7 | 350 | 320.06 |
99 | yolov3_voc | 416x416 | 65.4 | 350 | 391.59 |
100 | yolov3_voc_tf | 416x416 | 65.6 | 350 | 392.26 |
101 | yolov4_leaky_spp_m | 416x416 | 60.1 | 350 | 326.98 |
102 | yolov4_leaky_spp_m_pruned_0_36 | 416x416 | 38.2 | 350 | 313.13 |
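
The fps figures alone do not show how much compute the DPU actually delivers, because the networks differ widely in size. A rough way to compare rows is to multiply the per-frame workload by the frame rate. The sketch below does this for three rows copied from the table above. It assumes that the GOPS column gives the workload of a single frame in giga-operations, which the table does not state explicitly; the function names are illustrative and not part of any Xilinx tool.

```python
# Rows copied from the 6PE table above: (network, GOPS column, fps column).
ROWS = [
    ("resnet50", 7.7, 3738.35),
    ("mobilenet_v2", 0.6, 3752.71),
    ("vgg_16_tf", 31.0, 505.57),
]


def effective_tops(gops_per_frame: float, fps: float) -> float:
    """Delivered compute in tera-operations per second.

    Assumption: the GOPS column is the workload of one frame, so
    workload * frames/sec gives giga-operations per second.
    """
    return gops_per_frame * fps / 1000.0


def summarize(rows):
    """Map each network name to its effective throughput, rounded to 2 places."""
    return {name: round(effective_tops(gops, fps), 2) for name, gops, fps in rows}


if __name__ == "__main__":
    for name, tops in summarize(ROWS).items():
        print(f"{name:>14}: {tops:6.2f} TOP/s")
```

Under that assumption, resnet50 works out to roughly 28.8 TOP/s while mobilenet_v2 reaches only about 2.3 TOP/s at a similar frame rate, which suggests that small networks are limited by something other than raw compute.
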
VCK5000 Performance with 8PE@350 MHz DPUCVDX8H
The following table lists the throughput performance (in frames/sec or fps) for various neural network samples on the Versal ACAP VCK5000 Gen3x16 with DPUCVDX8H running at 8PE@350 MHz.
No | Neural Network | Input Size | GOPS | DPU Frequency (MHz) | Performance (fps, multi-threaded) |
---|---|---|---|---|---|
1 | densebox_320_320 | 320x320 | 0.49 | 350 | 5972.1 |
2 | densebox_640_360 | 360x640 | 1.1 | 350 | 3054.66 |
3 | drunet_pt | 528x608 | 2.59 | 350 | 204.256 |
4 | ENet_cityscapes_pt | 512x1024 | 8.6 | 350 | 147.573 |
5 | face_landmark | 96x72 | 0.14 | 350 | 20320.8 |
6 | face-quality | 80x60 | 0.06 | 350 | 31443.7 |
7 | face-quality_pt | 80x60 | 0.06 | 350 | 31639.6 |
8 | FairMot_pt | 640x480 | 36 | 350 | 500.229 |
9 | fpn | 256x512 | 8.9 | 350 | 1034.06 |
10 | FPN_Res18_Medical_segmentation | 320x320 | 45.3 | 350 | 554.919 |
11 | FPN-resnet18_covid19-seg_pt | 352x352 | 22.7 | 350 | 1176.98 |
12 | inception_v1 | 224x224 | 3.2 | 350 | 3967.01 |
13 | inception_v1_tf | 224x224 | 3 | 350 | 4204.56 |
14 | medical_seg_cell_tf2 | 128x128 | 5.3 | 350 | 1511.86 |
15 | MLPerf_resnet50_v1.5_tf | 224x224 | 8.19 | 350 | 4505.04 |
16 | mlperf_ssd_resnet34_tf | 1200x1200 | 433 | 350 | 75.9417 |
17 | multi_task | 288x512 | 14.8 | 350 | 715.768 |
18 | openpose_pruned_0_3 | 368x368 | 49.9 | 350 | 170.05 |
19 | plate_detection | 320x320 | 0.49 | 350 | 8169.45 |
20 | plate_num | 96x288 | 1.75 | 350 | 2995.55 |
21 | refinedet_baseline | 480x360 | 123 | 350 | 285.417 |
22 | RefineDet-Medical_EDD_tf | 320x320 | 9.8 | 350 | 1272.02 |
23 | refinedet_pruned_0_8 | 360x480 | 25 | 350 | 667.239 |
24 | refinedet_pruned_0_92 | 360x480 | 10.1 | 350 | 854.083 |
25 | refinedet_pruned_0_96 | 360x480 | 5.1 | 350 | 880.724 |
26 | refinedet_VOC_tf | 320x320 | 81.9 | 350 | 398.164 |
27 | reid | 80x160 | 0.95 | 350 | 10771.9 |
28 | resnet18 | 224x224 | 3.7 | 350 | 6621.65 |
29 | resnet50 | 224x224 | 7.7 | 350 | 4941.68 |
30 | resnet50_pt | 224x224 | 4.1 | 350 | 4529.6 |
31 | resnet50_tf2 | 224x224 | 7.7 | 350 | 4941.54 |
32 | resnet_v1_101_tf | 224x224 | 14.4 | 350 | 2975 |
33 | resnet_v1_152_tf | 224x224 | 21.8 | 350 | 2120.41 |
34 | resnet_v1_50_tf | 224x224 | 7 | 350 | 4939.12 |
35 | salsanext_pt | 64x2048 | 20.4 | 350 | 158.533 |
36 | salsanext_v2_pt | 64x2048 | 32 | 350 | 87.6567 |
37 | SemanticFPN_cityscapes_pt | 256x512 | 10 | 350 | 1089.61 |
38 | semantic_seg_citys_tf2 | 512x1024 | 54 | 350 | 118.018 |
39 | SESR_S_pt | 360x640 | 7.48 | 350 | 232.841 |
40 | sp_net | 128x224 | 0.55 | 350 | 8444.81 |
41 | squeezenet | 227x227 | 0.76 | 350 | 8768.06 |
42 | squeezenet_pt | 224x224 | 0.82 | 350 | 7579.4 |
43 | ssd_adas_pruned_0_95 | 360x480 | 6.3 | 350 | 925.384 |
44 | ssd_pedestrian_pruned_0_97 | 360x360 | 5.9 | 350 | 751.791 |
45 | ssd_resnet_50_fpn_coco_tf | 640x640 | 178.4 | 350 | 114.255 |
46 | ssd_traffic_pruned_0_9 | 360x480 | 11.6 | 350 | 823.661 |
47 | tiny_yolov3_vmss | 416x416 | 5.46 | 350 | 2590.73 |
48 | ultrafast_pt | 288x800 | 8.4 | 350 | 1373.24 |
49 | unet_chaos-CT_pt | 512x512 | 23.3 | 350 | 246.539 |
50 | vpgnet_pruned_0_99 | 480x640 | 2.5 | 350 | 619.569 |
51 | yolov2_voc | 448x448 | 34 | 350 | 961.635 |
52 | yolov2_voc_pruned_0_66 | 448x448 | 11.6 | 350 | 1531.11 |
53 | yolov2_voc_pruned_0_71 | 448x448 | 9.9 | 350 | 1764.26 |
54 | yolov2_voc_pruned_0_77 | 448x448 | 7.8 | 350 | 1535.86 |
55 | yolov3_adas_pruned_0_9 | 256x512 | 5.5 | 350 | 1441.31 |
56 | yolov3_bdd | 288x512 | 53.7 | 350 | 399.008 |
57 | yolov3_voc | 416x416 | 65.4 | 350 | 475.684 |
58 | yolov3_voc_tf | 416x416 | 65.6 | 350 | 477.083 |
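
To see how the 8PE numbers relate to the 6PE numbers, the ratio of the two fps values can be compared with the ideal 8/6 scaling. The following sketch uses fps values copied from the two tables on this page for four networks that appear in both. Note that the two configurations also differ in DPU variant (DPUCVDX8H-DWC versus DPUCVDX8H), so the ratio is only a rough indicator and not a pure measure of PE scaling; the helper names are illustrative.

```python
# fps values copied from the two tables: network -> (6PE fps, 8PE fps).
FPS = {
    "resnet50": (3738.35, 4941.68),
    "densebox_320_320": (4581.78, 5972.1),
    "face_landmark": (22446.8, 20320.8),
    "yolov3_voc": (391.59, 475.684),
}

# Ideal speedup if throughput scaled linearly with the number of PEs.
IDEAL = 8 / 6


def speedup(fps_6pe: float, fps_8pe: float) -> float:
    """Ratio of 8PE throughput to 6PE throughput."""
    return fps_8pe / fps_6pe


def scaling_report(fps_table):
    """Map each network name to its 8PE/6PE speedup, rounded to 3 places."""
    return {name: round(speedup(a, b), 3) for name, (a, b) in fps_table.items()}


if __name__ == "__main__":
    print(f"ideal scaling: {IDEAL:.3f}")
    for name, ratio in scaling_report(FPS).items():
        print(f"{name:>18}: {ratio:.3f}x ({ratio / IDEAL:.0%} of ideal)")
```

For these four rows, resnet50 and densebox_320_320 come close to the ideal ratio, yolov3_voc gains less, and face_landmark is slightly slower on the 8PE configuration.
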