VCK190 Performance - 1.3 English

Vitis AI Library User Guide (UG1354)

Document ID: UG1354
Release Date: 2021-02-03
Version: 1.3 English

The VCK190 is the first Versal AI Core series evaluation kit. It enables designers to develop solutions using AI and DSP engines capable of delivering over 100X greater compute performance than current server-class CPUs. For this release, a B8192F DPU core with batch=3 is implemented using the AI Engine.

The following table lists the throughput, in frames per second (fps), of various neural network samples on the VCK190 with the DPU running at 1333 MHz.

Table 1. VCK190 Performance
No.  Neural Network                  Input Size  GOPS   DPU Frequency (MHz)  Performance (fps, multiple threads)
1    densebox_320_320                320x320     0.49   1333                 2243.8
2    densebox_640_360                360x640     1.1    1333                 1067
3    ENet_cityscapes_pt              512x1024    8.6    1333                 48.9
4    face_landmark                   96x72       0.14   1333                 9458.9
5    face-quality                    80x60       0.06   1333                 19703.5
6    face-quality_pt                 80x60       0.06   1333                 19783.5
7    facerec_resnet20                112x96      3.5    1333                 2276.3
8    facerec-resnet20_mixed_pt       112x96      3.5    1333                 2277.1
9    facerec_resnet64                112x96      11     1333                 1143.7
10   facereid-large_pt               96x96       0.5    1333                 11714
11   facereid-small_pt               80x80       0.09   1333                 17743.8
12   fpn                             256x512     8.9    1333                 196.7
13   FPN_Res18_Medical_segmentation  320x320     45.3   1333                 186.6
14   FPN-resnet18_covid19-seg_pt     352x352     22.7   1333                 540.3
15   inception_resnet_v2_tf          299x299     26.4   1333                 299.3
16   inception_v1                    224x224     3.2    1333                 1553
17   inception_v1_tf                 224x224     3      1333                 1577.6
18   inception_v2                    224x224     4      1333                 1137.8
19   inception_v3                    299x299     11.4   1333                 591
20   inception_v3_pt                 299x299     5.7    1333                 590.7
21   inception_v3_tf                 299x299     11.5   1333                 596.2
22   inception_v3_tf2                299x299     11.5   1333                 622.1
23   inception_v4                    299x299     24.5   1333                 297.6
24   inception_v4_2016_09_09_tf      299x299     24.6   1333                 297.5
25   medical_seg_cell_tf2            128x128     5.3    1333                 1976.3
26   MLPerf_resnet50_v1.5_tf         224x224     8.19   1333                 1304
27   mlperf_ssd_resnet34_tf          1200x1200   433    1333                 26.2
28   multi_task                      288x512     14.8   1333                 210
29   openpose_pruned_0_3             368x368     49.9   1333                 43.4
30   personreid-res18_pt             176x80      1.1    1333                 4195.5
31   personreid-res50_pt             256x128     5.4    1333                 1634.6
32   plate_detection                 320x320     0.49   1333                 2563.3
33   plate_num                       96x288      1.75   1333                 904
34   refinedet_baseline              480x360     123    1333                 161.1
35   RefineDet-Medical_EDD_tf        320x320     9.8    1333                 1020.8
36   refinedet_pruned_0_8            360x480     25     1333                 516
37   refinedet_pruned_0_92           360x480     10.1   1333                 737.4
38   refinedet_pruned_0_96           360x480     5.1    1333                 890.8
39   refinedet_VOC_tf                320x320     81.9   1333                 194.4
40   reid                            80x160      0.95   1333                 4317.1
41   resnet18                        224x224     3.7    1333                 2814
42   resnet50                        224x224     7.7    1333                 1310.2
43   resnet50_pt                     224x224     4.1    1333                 1282
44   resnet50_tf2                    224x224     7.7    1333                 1287.4
45   resnet_v1_101_tf                224x224     14.4   1333                 794.7
46   resnet_v1_152_tf                224x224     21.8   1333                 559.1
47   resnet_v1_50_tf                 224x224     7      1333                 1392.4
48   salsanext_pt                    64x2048     20.4   1333                 21.5
49   SemanticFPN_cityscapes_pt       256x512     10     1333                 197.3
50   semantic_seg_citys_tf2          512x1024    54     1333                 46.5
51   sp_net                          128x224     0.55   1333                 3724.2
52   squeezenet                      227x227     0.76   1333                 1773.9
53   squeezenet_pt                   224x224     0.82   1333                 1891.9
54   ssd_adas_pruned_0_95            360x480     6.3    1333                 848.2
55   ssd_pedestrian_pruned_0_97      360x360     5.9    1333                 740.1
56   ssd_resnet_50_fpn_coco_tf       640x640     178.4  1333                 13.1
57   ssd_traffic_pruned_0_9          360x480     11.6   1333                 681.2
58   tiny_yolov3_vmss                416x416     5.46   1333                 1389.5
59   unet_chaos-CT_pt                512x512     23.3   1333                 212.4
60   vgg_16_tf                       224x224     31     1333                 336.9
61   vgg_19_tf                       224x224     39.3   1333                 300.4
62   vpgnet_pruned_0_99              480x640     2.5    1333                 703.1
63   yolov2_voc                      448x448     34     1333                 477.3
64   yolov2_voc_pruned_0_66          448x448     11.6   1333                 943.2
65   yolov2_voc_pruned_0_71          448x448     9.9    1333                 1058.2
66   yolov2_voc_pruned_0_77          448x448     7.8    1333                 1190.8
67   yolov3_adas_pruned_0_9          256x512     5.5    1333                 1083.7
68   yolov3_bdd                      288x512     53.7   1333                 221.4
69   yolov3_voc                      416x416     65.4   1333                 217
70   yolov3_voc_tf                   416x416     65.6   1333                 216.8
71   yolov4_leaky_spp_m              416x416     60.1   1333                 181.1
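
The GOPS and fps columns in Table 1 can be combined to compare how much compute each model sustains on the DPU. The following Python sketch is illustrative only. It assumes that the GOPS column gives the workload of a single inference in giga-operations per frame, which the table does not state explicitly. The helper names `effective_tops` and `frame_interval_ms` are hypothetical and are not part of the Vitis AI Library. The rows are copied from Table 1.

```python
# Selected rows copied from Table 1: (model name, GOPS, fps).
# Assumption: the GOPS column is the per-frame workload in giga-operations.
ROWS = [
    ("resnet50", 7.7, 1310.2),
    ("inception_v1", 3.2, 1553.0),
    ("yolov3_voc", 65.4, 217.0),
    ("mlperf_ssd_resnet34_tf", 433.0, 26.2),
]


def effective_tops(gops_per_frame: float, fps: float) -> float:
    """Sustained throughput in tera-operations per second.

    giga-operations per frame * frames per second = giga-operations per second;
    dividing by 1000 converts to tera-operations per second.
    """
    return gops_per_frame * fps / 1000.0


def frame_interval_ms(fps: float) -> float:
    """Average time between completed frames, in milliseconds.

    This is not the latency of a single frame: with batching and multiple
    threads, several frames can be in flight at the same time.
    """
    return 1000.0 / fps


if __name__ == "__main__":
    for name, gops, fps in ROWS:
        tops = effective_tops(gops, fps)
        interval = frame_interval_ms(fps)
        print(f"{name:<24} {tops:6.2f} TOPS  {interval:8.3f} ms between frames")
```

Under this assumption, models with a small per-frame workload (for example face-quality) can report very high fps while sustaining comparatively few operations per second, so fps alone does not indicate how fully the DPU is used.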