The Xilinx® Alveo U50 Data Center accelerator cards are Peripheral Component Interconnect Express (PCIe®) Gen3x16 compliant and Gen4x8 compatible cards featuring the Xilinx 16 nm UltraScale+ technology. In this release, DPUv3e is implemented in programmable logic for deep learning inference acceleration.
The following table lists the throughput (in frames per second, fps) of various neural network samples on the U50 Gen3x16 with DPUv3e running at 250 MHz.
No. | Neural Network | Input Size | GOPS | Performance (fps), Single Thread | Performance (fps), Multiple Threads |
---|---|---|---|---|---|
1 | inception_resnet_v2_tf | 299x299 | 26.4 | 21.9 | 46.5 |
2 | inception_v1_tf | 224x224 | 3.0 | 227.7 | 682 |
3 | inception_v3_tf | 299x299 | 11.5 | 57 | 133.3 |
4 | inception_v4_2016_09_09_tf | 299x299 | 24.6 | 28.5 | 61.3 |
5 | mobilenet_v1_0_25_128_tf | 128x128 | 0.027 | N/A | N/A |
6 | mobilenet_v1_0_5_160_tf | 160x160 | 0.15 | N/A | N/A |
7 | mobilenet_v1_1_0_224_tf | 224x224 | 1.1 | N/A | N/A |
8 | mobilenet_v2_1_0_224_tf | 224x224 | 0.60 | N/A | N/A |
9 | mobilenet_v2_1_4_224_tf | 224x224 | 1.2 | N/A | N/A |
10 | resnet_v1_101_tf | 224x224 | 14.4 | 104.9 | 247 |
11 | resnet_v1_152_tf | 224x224 | 21.8 | 74 | 165.8 |
12 | resnet_v1_50_tf | 224x224 | 7.0 | 169.9 | 460.7 |
13 | vgg_16_tf | 224x224 | 31.0 | 62.3 | 137.9 |
14 | vgg_19_tf | 224x224 | 39.3 | 52.6 | 114.4 |
15 | ssd_mobilenet_v1_coco_tf | 300x300 | 2.5 | N/A | N/A |
16 | ssd_mobilenet_v2_coco_tf | 300x300 | 3.8 | N/A | N/A |
17 | ssd_resnet_50_fpn_coco_tf | 640x640 | 178.4 | 5.8 | 15.2 |
18 | yolov3_voc_tf | 416x416 | 65.6 | 24.6 | 54.7 |
19 | mlperf_ssd_resnet34_tf | 1200x1200 | 433 | N/A | N/A |
20 | resnet50 | 224x224 | 7.7 | 166.4 | 394 |
21 | resnet18 | 224x224 | 3.7 | 334.6 | 995 |
22 | inception_v1 | 224x224 | 3.2 | 212.1 | 551 |
23 | inception_v2 | 224x224 | 4.0 | 175.5 | 426.4 |
24 | inception_v3 | 299x299 | 11.4 | 60.5 | 133.3 |
25 | inception_v4 | 299x299 | 24.5 | 29.4 | 61.5 |
26 | mobilenet_v2 | 224x224 | 0.6 | N/A | N/A |
27 | squeezenet | 227x227 | 0.76 | 166.3 | 418 |
28 | ssd_pedestrain_pruned_0_97 | 360x360 | 5.9 | 33.9 | 83.1 |
29 | ssd_traffic_pruned_0_9 | 360x480 | 11.6 | 36.3 | 91.1 |
30 | ssd_adas_pruned_0_95 | 360x480 | 6.3 | 46.5 | 118.1 |
31 | ssd_mobilenet_v2 | 360x480 | 6.6 | N/A | N/A |
32 | refinedet_pruned_0_8 | 360x480 | 25 | 28.7 | 64.4 |
33 | refinedet_pruned_0_92 | 360x480 | 10.1 | 41.3 | 97.6 |
34 | refinedet_pruned_0_96 | 360x480 | 5.1 | 41 | 96.8 |
35 | vpgnet_pruned_0_99 | 480x640 | 2.5 | 28.3 | 65.1 |
36 | fpn | 256x512 | 8.9 | 37.3 | 116.9 |
37 | sp_net | 128x224 | 0.55 | 405.5 | 1074.5 |
38 | openpose_pruned_0_3 | 368x368 | 49.9 | 8.7 | 22.3 |
39 | densebox_320_320 | 320x320 | 0.49 | 238.4 | 796.2 |
40 | densebox_640_360 | 360x640 | 1.1 | 103.3 | 360.6 |
41 | face_landmark | 96x72 | 0.14 | 2107.7 | 6631.9 |
42 | reid | 80x160 | 0.95 | 751.4 | 2301 |
43 | multi_task | 288x512 | 14.8 | 24.9 | 70.8 |
44 | yolov3_adas_pruned_0_9 | 256x512 | 5.5 | 33.1 | 75.4 |
45 | yolov3_voc | 416x416 | 65.4 | 24.5 | 54.9 |
46 | yolov3_bdd | 288x512 | 53.7 | 19.2 | 42 |
47 | yolov2_voc | 448x448 | 34 | 50.9 | 141.6 |
48 | yolov2_voc_pruned_0_66 | 448x448 | 11.6 | 63.4 | 187.5 |
49 | yolov2_voc_pruned_0_71 | 448x448 | 9.9 | 66.3 | 203 |
50 | yolov2_voc_pruned_0_77 | 448x448 | 7.8 | 70.9 | 227.9 |
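
The figures in the table can be combined into two derived metrics: the multi-thread speedup (multi-thread fps divided by single-thread fps) and the sustained compute rate (operations per frame times fps). The following is a minimal sketch, not part of the release. It assumes the GOPS column gives giga-operations per single inference, which the table does not state explicitly. The helper names `effective_gops`, `thread_speedup`, and `summarize` are hypothetical, and only a few rows are copied in for illustration.

```python
# A few rows copied from the table above:
# (model name, GOPS column, single-thread fps, multi-thread fps)
ROWS = [
    ("inception_v1_tf", 3.0, 227.7, 682.0),
    ("resnet_v1_50_tf", 7.0, 169.9, 460.7),
    ("vgg_16_tf", 31.0, 62.3, 137.9),
    ("yolov3_voc_tf", 65.6, 24.6, 54.7),
]


def effective_gops(gop_per_frame, fps):
    """Sustained compute rate in giga-operations per second.

    Assumes the GOPS column is the operation count of one inference.
    """
    return gop_per_frame * fps


def thread_speedup(single_fps, multi_fps):
    """Ratio of multi-thread to single-thread throughput."""
    return multi_fps / single_fps


def summarize(rows):
    """Return one dict of derived metrics per table row."""
    summary = []
    for name, gop, single_fps, multi_fps in rows:
        summary.append(
            {
                "name": name,
                "speedup": thread_speedup(single_fps, multi_fps),
                "gops_single": effective_gops(gop, single_fps),
                "gops_multi": effective_gops(gop, multi_fps),
            }
        )
    return summary


if __name__ == "__main__":
    for row in summarize(ROWS):
        print(
            f"{row['name']:<20} speedup x{row['speedup']:.2f}  "
            f"single {row['gops_single']:.0f} GOP/s  "
            f"multi {row['gops_multi']:.0f} GOP/s"
        )
```

Rows marked N/A have no throughput figure and should be skipped rather than treated as zero when extending `ROWS`.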