| Typical Operation Type in CNN | Parameters | DPUCZDX8G_ISA0_B4096_MAX_BG2 (ZCU102, ZCU104) | DPUCAHX8L_ISA0 (U50, U50LV, U280) | DPUCVDX8G_ISA1_C32B3 (VCK190) | DPUCAHX8H_ISA2 (U50, U50LV9E, U50LV10E, U280) | DPUCADF8H_ISA0 (U200, U250) | DPUCVDX8H_ISA1_F2W2 (VCK5000) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Intrinsic Parameter | | channel_parallel: 16, bank_depth: 2048 | channel_parallel: 32, bank_depth: 4096 | channel_parallel: 16, bank_depth: 16384 | channel_parallel: 16, bank_depth: 2048 | channel_parallel: 16, bank_depth: 8192 | channel_parallel: 64, bank_depth: 256 |
| conv2d | Kernel size | w, h: [1, 16] | w, h: [1, 16] | w, h: [1, 16], w * h <= 64 | w, h: [1, 16] | w, h: [1, 16] | w, h: [1, 16] |
| | Strides | w, h: [1, 8] | w, h: [1, 4] | w, h: [1, 8] | w, h: [1, 4] | w, h: [1, 8] | w, h: [1, 4] |
| | Dilation | dilation * input_channel <= 256 * channel_parallel (all targets) | | | | | |
| | Paddings | pad_left, pad_right: [0, (kernel_w - 1) * dilation_w]; pad_top, pad_bottom: [0, (kernel_h - 1) * dilation_h] (all targets) | | | | | |
| | In Size | kernel_w * kernel_h * ceil(input_channel / channel_parallel) <= bank_depth (all targets) | | | | | |
| | Out Size | output_channel <= 256 * channel_parallel (all targets) | | | | | |
| | Activation | ReLU, LeakyReLU, ReLU6 | ReLU, ReLU6 | ReLU, LeakyReLU, ReLU6, Hard-Swish, Hard-Sigmoid | ReLU, LeakyReLU, ReLU6 | ReLU, LeakyReLU | ReLU, LeakyReLU |
| | Group* (Caffe) | group == 1 (all targets) | | | | | |
| depthwise-conv2d | Kernel size | w, h: [1, 16] | w, h: {3} | w, h: [1, 256] | Not supported | Not supported | Not supported |
| | Strides | w, h: [1, 8] | w, h: [1, 2] | w, h: [1, 8] | | | |
| | Dilation | dilation * input_channel <= 256 * channel_parallel | dilation * input_channel <= 256 * channel_parallel | dilation * input_channel <= 256 * channel_parallel | | | |
| | Paddings | pad_left, pad_right: [0, (kernel_w - 1) * dilation_w]; pad_top, pad_bottom: [0, (kernel_h - 1) * dilation_h] | pad_left, pad_right: [0, (kernel_w - 1) * dilation_w]; pad_top, pad_bottom: [0, (kernel_h - 1) * dilation_h] | pad_left, pad_right: [0, 15 * dilation_w]; pad_top, pad_bottom: [0, 15 * dilation_h] | | | |
| | In Size | kernel_w * kernel_h * ceil(input_channel / channel_parallel) <= bank_depth | kernel_w * kernel_h * ceil(input_channel / channel_parallel) <= bank_depth | kernel_w * kernel_h * ceil(input_channel / channel_parallel) <= bank_depth | | | |
| | Out Size | output_channel <= 256 * channel_parallel | output_channel <= 256 * channel_parallel | output_channel <= 256 * channel_parallel | | | |
| | Activation | ReLU, ReLU6 | ReLU, ReLU6 | ReLU, ReLU6 | | | |
| | Group* (Caffe) | group == input_channel | group == input_channel | group == input_channel | | | |
| transposed-conv2d | Kernel size | kernel_w/stride_w, kernel_h/stride_h: [1, 16] (all targets) | | | | | |
| | Strides | | | | | | |
| | Paddings | pad_left, pad_right: [1, kernel_w-1]; pad_top, pad_bottom: [1, kernel_h-1] (all targets) | | | | | |
| | Out Size | output_channel <= 256 * channel_parallel (all targets) | | | | | |
| | Activation | ReLU, LeakyReLU, ReLU6 | ReLU, ReLU6 | ReLU, LeakyReLU, ReLU6, Hard-Swish, Hard-Sigmoid | ReLU, LeakyReLU, ReLU6 | ReLU, LeakyReLU | ReLU, LeakyReLU |
| depthwise-transposed-conv2d | Kernel size | kernel_w/stride_w, kernel_h/stride_h: [1, 16] | kernel_w/stride_w, kernel_h/stride_h: {3} | kernel_w/stride_w, kernel_h/stride_h: [1, 256] | Not supported | Not supported | Not supported |
| | Strides | | | | | | |
| | Paddings | pad_left, pad_right: [1, kernel_w-1]; pad_top, pad_bottom: [1, kernel_h-1] | pad_left, pad_right: [1, kernel_w-1]; pad_top, pad_bottom: [1, kernel_h-1] | pad_left, pad_right: [1, 15]; pad_top, pad_bottom: [1, 15] | | | |
| | Out Size | output_channel <= 256 * channel_parallel | output_channel <= 256 * channel_parallel | output_channel <= 256 * channel_parallel | | | |
| | Activation | ReLU, ReLU6 | ReLU, ReLU6 | ReLU, ReLU6 | | | |
| max-pooling | Kernel size | w, h: [2, 8] | w, h: {2, 3, 5, 7, 8} | w, h: [1, 256] | w, h: [1, 8] | w, h: [1, 16] | w, h: {1, 2, 3, 7} |
| | Strides | w, h: [1, 8] | w, h: [1, 8] | w, h: [1, 8] | w, h: [1, 8] | w, h: [1, 8] | w, h: [1, 8] |
| | Paddings | pad_left, pad_right: [1, kernel_w-1]; pad_top, pad_bottom: [1, kernel_h-1] | pad_left, pad_right: [1, kernel_w-1]; pad_top, pad_bottom: [1, kernel_h-1] | pad_left, pad_right: [1, 15]; pad_top, pad_bottom: [1, 15] | pad_left, pad_right: [1, kernel_w-1]; pad_top, pad_bottom: [1, kernel_h-1] | pad_left, pad_right: [1, kernel_w-1]; pad_top, pad_bottom: [1, kernel_h-1] | pad_left, pad_right: [1, kernel_w-1]; pad_top, pad_bottom: [1, kernel_h-1] |
| | Activation | ReLU | not supported | ReLU, ReLU6 | not supported | ReLU | not supported |
| average-pooling | Kernel size | w, h: [2, 8], w == h | w, h: {2, 3, 5, 7, 8}, w == h | w, h: [1, 256] | w, h: [1, 8], w == h | w, h: [1, 16] | w, h: {1, 2, 3, 7}, w == h |
| | Strides | w, h: [1, 8] | w, h: [1, 8] | w, h: [1, 8] | w, h: [1, 8] | w, h: [1, 8] | w, h: [1, 8] |
| | Paddings | pad_left, pad_right: [1, kernel_w-1]; pad_top, pad_bottom: [1, kernel_h-1] | pad_left, pad_right: [1, kernel_w-1]; pad_top, pad_bottom: [1, kernel_h-1] | pad_left, pad_right: [1, 15]; pad_top, pad_bottom: [1, 15] | pad_left, pad_right: [1, kernel_w-1]; pad_top, pad_bottom: [1, kernel_h-1] | pad_left, pad_right: [1, kernel_w-1]; pad_top, pad_bottom: [1, kernel_h-1] | pad_left, pad_right: [1, kernel_w-1]; pad_top, pad_bottom: [1, kernel_h-1] |
| | Activation | ReLU | not supported | ReLU, ReLU6 | not supported | ReLU | not supported |
| eltwise | Type | sum | sum | sum, prod | sum | sum | sum |
| | Input Channel | input_channel <= 256 * channel_parallel (all targets) | | | | | |
| | Activation | ReLU | ReLU | ReLU | ReLU | ReLU | ReLU |
| concat | | Network-specific limitation, which relates to the size of feature maps, quantization results, and compiler optimizations. (all targets) | | | | | |
| reorg | Strides | reverse == false: stride^2 * input_channel <= 256 * channel_parallel; reverse == true: input_channel <= 256 * channel_parallel (all targets) | | | | | |
| pad | In Size | input_channel <= 256 * channel_parallel (all targets) | | | | | |
| | Mode | "SYMMETRIC" ("CONSTANT" pad(value=0) would be fused into adjacent operators during the compiler optimization process) (all targets) | | | | | |
| global pooling | | Global pooling will be processed as general pooling with a kernel size equal to the input tensor size. (all targets) | | | | | |
| InnerProduct, Fully Connected, Matmul | | These operators will be transformed into conv2d operators. (all targets) | | | | | |
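
Several of the constraints above are plain arithmetic on the intrinsic parameters (channel_parallel, bank_depth) of the selected DPU. The following Python sketch shows how the target-independent conv2d checks (Dilation, In Size, Out Size) from the table could be evaluated for a candidate layer. The `DPU_INTRINSICS` dictionary and the `conv2d_constraints_ok` helper are illustrative names for this example only, not part of the Vitis AI toolchain; per-target limits such as kernel-size, stride, and padding ranges are not covered here.

```python
import math

# Intrinsic parameters taken from the "Intrinsic Parameter" row of the table above.
DPU_INTRINSICS = {
    "DPUCZDX8G_ISA0_B4096_MAX_BG2": {"channel_parallel": 16, "bank_depth": 2048},
    "DPUCAHX8L_ISA0":               {"channel_parallel": 32, "bank_depth": 4096},
    "DPUCVDX8G_ISA1_C32B3":         {"channel_parallel": 16, "bank_depth": 16384},
    "DPUCAHX8H_ISA2":               {"channel_parallel": 16, "bank_depth": 2048},
    "DPUCADF8H_ISA0":               {"channel_parallel": 16, "bank_depth": 8192},
    "DPUCVDX8H_ISA1_F2W2":          {"channel_parallel": 64, "bank_depth": 256},
}

def conv2d_constraints_ok(dpu, kernel_w, kernel_h, input_channel, output_channel, dilation=1):
    """Evaluate the conv2d constraints that are shared by all DPU targets in the table."""
    cp = DPU_INTRINSICS[dpu]["channel_parallel"]
    bd = DPU_INTRINSICS[dpu]["bank_depth"]
    return {
        # Dilation: dilation * input_channel <= 256 * channel_parallel
        "dilation": dilation * input_channel <= 256 * cp,
        # In Size: kernel_w * kernel_h * ceil(input_channel / channel_parallel) <= bank_depth
        "in_size": kernel_w * kernel_h * math.ceil(input_channel / cp) <= bd,
        # Out Size: output_channel <= 256 * channel_parallel
        "out_size": output_channel <= 256 * cp,
    }

# Example: a 3x3 convolution with 512 input and 512 output channels on the ZCU102/ZCU104 DPU.
print(conv2d_constraints_ok("DPUCZDX8G_ISA0_B4096_MAX_BG2",
                            kernel_w=3, kernel_h=3,
                            input_channel=512, output_channel=512, dilation=1))
# -> {'dilation': True, 'in_size': True, 'out_size': True}
```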