DPU Configuration - 1.1 English

The number of DPU engines instantiated in a single DPU IP is configurable. The following table lists the deep neural network features and associated parameters that the DPU supports.

Table 1. Deep Neural Network Features and Parameters Supported by DPU
Feature	Parameter	Description (channel_parallel=16)
conv2d	Kernel Sizes	kernel_w: [1, 16] kernel_h: [1, 16]
	Strides	stride_w: [1, 4] stride_h: [1, 4]
	Pad_left/Pad_right	[0, (kernel_w - 1) * dilation_w + 1]
	Pad_top/Pad_bottom	[0, (kernel_h - 1) * dilation_h + 1]
	In Size	kernel_w * kernel_h * ceil(input_channel / channel_parallel) <= 2048
	Out Size	output_channel <= 256 * channel_parallel
	Activation	ReLU, LeakyReLU, ReLU6
	Dilation	dilation * input_channel <= 256 * channel_parallel
depthwise-conv2d	Kernel Sizes	kernel_w: [1, 3] kernel_h: [1, 3]
	Strides	stride_w: [1, 2] stride_h: [1, 2]
	Pad_left/Pad_right	[1, (kernel_w - 1)]
	Pad_top/Pad_bottom	[1, (kernel_h - 1)]
	In Size	kernel_w * kernel_h * ceil(input_channel / channel_parallel) <= 2048
	Out Size	output_channel <= 256 * channel_parallel
	Activation	ReLU, ReLU6
transposed-conv2d	Kernel Sizes	kernel_w: [1, 16] kernel_h: [1, 16]
	Strides	stride_w: [1, 16] stride_h: [1, 16]
	Pad_left/Pad_right	[1, kernel_w-1]
	Pad_top/Pad_bottom	[1, kernel_h-1]
	Out Size	output_channel <= 256 * channel_parallel
	Activation	ReLU, LeakyReLU, ReLU6
depthwise-transposed-conv2d	Kernel Sizes	kernel_w: [3] kernel_h: [3]
	Strides	stride_w: [1] stride_h: [1]
	Pad_left/Pad_right	[1, kernel_w-1]
	Pad_top/Pad_bottom	[1, kernel_h-1]
	Out Size	output_channel <= 256 * channel_parallel
	Activation	ReLU, ReLU6
average-pooling	Kernel Sizes	kernel_w: [1, 8] kernel_h: [1, 8] kernel_w==kernel_h
	Strides	stride_w: [1, 8] stride_h: [1, 8]
	Pad_left/Pad_right	[1, kernel_w-1]
	Pad_top/Pad_bottom	[1, kernel_h-1]
elementwise-sum	Input Channel	input_channel <= 256 * channel_parallel [1, 8912]
	Activation	ReLU
Concat		Network-specific limitation related to the size of feature maps, quantization results, and compiler optimizations.
Fully Connected	Input Channel	input channel <= 161616
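
The conv2d rows of Table 1 can be read as a set of range checks on a layer's parameters. The following Python sketch illustrates that reading for conv2d only. It is not part of the DPU tool flow: the names `Conv2dParams` and `check_conv2d` are hypothetical, the single `dilation` in the Dilation row is assumed to mean the larger of `dilation_w` and `dilation_h`, and a layer with no activation is assumed to be acceptable.

```python
import math
from dataclasses import dataclass
from typing import List, Optional

CHANNEL_PARALLEL = 16  # the value the table's descriptions are written for


@dataclass
class Conv2dParams:
    """Hypothetical container for one conv2d layer; not a DPU or tool type."""
    kernel_w: int
    kernel_h: int
    stride_w: int
    stride_h: int
    pad_left: int
    pad_right: int
    pad_top: int
    pad_bottom: int
    input_channel: int
    output_channel: int
    dilation_w: int = 1
    dilation_h: int = 1
    activation: Optional[str] = None  # assumption: None means "no activation"


def check_conv2d(p: Conv2dParams,
                 channel_parallel: int = CHANNEL_PARALLEL) -> List[str]:
    """Return the conv2d rules from Table 1 that the layer violates."""
    violations: List[str] = []

    def require(ok: bool, rule: str) -> None:
        if not ok:
            violations.append(rule)

    # Kernel Sizes and Strides rows
    require(1 <= p.kernel_w <= 16, "kernel_w in [1, 16]")
    require(1 <= p.kernel_h <= 16, "kernel_h in [1, 16]")
    require(1 <= p.stride_w <= 4, "stride_w in [1, 4]")
    require(1 <= p.stride_h <= 4, "stride_h in [1, 4]")

    # Padding rows: upper bound is (kernel - 1) * dilation + 1
    max_pad_w = (p.kernel_w - 1) * p.dilation_w + 1
    max_pad_h = (p.kernel_h - 1) * p.dilation_h + 1
    require(0 <= p.pad_left <= max_pad_w, "pad_left in range")
    require(0 <= p.pad_right <= max_pad_w, "pad_right in range")
    require(0 <= p.pad_top <= max_pad_h, "pad_top in range")
    require(0 <= p.pad_bottom <= max_pad_h, "pad_bottom in range")

    # In Size and Out Size rows
    in_size = p.kernel_w * p.kernel_h * math.ceil(p.input_channel / channel_parallel)
    require(in_size <= 2048,
            "kernel_w * kernel_h * ceil(input_channel / channel_parallel) <= 2048")
    require(p.output_channel <= 256 * channel_parallel,
            "output_channel <= 256 * channel_parallel")

    # Dilation row (assumption: applied to the larger of the two dilations)
    dilation = max(p.dilation_w, p.dilation_h)
    require(dilation * p.input_channel <= 256 * channel_parallel,
            "dilation * input_channel <= 256 * channel_parallel")

    # Activation row
    require(p.activation in (None, "ReLU", "LeakyReLU", "ReLU6"),
            "activation in {ReLU, LeakyReLU, ReLU6}")
    return violations


if __name__ == "__main__":
    layer = Conv2dParams(kernel_w=3, kernel_h=3, stride_w=1, stride_h=1,
                         pad_left=1, pad_right=1, pad_top=1, pad_bottom=1,
                         input_channel=64, output_channel=128,
                         activation="ReLU")
    print(check_conv2d(layer))
```

The function returns a list of violated rules instead of raising on the first failure, so a single pass reports every constraint that a layer breaks. The other features in the table could be checked the same way using their own rows.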