input
- Parameters: shape
- Compiled to: data
- Attributes: shape, data_type
- DPU implementation: Allocate memory for input data.

convolution
- Parameters: kernel_size, stride, pad, dilation, bias_term, num_output, group
- Compiled to: conv2d (group = 1) / depthwise-conv2d (group = input channel)
- Attributes: kernel, stride, pad, pad_mode (FLOOR), dilation
- DPU implementation: If group == input channel, the convolution would be compiled to the Depthwise-Convolution Engine; if group == 1, it would be mapped to the Convolution Engine; otherwise, it would be mapped to the CPU.

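The group semantics above can be sketched in a few lines of numpy. The `conv2d` below is an illustrative reference implementation (stride 1, no padding), not the compiler's kernel: with group = 1 every output channel sees all input channels, while with group = input channel each output channel sees exactly one.

```python
import numpy as np

def conv2d(x, w, group=1):
    """Minimal stride-1, no-padding 2-D convolution with groups (a sketch).

    x: (C_in, H, W); w: (C_out, C_in // group, kH, kW).
    group == 1    -> standard convolution (Convolution Engine case)
    group == C_in -> depthwise convolution (Depthwise-Convolution Engine case)
    """
    c_in, h, wd = x.shape
    c_out, c_per_g, kh, kw = w.shape
    assert c_in % group == 0 and c_out % group == 0
    out = np.zeros((c_out, h - kh + 1, wd - kw + 1))
    for oc in range(c_out):
        g = oc // (c_out // group)                   # group of this output channel
        ics = list(range(g * c_per_g, (g + 1) * c_per_g))
        for i in range(out.shape[1]):
            for j in range(out.shape[2]):
                patch = x[ics, i:i + kh, j:j + kw]   # only this group's channels
                out[oc, i, j] = np.sum(patch * w[oc])
    return out
```

In the depthwise case each kernel has a single input channel, which is what lets the hardware process channels independently.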
deconvolution
- Parameters: kernel_size, stride, pad, dilation, bias_term, num_output, group
- Compiled to: transposed-conv2d (group = 1) / depthwise-transposed-conv2d (group = input channel)
- Attributes: kernel, stride, pad, pad_mode (FLOOR), dilation
- DPU implementation: If group == input channel, the deconvolution would be compiled to the Depthwise-Convolution Engine; if group == 1, it would be mapped to the Convolution Engine; otherwise, it would be mapped to the CPU.

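For reference, the spatial output size of a transposed convolution follows the standard formula below (a sketch assuming zero output padding, which the attributes above do not list):

```python
def deconv_out_size(in_size, kernel, stride=1, pad=0, dilation=1):
    """Standard transposed-conv2d output size (output_padding assumed 0)."""
    return (in_size - 1) * stride - 2 * pad + dilation * (kernel - 1) + 1
```

For example, a 4-wide input with a 3-wide kernel and stride 2 produces a 9-wide output.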
innerproduct
- Parameters: bias_term, num_output
- Compiled to: conv2d / matmul
- Attributes: transpose_a, transpose_b
- DPU implementation: The inner-product would be transformed to matmul, and the matmul in turn to conv2d compiled to the Convolution Engine. If the inner-product fails to be transformed, it would be implemented on the CPU.

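The transformation chain works because a fully-connected layer is the same computation as a 1x1 convolution over a 1x1 feature map whose channels hold the flattened input. A minimal numpy sketch (function names are illustrative, not the compiler's internals):

```python
import numpy as np

def innerproduct(x, w):
    """Fully-connected layer: x (C,), w (num_output, C)."""
    return w @ x

def as_conv2d_1x1(x, w):
    """Same computation expressed as a 1x1 conv over a 1x1 'image'."""
    xc = x.reshape(-1, 1, 1)                      # (C, 1, 1) feature map
    wk = w.reshape(w.shape[0], w.shape[1], 1, 1)  # (num_output, C, 1, 1) kernels
    return np.einsum('ocij,cij->o', wk, xc)       # one dot product per output channel
```

Both forms produce the same num_output-length vector, which is why the compiler can route the layer to the Convolution Engine.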
scale
- Parameters: bias_term
- Compiled to: depthwise-conv2d / scale
- DPU implementation: The scale would be transformed to depthwise-convolution where possible; otherwise, it would be mapped to the CPU.

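The scale-to-depthwise transformation is possible because a per-channel scale plus bias is exactly a depthwise convolution with a 1x1 kernel. A sketch (illustrative names):

```python
import numpy as np

def scale_layer(x, s, b):
    """Per-channel scale and bias: x (C, H, W), s (C,), b (C,)."""
    return x * s[:, None, None] + b[:, None, None]

def scale_as_depthwise(x, s, b):
    """Same result as a group == C depthwise conv2d with 1x1 kernels."""
    k = s.reshape(-1, 1, 1, 1)        # (C, 1, 1, 1): one 1x1 kernel per channel
    out = np.zeros_like(x, dtype=float)
    for c in range(x.shape[0]):       # each group has exactly one channel
        out[c] = x[c] * k[c, 0, 0, 0] + b[c]
    return out
```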
pooling
- Parameters: kernel_size, stride, global_pooling, pad, pool_method
- Compiled to: maxpool2d (pool_method = 0) / avgpool2d (pool_method = 1)
- Attributes: kernel_size, stride, global, pad, pad_mode (CEIL), count_include_pad (true), count_include_invalid (false)
- DPU implementation: Pooling Engine.

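Note the pad_mode difference from the convolution rows: pooling rounds the output size up (CEIL), so a partial window at the border still produces an output element, while FLOOR drops it. A small sketch of the output-size rule:

```python
import math

def pool_out_size(in_size, kernel, stride, pad, pad_mode="CEIL"):
    """Output spatial size of a pooling window under CEIL vs FLOOR rounding."""
    span = in_size + 2 * pad - kernel
    if pad_mode == "CEIL":
        return math.ceil(span / stride) + 1   # partial border window kept
    return math.floor(span / stride) + 1      # partial border window dropped
```

For an 8-wide input with a 3-wide kernel and stride 2, CEIL gives 4 outputs and FLOOR gives 3.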
eltwise
- Parameters: coeff = 1, operation = SUM
- Compiled to: add
- DPU implementation: Element-wise Add Engine.

concat
- Parameters: axis
- Compiled to: concat
- Attributes: axis
- DPU implementation: Xilinx reduces the overhead of concat through special read/write strategies and careful on-chip memory allocation.

relu
- Parameters: negative_slope
- Compiled to: relu / leakyrelu
- Attributes: alpha
- DPU implementation: Activations would be fused into adjacent operations such as convolution, add, etc.

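The negative_slope parameter carries over directly as the alpha attribute; a relu with negative_slope = 0 and a leakyrelu are the same function at that point. A one-line sketch:

```python
import numpy as np

def leakyrelu(x, alpha):
    """negative_slope in the source layer maps to alpha here; alpha = 0 is plain relu."""
    return np.where(x >= 0, x, alpha * x)
```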
relu6
- Compiled to: relu6

fixneuron
- Parameters: bit_width, quantize_pos
- Compiled to: fix
- Attributes: bit_width, fix_point, if_signed, round_mode
- DPU implementation: It would be divided into float2fix and fix2float during compilation; the float2fix and fix2float operations would then be fused with adjacent operations into coarse-grained operations.

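A sketch of what the float2fix / fix2float pair computes, under the assumption of round-half-away-from-zero (the actual behavior is governed by the round_mode attribute):

```python
import numpy as np

def float2fix(x, bit_width, fix_point, if_signed=True):
    """Quantize onto a fixed-point grid with step 2**-fix_point, then saturate."""
    step = 2.0 ** -fix_point
    q = np.floor(np.abs(x) / step + 0.5) * np.sign(x)  # round half away from zero
    if if_signed:
        lo, hi = -(2 ** (bit_width - 1)), 2 ** (bit_width - 1) - 1
    else:
        lo, hi = 0, 2 ** bit_width - 1
    return np.clip(q, lo, hi)

def fix2float(q, fix_point):
    """Map fixed-point integers back to floats."""
    return q * 2.0 ** -fix_point
```

With bit_width = 8 and fix_point = 4 the grid step is 1/16, so 0.3 quantizes to 5 steps and dequantizes to 0.3125.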
reshape
- Parameters: shape
- Compiled to: reshape
- Attributes: shape
- DPU implementation: These are shape-related operations; in most cases they would be removed or transformed into reshape, which does not affect the on-chip data layout. Otherwise, they would be compiled to the CPU.

permute
- Parameters: order
- Compiled to: reshape / transpose
- Attributes: order

flatten
- Parameters: axis, end_axis
- Compiled to: reshape / flatten
- Attributes: start_axis, end_axis

reorg
- Parameters: strides, reverse
- Compiled to: reorg
- Attributes: strides, reverse
- DPU implementation: If the reorg meets the hardware requirements, it would be mapped to DPU implementations.

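Reorg is a space-to-depth rearrangement: an s x s spatial block is folded into s*s channels. The axis ordering below follows the common Darknet-style reorg and is an assumption; the hardware may use a different interleaving:

```python
import numpy as np

def reorg(x, stride):
    """Space-to-depth sketch: (C, H, W) -> (C * s * s, H // s, W // s)."""
    c, h, w = x.shape
    s = stride
    assert h % s == 0 and w % s == 0
    x = x.reshape(c, h // s, s, w // s, s)
    x = x.transpose(2, 4, 0, 1, 3)               # block offsets become channels
    return x.reshape(c * s * s, h // s, w // s)
```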
deephiresize
- Parameters: scale, mode
- Compiled to: resize
- Attributes: size, mode, align_corners = false, half_pixel_centers = false
- DPU implementation: If the mode of the resize is 'BILINEAR', the cases align_corners = false, half_pixel_centers = false with size = 2, 4, or 8, and align_corners = false, half_pixel_centers = true with size = 2 or 4, can be transformed to DPU implementations (pad + depthwise-transposed-conv2d). If the mode of the resize is 'NEAREST' and the size is an integer, the resize would be mapped to DPU implementations.

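The DPU-supported 'NEAREST' case (integer size factor) reduces to repeating each pixel along both spatial axes, assuming align_corners and half_pixel_centers are both false:

```python
import numpy as np

def resize_nearest(x, size):
    """Integer-factor nearest-neighbour upsampling: x (C, H, W), size an integer."""
    assert int(size) == size and size >= 1
    return np.repeat(np.repeat(x, size, axis=1), size, axis=2)
```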
gstiling
- Parameters: strides, reverse
- Compiled to: gstiling
- Attributes: stride, reverse
- DPU implementation: If the strides of gstiling are integers, it may be mapped to special DPU read/write instructions.

slice
- Parameters: axis, slice_point
- Compiled to: strided_slice
- Attributes: begin, end, strides
- DPU implementation: It would only be compiled into a CPU implementation.

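The parameter mapping here is mechanical: the slice_point cut positions on one axis become a set of (begin, end, strides) triples, one strided_slice per output. A sketch with an illustrative helper name:

```python
def caffe_style_slice_to_strided_slices(shape, axis, slice_points):
    """Turn axis + slice_point cuts into (begin, end, strides) triples,
    one per output slice. Illustrative only, not a compiler API."""
    bounds = [0] + list(slice_points) + [shape[axis]]
    triples = []
    for b, e in zip(bounds, bounds[1:]):
        begin = [0] * len(shape)
        end = list(shape)
        strides = [1] * len(shape)      # plain slicing: unit stride everywhere
        begin[axis], end[axis] = b, e
        triples.append((begin, end, strides))
    return triples
```

Slicing a (1, 6, 4, 4) tensor at points 2 and 5 on axis 1 yields three strided_slice ops covering channels [0, 2), [2, 5), and [5, 6).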
priorbox
- Parameters: min_sizes, max_sizes, aspect_ratio, flip, clip, variance, step, offset
- Compiled to: priorbox
- Attributes: min_sizes, max_sizes, aspect_ratio, flip, clip, variance, step, offset

softmax
- Parameters: axis
- Compiled to: softmax
- Attributes: axis