In standard convolution, each input channel is convolved with its own kernel, and the results from all channels are then summed to produce the output.
Depthwise separable convolution splits the operation into two steps: depthwise convolution and pointwise convolution. Depthwise convolution processes each feature map separately, as shown on the left side of the following figure. Pointwise convolution, the second step, is the same as standard convolution with a 1x1 kernel. The parallelism of depthwise convolution is half the pixel parallelism.
Figure 1. Depthwise Convolution and Pointwise Convolution
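
The two steps can be sketched in NumPy to make the data flow concrete. This is only an illustrative software model, not the DPU implementation: the function names, the `(channels, height, width)` layout, stride 1, and the absence of padding are all assumptions made for the example. It also includes a plain standard convolution so the two approaches can be compared.

```python
import numpy as np


def depthwise_conv(x, dw_kernels):
    """Depthwise step: each channel is filtered by its own k x k kernel.

    x: (C, H, W) input feature maps; dw_kernels: (C, k, k).
    Assumes stride 1 and no padding (illustrative choices).
    """
    c, h, w = x.shape
    k = dw_kernels.shape[1]
    out = np.zeros((c, h - k + 1, w - k + 1))
    for ch in range(c):
        for i in range(h - k + 1):
            for j in range(w - k + 1):
                out[ch, i, j] = np.sum(x[ch, i:i + k, j:j + k] * dw_kernels[ch])
    return out


def pointwise_conv(x, pw_kernels):
    """Pointwise step: a 1x1 convolution that mixes channels at every pixel.

    x: (C, H, W); pw_kernels: (C_out, C).
    """
    return np.einsum("oc,chw->ohw", pw_kernels, x)


def standard_conv(x, kernels):
    """Standard convolution: per-channel products are summed over all channels.

    x: (C, H, W); kernels: (C_out, C, k, k). Stride 1, no padding.
    """
    c_out, _, k, _ = kernels.shape
    _, h, w = x.shape
    out = np.zeros((c_out, h - k + 1, w - k + 1))
    for o in range(c_out):
        for i in range(h - k + 1):
            for j in range(w - k + 1):
                out[o, i, j] = np.sum(x[:, i:i + k, j:j + k] * kernels[o])
    return out


def weight_counts(c_in, c_out, k):
    """Return (standard, depthwise separable) weight counts for one layer."""
    return c_out * c_in * k * k, c_in * k * k + c_out * c_in
```

For example, `pointwise_conv(depthwise_conv(x, dw), pw)` maps a `(C, H, W)` input to a `(C_out, H - k + 1, W - k + 1)` output. The result matches `standard_conv` when the standard kernel for output `o` and input `c` is `pw[o, c] * dw[c]`, which shows that the separable form is a factorized special case of standard convolution with fewer weights.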
DPUCZDX8G Architecture | Extra LUTs | Extra Block RAMs | Extra DSPs |
---|---|---|---|
B512(4x8x8) | 1734 | 4 | 12 |
B800(4x10x10) | 2293 | 4.5 | 15 |
B1024(8x8x8) | 2744 | 4 | 24 |
B1152(4x12x12) | 2365 | 5.5 | 18 |
B1600(8x10x10) | 3392 | 4.5 | 30 |
B2304(8x12x12) | 3943 | 5.5 | 36 |
B3136(8x14x14) | 4269 | 6.5 | 42 |
B4096(8x16x16) | 4930 | 7.5 | 48 |
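
The table rows can be captured in a small lookup for resource budgeting. The sketch below rests on two assumptions that the text above does not state: the three numbers in parentheses are read as pixel parallelism x input channel parallelism x output channel parallelism, and the number after `B` is read as the peak operations per cycle, equal to twice their product. The helper names are hypothetical.

```python
import re

# Extra resources from the table: name -> (extra LUTs, extra block RAMs, extra DSPs)
EXTRA_RESOURCES = {
    "B512(4x8x8)": (1734, 4, 12),
    "B800(4x10x10)": (2293, 4.5, 15),
    "B1024(8x8x8)": (2744, 4, 24),
    "B1152(4x12x12)": (2365, 5.5, 18),
    "B1600(8x10x10)": (3392, 4.5, 30),
    "B2304(8x12x12)": (3943, 5.5, 36),
    "B3136(8x14x14)": (4269, 6.5, 42),
    "B4096(8x16x16)": (4930, 7.5, 48),
}


def parse_arch(name):
    """Split a name such as 'B512(4x8x8)' into (peak_ops, pp, icp, ocp).

    Assumption: the parenthesized values are pixel, input channel, and
    output channel parallelism, in that order.
    """
    m = re.fullmatch(r"B(\d+)\((\d+)x(\d+)x(\d+)\)", name)
    if m is None:
        raise ValueError(f"unrecognized architecture name: {name!r}")
    peak_ops, pp, icp, ocp = (int(g) for g in m.groups())
    return peak_ops, pp, icp, ocp


def depthwise_pixel_parallelism(name):
    """Depthwise parallelism, taken as half the pixel parallelism."""
    _, pp, _, _ = parse_arch(name)
    return pp // 2


if __name__ == "__main__":
    for arch, (luts, brams, dsps) in EXTRA_RESOURCES.items():
        print(arch, depthwise_pixel_parallelism(arch), luts, brams, dsps)
```

Under these assumptions, every name in the table satisfies `2 * pp * icp * ocp == peak_ops`, which is a quick consistency check on the transcription.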