The Canny edge detector finds the edges in an image or video frame. It is one of the most popular algorithms for edge detection. The Canny algorithm aims to satisfy three main criteria:
- Low error rate: A good detection of only existent edges.
- Good localization: The distance between detected edge pixels and real edge pixels has to be minimized.
- Minimal response: Only one detector response per edge.
In this algorithm, the noise in the image is reduced first by applying a Gaussian mask. The Gaussian mask used here is the average mask of size 3x3. Thereafter, gradients along x and y directions are computed using the Sobel gradient function. The gradients are used to compute the magnitude and phase of the pixels. The phase is quantized and the pixels are binned accordingly. Non-maximal suppression is applied on the pixels to remove the weaker edges.
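The gradient, magnitude, and phase steps described above can be sketched in plain C++ as follows. This is a minimal illustration, not the library's kernel code: the names `sobel3x3`, `magnitude`, `quantizePhase`, `Gradient`, and `Norm` are hypothetical, and the choice of four direction bins (0, 45, 90, and 135 degrees) is an assumption, since the text only states that the phase is quantized.

```cpp
#include <cmath>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper types; not part of the xf::cv API.
enum class Norm { L1, L2 };
struct Gradient { int gx; int gy; };

// 3x3 Sobel gradients at the centre of a window w[row][col].
inline Gradient sobel3x3(const std::uint8_t w[3][3]) {
    const int gx = (w[0][2] + 2 * w[1][2] + w[2][2]) - (w[0][0] + 2 * w[1][0] + w[2][0]);
    const int gy = (w[2][0] + 2 * w[2][1] + w[2][2]) - (w[0][0] + 2 * w[0][1] + w[0][2]);
    return {gx, gy};
}

// Gradient magnitude using either the L1 norm (|gx| + |gy|)
// or the L2 norm (sqrt(gx^2 + gy^2), rounded to the nearest integer).
inline int magnitude(Gradient g, Norm norm) {
    if (norm == Norm::L1) {
        return std::abs(g.gx) + std::abs(g.gy);
    }
    const double sum = double(g.gx) * g.gx + double(g.gy) * g.gy;
    return static_cast<int>(std::lround(std::sqrt(sum)));
}

// ASSUMPTION: the phase is binned into four directions.
// Returns 0 for ~0 deg, 1 for ~45 deg, 2 for ~90 deg, 3 for ~135 deg.
inline int quantizePhase(Gradient g) {
    const double kPi = 3.14159265358979323846;
    double deg = std::atan2(double(g.gy), double(g.gx)) * 180.0 / kPi;
    if (deg < 0.0) deg += 180.0;  // opposite directions share a bin
    if (deg < 22.5 || deg >= 157.5) return 0;
    if (deg < 67.5) return 1;
    if (deg < 112.5) return 2;
    return 3;
}
```

Non-maximal suppression would then compare each pixel's magnitude with its two neighbours along the binned direction and keep the pixel only if it is the largest of the three.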
Edge tracing is applied on the remaining pixels to draw the edges on the image. In this implementation, the Canny stages up to non-maximal suppression are in one kernel and the edge linking module is in another kernel. After non-maximal suppression, the output is represented as 2 bits per pixel, where:
- 00 represents the background
- 01 represents the weaker edge
- 11 represents the strong edge
The output is packed as 8-bit (four 2-bit pixels) in the 1 pixel per cycle operation and as 16-bit (eight 2-bit pixels) in the 8 pixel per cycle operation. For the edge linking module, the input is 64-bit, such that 32 pixels of 2 bits each are packed into one 64-bit word. Edge tracing is applied on the pixels and returns the edges in the image.
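The packing of 2-bit codes into wider words can be sketched as below. This is only an illustration: `packCodes` and `unpackCode` are hypothetical helpers, and the bit order (pixel 0 in the least-significant bits) is an assumption that the text does not confirm.

```cpp
#include <cstddef>
#include <cstdint>

// 2-bit codes as listed in the text.
constexpr std::uint8_t kBackground = 0x0;  // 00
constexpr std::uint8_t kWeakEdge   = 0x1;  // 01
constexpr std::uint8_t kStrongEdge = 0x3;  // 11

// Pack `count` 2-bit codes into one word of type Word.
// ASSUMPTION: codes[0] lands in the two least-significant bits.
// The caller must ensure 2 * count does not exceed the width of Word.
template <typename Word>
Word packCodes(const std::uint8_t* codes, std::size_t count) {
    Word word = 0;
    for (std::size_t i = 0; i < count; ++i) {
        word |= static_cast<Word>(codes[i] & 0x3u) << (2 * i);
    }
    return word;
}

// Read back the 2-bit code stored at position `index` of a packed word.
template <typename Word>
std::uint8_t unpackCode(Word word, std::size_t index) {
    return static_cast<std::uint8_t>((word >> (2 * index)) & 0x3u);
}
```

Under these assumptions, `packCodes<std::uint8_t>(codes, 4)` corresponds to the 8-bit case, `packCodes<std::uint16_t>(codes, 8)` to the 16-bit case, and `packCodes<std::uint64_t>(codes, 32)` to the 64-bit input of the edge linking module.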
API Syntax
The API syntax for Canny is:
template<int FILTER_TYPE,int NORM_TYPE,int SRC_T,int DST_T, int ROWS, int COLS,int NPC,int NPC1,bool USE_URAM=false, int XFCVDEPTH_IN_1 = _XFCVDEPTH_DEFAULT, int XFCVDEPTH_OUT_1 = _XFCVDEPTH_DEFAULT>
void Canny(xf::cv::Mat<SRC_T, ROWS, COLS, NPC, XFCVDEPTH_IN_1> & _src_mat,xf::cv::Mat<DST_T, ROWS, COLS, NPC1, XFCVDEPTH_OUT_1> & _dst_mat,unsigned char _lowthreshold,unsigned char _highthreshold)
The API syntax for EdgeTracing is:
template<int SRC_T, int DST_T, int ROWS, int COLS,int NPC_SRC,int NPC_DST,bool USE_URAM=false, int depthm = -1>
void EdgeTracing(xf::cv::Mat<SRC_T, ROWS, COLS, NPC_SRC, depthm> & _src,xf::cv::Mat<DST_T, ROWS, COLS, NPC_DST, depthm> & _dst)
Parameter Descriptions
The following table describes the xf::cv::Canny template and function parameters:
Parameter | Description |
---|---|
FILTER_TYPE | The filter window dimensions. The options are 3 and 5. |
NORM_TYPE | The type of norm used. The options for norm type are L1NORM and L2NORM. |
SRC_T | Input pixel type. Only 8-bit, unsigned, 1 channel is supported (XF_8UC1) |
DST_T | Output pixel type. Only XF_2UC1 is supported. When NPC=XF_NPPC1, the output is 8-bit, with four 2-bit pixel values packed into 8 bits. When NPC=XF_NPPC8, the output is 16-bit, with eight 2-bit pixel values packed into 16 bits. |
ROWS | Maximum height of input and output image |
COLS | Maximum width of input and output image (must be a multiple of 8, in case of 8 pixel mode) |
NPC | Number of pixels to be processed per cycle; possible options are XF_NPPC1 and XF_NPPC8 for 1-pixel and 8-pixel operations respectively. In XF_NPPC1, the output image pixels are packed and the precision is XF_NPPC4. In XF_NPPC8, the output pixel precision is XF_NPPC8. |
NPC1 | The output NPC, which is 32: thirty-two 2-bit pixels are packed into a 64-bit pointer. |
USE_URAM | Enable to map some storage structures to URAM |
XFCVDEPTH_IN_1 | Depth of input image |
XFCVDEPTH_OUT_1 | Depth of output image |
_src_mat | Input image |
_dst_mat | Output image |
_lowthreshold | The lower value of threshold for binary thresholding. |
_highthreshold | The higher value of threshold for binary thresholding. |
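The role of the two thresholds can be illustrated with a short sketch that maps a gradient magnitude to one of the 2-bit output codes. The function name `classifyPixel` is hypothetical, and the strict comparisons follow the usual double-thresholding convention, which is an assumption here because the table does not state how boundary values are treated.

```cpp
#include <cstdint>

// Map the magnitude of a pixel that survived non-maximal suppression
// to a 2-bit code: 0x3 (11) strong edge, 0x1 (01) weaker edge,
// 0x0 (00) background.
// ASSUMPTION: values equal to a threshold fall into the lower class.
inline std::uint8_t classifyPixel(int magnitude,
                                  std::uint8_t lowThreshold,
                                  std::uint8_t highThreshold) {
    if (magnitude > highThreshold) return 0x3;  // strong edge
    if (magnitude > lowThreshold)  return 0x1;  // weaker edge
    return 0x0;                                 // background
}
```

For example, with `_lowthreshold = 30` and `_highthreshold = 100`, a magnitude of 200 would be classified as a strong edge and a magnitude of 50 as a weaker edge under these assumptions.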
The following table describes the EdgeTracing template and function parameters:
Parameter | Description |
---|---|
SRC_T | Input pixel type |
DST_T | Output pixel type |
ROWS | Maximum height of input and output image |
COLS | Maximum width of input and output image (must be a multiple of 32) |
NPC_SRC | Number of pixels to be processed per cycle. Fixed to XF_NPPC32. |
NPC_DST | Number of pixels to be written to destination. Fixed to XF_NPPC8. |
USE_URAM | Enable to map storage structures to URAM. |
depthm | Depth of input image |
_src | Input image |
_dst | Output image |
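As a software reference for what edge tracing computes, the sketch below promotes weaker-edge pixels that are connected to a strong-edge pixel and discards the rest. It is not the xf::cv::EdgeTracing implementation: `traceEdges` is a hypothetical function, it works on unpacked codes (one per element) rather than on packed 64-bit words, it assumes 8-connectivity, and it assumes the output marks edges as 255 and non-edges as 0.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// `codes` holds one 2-bit code per element in row-major order:
// 0 = background, 1 = weaker edge, 3 = strong edge.
// ASSUMPTION: edge pixels are written as 255, all others as 0.
std::vector<std::uint8_t> traceEdges(const std::vector<std::uint8_t>& codes,
                                     int rows, int cols) {
    std::vector<std::uint8_t> out(codes.size(), 0);
    std::vector<std::pair<int, int>> pending;

    // Seed the search with every strong-edge pixel.
    for (int r = 0; r < rows; ++r) {
        for (int c = 0; c < cols; ++c) {
            const std::size_t idx = static_cast<std::size_t>(r) * cols + c;
            if (codes[idx] == 3) {
                out[idx] = 255;
                pending.emplace_back(r, c);
            }
        }
    }

    // Grow edges: a weaker-edge pixel is accepted when it touches an
    // already accepted pixel in its 8-neighbourhood.
    while (!pending.empty()) {
        const std::pair<int, int> p = pending.back();
        pending.pop_back();
        for (int dr = -1; dr <= 1; ++dr) {
            for (int dc = -1; dc <= 1; ++dc) {
                const int nr = p.first + dr;
                const int nc = p.second + dc;
                if (nr < 0 || nr >= rows || nc < 0 || nc >= cols) continue;
                const std::size_t idx = static_cast<std::size_t>(nr) * cols + nc;
                if (codes[idx] == 1 && out[idx] == 0) {
                    out[idx] = 255;
                    pending.emplace_back(nr, nc);
                }
            }
        }
    }
    return out;
}
```

A weaker-edge pixel with no path to a strong-edge pixel stays at 0 in the result, which is how isolated noise responses are removed.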
Resource Utilization
The following table summarizes the resource utilization of xf::cv::Canny and EdgeTracing in different configurations, generated using the Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a grayscale HD (1080x1920) image with a filter size of 3.
Name | 1 pixel, L1NORM, FS:3, 300 MHz | 1 pixel, L2NORM, FS:3, 300 MHz | 8 pixel, L1NORM, FS:3, 150 MHz | 8 pixel, L2NORM, FS:3, 150 MHz | Edge Linking, 300 MHz | Edge Linking, 150 MHz |
---|---|---|---|---|---|---|
BRAM_18K | 22 | 18 | 36 | 32 | 84 | 84 |
DSP48E | 2 | 4 | 16 | 32 | 3 | 3 |
FF | 3027 | 3507 | 4899 | 6208 | 17600 | 14356 |
LUT | 2626 | 3170 | 6518 | 9560 | 15764 | 14274 |
CLB | 606 | 708 | 1264 | 1871 | 2955 | 3241 |
The following table summarizes the resource utilization of xf::cv::Canny and EdgeTracing in different configurations, generated using the Vivado HLS 2019.1 tool for the xczu7ev-ffvc1156-2-e FPGA, to process a grayscale 4K image with a filter size of 3.
Name | 1 pixel, L1NORM, FS:3, 300 MHz | 1 pixel, L2NORM, FS:3, 300 MHz | 8 pixel, L1NORM, FS:3, 150 MHz | 8 pixel, L2NORM, FS:3, 150 MHz | Edge Linking, 300 MHz | Edge Linking, 150 MHz |
---|---|---|---|---|---|---|
BRAM_18K | 10 | 8 | 3 | 3 | 4 | 4 |
URAM | 1 | 1 | 15 | 13 | 8 | 8 |
DSP48E | 2 | 4 | 16 | 32 | 8 | 8 |
FF | 3184 | 3749 | 5006 | 7174 | 5581 | 7054 |
LUT | 2511 | 2950 | 6695 | 9906 | 4092 | 6380 |
Performance Estimate
The following table summarizes the performance of the kernel in different configurations, as generated using the Vivado HLS 2019.1 tool for the Xczu9eg-ffvb1156-1-i-es1 FPGA, to process a grayscale HD (1080x1920) image with L1NORM and a filter size of 3, including the edge linking module.
Operating Mode | Latency Estimate: Max Latency (ms) |
---|---|
1 pixel operation (300 MHz) | 10.8 |
8 pixel operation (150 MHz) | 8.5 |
Deviation from OpenCV
In the OpenCV Canny function, Gaussian blur is not applied as a pre-processing step.