Dense Pyramidal LK Optical Flow - 2024.1 English

Vitis Libraries

Release Date
2024-05-30
Version
2024.1 English

Optical flow is the pattern of apparent motion of image objects between two consecutive frames, caused by the movement of object or camera. It is a 2D vector field, where each vector is a displacement vector showing the movement of points from first frame to second.

Optical Flow works on the following assumptions:

  • Pixel intensities of an object do not have too many variations in consecutive frames
  • Neighboring pixels have similar motion
Consider a pixel I(x, y, t) in first frame. (Note that a new dimension, time, is added here. When working with images only, there is no need of time). The pixel moves by distance (dx, dy) in the next frame taken after time dt. Thus, since those pixels are the same and the intensity does not change, the following is true:
image92

Taking the Taylor series approximation on the right-hand side, removing common terms, and dividing by dt gives the following equation:

image93

Where image94, image95, image96 and image97.

The above equation is called the Optical Flow equation, where, fx and fy are the image gradientsand ft is the gradient along time. However, (u, v) is unknown. It is not possible to solve this equation with two unknown variables. Thus, several methods are provided to solve this problem. One method is Lucas-Kanade. Previously it was assumed that all neighboring pixels have similar motion. The Lucas-Kanade method takes a patch around the point, whose size can be defined through the ‘WINDOW_SIZE’ template parameter. Thus, all the points in that patch have the same motion. It is possible to find (fx, fy, ft ) for these points. Thus, the problem now becomes solving ‘WINDOW_SIZE * WINDOW_SIZE’ equations with two unknown variables,which is over-determined. A better solution is obtained with the “least square fit” method. Below is the final solution, which is a problem with two equations and two unknowns:

image98

This solution fails when a large motion is involved and so pyramids are used. Going up in the pyramid, small motions are removed and large motions become small motions and so by applying Lucas-Kanade, the optical flow along with the scale is obtained.

API Syntax

template< int NUM_PYR_LEVELS, int NUM_LINES, int WINSIZE, int FLOW_WIDTH, int FLOW_INT, int TYPE, int ROWS, int COLS, int NPC T, bool USE_URAM=false, int XFCVDEPTH_CURR_IMG = _XFCVDEPTH_DEFAULT, int XFCVDEPTH_NEXT_IMG = _XFCVDEPTH_DEFAULT, int XFCVDEPTH_STREAM_IN = _XFCVDEPTH_DEFAULT, int XFCVDEPTH_STREAM_OUT = _XFCVDEPTH_DEFAUL>
void densePyrOpticalFlow(
xf::cv::Mat<TYPE,ROWS,COLS,NPC, XFCVDEPTH_CURR_IMG> & _current_img,
xf::cv::Mat<TYPE,ROWS,COLS,NPC, XFCVDEPTH_NEXT_IMG> & _next_image,
xf::cv::Mat<XF_32UC1,ROWS,COLS,NPC, XFCVDEPTH_STREAM_IN> & _streamFlowin,
xf::cv::Mat<XF_32UC1,ROWS,COLS,NPC, XFCVDEPTH_STREAM_OUT> & _streamFlowout,
const int level, const unsigned char scale_up_flag, float scale_in, ap_uint<1> init_flag)

Parameter Descriptions

The following table describes the template and the function parameters.

Table 625 Table . densePyrOpticalFlow Function Parameter Descriptions
Parameter Description
NUM_PYR_LEVE LS Number of Image Pyramid levels used for the optical flow computation
NUM_LINES Number of lines to buffer for the remap algorithm – used to find the temporal gradient
WINSIZE Window Size over which Optical Flow is computed
FLOW_WIDTH, FLOW_INT

Data width and number of integer bits to define the signed flow vector data type. Integer bit includes the signed bit.

The default type is 16-bit signed word with 10 integer bits and 6 decimal bits.

TYPE Pixel type of the input image. XF_8UC1 is only the supported value.
ROWS Maximum height or number of rows to build the hardware for this kernel
COLS Maximum width or number of columns to build the hardware for this kernel
NPC Number of pixels the hardware kernel must process per clock cycle. Only XF_NPPC1, 1 pixel per cycle, is supported.
USE_URAM Enable to map some storage structures to UltraRAM
XFCVDEPTH_CURR_IMG Depth of the input image.
XFCVDEPTH_NEXT_IMG Depth of the output image.
XFCVDEPTH_STREAM_IN Depth of the input image.
XFCVDEPTH_STREAM_OUT Depth of the output image.
_current_img First input image stream
_next_img Second input image to which the optical flow is computed with respect to the first image
_streamFlow in 32-bit Packed U and V flow vectors input for optical flow. The bits from 31-16 represent the flow vector U while the bits from 15-0 represent the flow vector V.
_streamFlow out 32-bit Packed U and V flow vectors output after optical flow computation. The bits from 31-16 represent the flow vector U while the bits from 15-0 represent the flow vector V.
level Image pyramid level at which the algorithm is currently computing the optical flow.
scale_up_fla g Flag to enable the scaling-up of the flow vectors. This flag is set at the host when switching from one image pyramid level to the other.
scale_in Floating point scale up factor for the scaling-up the flow vectors. The value is (previous_rows-1)/(current_rows-1). This is not 1 when switching from one image pyramid level to the other.
init_flag Flag to initialize flow vectors to 0 in the first iteration of the highest pyramid level. This flag must be set in the first iteration of the highest pyramid level (smallest image in the pyramid). The flag must be unset for all the other iterations.

Resource Utilization

The following table summarizes the resource utilization of densePyrOpticalFlow for 1 pixel per cycle implementation, with the optical flow computed for a window size of 11 over an image size of 1920x1080 pixels. The results are after implementation in Vivado HLS 2019.1 for the Xilinx xczu9eg-ffvb1156-2L-e FPGA at 300 MHz.

Table 626 Table . densePyrOpticalFlow Function Resource Utilization Summary
Operating Mode

Operating Frequency

(MHz)

Utilization Estimate
LUTs FFs DSPs BRAMs
1 Pixel 300 32231 16596 52 215

Resource Utilization with UltraRAM Enable

The following table summarizes the resource utilization of densePyrOpticalFlow for 1 pixel per cycle implementation, with the optical flow computed for a window size of 11 over an image size of 3840X2160 pixels. The results are after implementation in Vivado HLS 2019.1 for the Xilinx xczu7ev-ffvc1156-2 FPGA at 300 MHz with UltraRAM enabled.

Table 627 Table . densePyrOpticalFlow Function Resource Utilization Summary
Operating Mode

Operating Frequency

(MHz)

Utilization Estimate
LUTs FFs DSPs BRAMs URAM
1 Pixel 300 31164 42320 81 34 23

Performance Estimate

The following table summarizes performance figures on hardware for the densePyrOpticalFlow function for 5 iterations over 5 pyramid levels scaled down by a factor of two at each level. This has been tested on the zcu102 evaluation board.

Table 628 Table . densePyrOpticalFlow Function Performance Estimate Summary
Operating Mode

Operating Frequency

(MHz)

Image Size Latency Estimate
Max (ms)
1 pixel 300 1920x1080 49.7
1 pixel 300 1280x720 22.9
1 pixel 300 1226x370 12.02