The classic Kalman Filter is proposed for linear system. The state-space description of a linear system assumed to be:
where xk is the state vector at kth time instant,
constant (known) Ak
is an nxn state transition matrix, constant
(known) Bk is an nxm control input matrix, constant (known)
Γk is an nxp system noise input matrix, constant (known)
Hk is a qxn measurement matrix, constant (known) with 1≤ m, p,
q ≤ n, {u:sub:k} a (known) sequence of m vectors (called a
deterministic input sequence), and and are
respectively, (unknown) system and observation noise sequences, with
known statistical information such as mean, variance, and covariance.
The Kalman filter assumes the following:
- and are assumed to be sequences of zero-mean Gaussian (or normal) white noise. That is, and , where δkl is a Kronecker Delta function, and Qk and Rk are positive definite matrices, E(u) is an expectation of random variable u.
- The initial state x0 is also assumed to be independent of and , that is .
The representation means the estimate of x at time instant k using all the data measured till the time instant j.
The Kalman filter algorithm can be summarized as shown in the below equations:
Initialization
Time Update / Predict
Measurement Update/Correction
Where Pk,j is an estimate error covariance nxn matrix, Gk is Kalman gain nxq matrix, and k=1, 2,..
Computation Strategy
The numerical accuracy of the Kalman filter covariance measurement update is a concern for implementation, since it differentiates two positive definite arrays. This is a potential problem if finite precision is used for computation. This design uses UDU factorization of P to address the numerical accuracy/stability problems.
Example for Kalman Filter
//Control Flag
INIT_EN = 1; TIMEUPDATE_EN = 2; MEASUPDATE_EN = 4;
XOUT_EN_TU = 8; UDOUT_EN_TU = 16; XOUT_EN_MU = 32;
UDOUT_EN_MU = 64; EKF_MEM_OPT = 128;
//Load A_mat,B_mat,Uq_mat,Dq_mat,H_mat,X0_mat,U0_mat,D0_mat,R_mat
//Initialization
KalmanFilter(A_mat, B_mat, Uq_mat, Dq_mat, H_mat, X0_mat, U0_mat, D0_mat, R_mat, u_mat, y_mat, Xout_mat, Uout_mat, Dout_mat, INIT_EN);
for(int iteration=0; iteration< count; iteration++)
{
//Load u_mat (control input)
for(int index=0; index <C_CTRL; index ++)
u_mat.write_float(index, control_input[index]);
//Time Update
KalmanFilter(A_mat, B_mat, Uq_mat, Dq_mat, H_mat, X0_mat, U0_mat, D0_mat, R_mat, u_mat, y_mat, Xout_mat, Uout_mat, Dout_mat, TIMEUPDATE_EN + XOUT_EN_TU + UDOUT_EN_TU);
//Load y_mat (measurement vector)
for(int index =0; index <M_MEAS; index ++)
y_mat.write_float(index, control_input[index]);
//Measurement Update
KalmanFilter(A_mat, B_mat, Uq_mat, Dq_mat, H_mat, X0_mat, U0_mat, D0_mat, R_mat, u_mat, y_mat, Xout_mat, Uout_mat, Dout_mat, MEASUPDATE_EN + XOUT_EN_MU + UDOUT_EN_MU);
}
API Syntax
template<int N_STATE, int M_MEAS, int C_CTRL, Int MTU, int MMU, bool USE_URAM=0, bool EKF_EN=0, int TYPE, int NPC, int XFCVDEPTH_A = _XFCVDEPTH_DEFAULT, int XFCVDEPTH_B = _XFCVDEPTH_DEFAULT, int XFCVDEPTH_UQ = _XFCVDEPTH_DEFAULT, int XFCVDEPTH_DQ = _XFCVDEPTH_DEFAULT, int XFCVDEPTH_H = _XFCVDEPTH_DEFAULT, int XFCVDEPTH_X0 = _XFCVDEPTH_DEFAULT, int XFCVDEPTH_U0 = _XFCVDEPTH_DEFAULT, int XFCVDEPTH_D0 = _XFCVDEPTH_DEFAULT, int XFCVDEPTH_R = _XFCVDEPTH_DEFAULT, int XFCVDEPTH_U = _XFCVDEPTH_DEFAULT, int XFCVDEPTH_Y = _XFCVDEPTH_DEFAULT, int XFCVDEPTH_XOUT = _XFCVDEPTH_DEFAULT, int XFCVDEPTH_UOUT = _XFCVDEPTH_DEFAULT, int XFCVDEPTH_DOUT = _XFCVDEPTH_DEFAULT>
void KalmanFilter ( xf::cv::Mat<TYPE, N_STATE, N_STATE, NPC, XFCVDEPTH_A> & A_mat,
#if KF_C!=0
xf::cv::Mat<TYPE, N_STATE, C_CTRL, NPC, XFCVDEPTH_B> & B_mat,
#endif
xf::cv::Mat<TYPE, N_STATE, N_STATE, NPC, XFCVDEPTH_UQ> & Uq_mat,
xf::cv::Mat<TYPE, N_STATE, 1, NPC, XFCVDEPTH_DQ> & Dq_mat,
xf::cv::Mat<TYPE, M_MEAS, N_STATE, NPC, XFCVDEPTH_H> & H_mat,
xf::cv::Mat<TYPE, N_STATE, 1, NPC, XFCVDEPTH_X0> & X0_mat,
xf::cv::Mat<TYPE, N_STATE, N_STATE, NPC, XFCVDEPTH_U0> & U0_mat,
xf::cv::Mat<TYPE, N_STATE, 1, NPC, XFCVDEPTH_D0> & D0_mat,
xf::cv::Mat<TYPE, M_MEAS, 1, NPC, XFCVDEPTH_R> & R_mat,
#if KF_C!=0
xf::cv::Mat<TYPE, C_CTRL, 1, NPC, XFCVDEPTH_U> & u_mat,
#endif
xf::cv::Mat<TYPE, M_MEAS, 1, NPC, XFCVDEPTH_Y> & y_mat,
xf::cv::Mat<TYPE, N_STATE, 1, NPC, XFCVDEPTH_XOUT> & Xout_mat,
xf::cv::Mat<TYPE, N_STATE, N_STATE, NPC, XFCVDEPTH_UOUT> & Uout_mat,
xf::cv::Mat<TYPE, N_STATE, 1, NPC, XFCVDEPTH_DOUT> & Dout_mat,
unsigned char flag)
Parameter Descriptions
Table . Kalman Filter Parameter Description
Parameter Used (?) or Unused (X) Description Initialization Time Update Measurement Update N_STATE ? ? ? Number of state variable; possible options are 1 to 128 M_MEAS ? ? ? Number of measurement variable; possible options are 1 to 128; M_MEAS must be less than or equal to N_STATE. In case of Extended Kalman Filter(EKF), M_MEAS should be 1. C_CTRL ? ? ? Number of control variable; possible options are 0 to 128; C_CTRL must be less than or equal to N_STATE. In case of EKF, C_CTRL should be 1. MTU ? ? ? Number of multipliers used in time update; possible options are 1 to 128; MTU must be less than or equal to N_STATE MMU ? ? ? Number of multipliers used in Measurement update; possible options are 1 to 128; MMU must be less than or equal to N_STATE USE_URAM ? ? ? URAM enable; possible options are 0 and 1 EKF_EN ? ? ? Extended Kalman Filter Enable; possible options are 0 and 1 TYPE ? ? ? Type of input pixel. Currently, only XF_32FC1 is supported. NPC ? ? ? Number of pixels to be processed per cycle; possible option is XF_NPPC1 (NOT relevant for this function) A_mat ? X X Transition matrix A. In case of EKF, Jacobian Matrix F is mapped to A_mat. B_mat ? X X Control matrix B. In case of KF, B_mat argument is not required when C_CTRL=0. And in case of EKF, Dummy matrix with size (N_STATE x 1) is mapped to B_mat. Uq_mat ? X X U matrix for Process noise covariance matrix Q Dq_mat ? X X D matrix for Process noise covariance matrix Q(only diagonal elements) H_mat ? X X Measurement Matrix H. In case of EKF, Jacobian Matrix H is mapped to H_mat. X0_mat ? X X Initial state matrix. In case of EKF, state transition function f is mapped to X0_mat. U0_mat ? X X U matrix for initial error estimate covariance matrix P D0_mat ? X X D matrix for initial error estimate covariance matrix P(only diagonal elements) R_mat ? X X Measurement noise covariance matrix R(only diagonal elements). In case of EKF, input only one value of R since M_MEAS=1. u_mat X ? X Control input vector. In case of KF, u_mat argument is not required when C_CTRL=0. And in case of EKF, observation function h is mapped to u_mat. y_mat X X ? Measurement vector. In case of EKF, input only one measurement since M_MEAS=1. Xout_mat X ? ? Output state matrix Uout_mat X ? ? U matrix for output error estimate covariance matrix P Dout_mat X ? ? D matrix for output error estimate covariance matrix P(only diagonal elements) flag ? ? ? Control flag register All U, D counterparts of all initialized matrices (Q and P) are obtained using U-D factorization
Flag bit | Description |
---|---|
0 | Initialization enable |
1 | Time update enable |
2 | Measurement update enable |
3 | Xout enable for time update |
4 | Uout/D:sub:out enable for time update |
5 | Xout enables for measurement update |
6 | Uout/D:sub:out enable for measurement update |
7 | Read optimization (Uq_mat, Dq_mat, U0_mat, D0_mat and R_mat) for Extended Kalman Filter |
Resource Utilization
The following table summarizes the resource utilization of the kernel in different configurations, generated using Vivado HLS 2019.1 tool for the Xilinx Xczu9eg-ffvb1156-1 FPGA.
Name | Resource Utilization | ||
---|---|---|---|
N_STATE=128; C_CTRL=128; M_MEAS=128; MTU=24; MMU=24 | N_STATE=64; C_CTRL=64; M_MEAS=12;MTU=16;MMU=16 | N_STATE=5; C_CTRL=1; M_MEAS=3;MTU=2;MMU=2 | |
300 MHz | 300 MHz | 300 MHz | |
BRAM_18K | 387 | 142 | 24 |
DSP48E | 896 | 548 | 87 |
FF | 208084 | 128262 | 34887 |
LUT | 113556 | 70942 | 18141 |
The following table shows the resource utilization of the kernel for a configuration with USE_URAM enable, generated using Vivado HLS 2019.1 for the Xilinx xczu7ev-ffvc1156-2-e FPGA.
Resource | Resource Utilization (N_STATE=64; C_CTRL=64; M_MEAS=12; MTU=4; MMU=4) (300 MHz) (ms) |
---|---|
BRAM_18K | 30 |
DSP48E | 284 |
FF | 99210 |
LUT | 53939 |
URAM | 11 |
Performance Estimate
The following table shows the performance of kernel for different configurations, as generated using Vivado HLS 2019.1 tool for the Xilinx® Xczu9eg-ffvb1156-1, for one iteration. Latency estimate is calculated by taking average latency of 100 iteration.
Operating Mode | Latency Estimate | |
---|---|---|
Operating Frequency (MHz) | Latency (ms) | |
N_STATE=128; C_CTRL=128; M_MEAS=128; MTU=24; MMU=24 | 300 | 0.7 |
N_STATE=64; C_CTRL=64; M_MEAS=12; MTU=16; MMU=16 | 300 | 0.12 |
N_STATE=5; C_CTRL=1; M_MEAS=3; MTU=2; MMU=2 | 300 | 0.04 |
The following table shows the performance of kernel for a configuration with UltraRAM enable, as generated using Vivado HLS 2019.1 tool for the Xilinx xczu7ev-ffvc1156-2-e, for one iteration. Latency estimate is calculated by taking average latency of 100 iteration.
Operating Mode | Operating Frequency (MHz) | Latency (ms) |
---|---|---|
N_STATE=64; C_CTRL=64; M_MEAS=12;MTU=4;MMU= 4 | 300 | 0.25 |