HMM Intro#
This is an intro to the Gaussian Hidden Markov Model (HMM).
Lemmas#
We restate two basic but useful lemmas here.
Lemma 1 Posterior update#
Suppose \(x \sim \mathcal{N}(m, P)\) and \(y \mid x \sim \mathcal{N}(Hx, R)\). Then
\[
p(x \mid y) = \mathcal{N}(x;\, m', P'),
\]
where
\[
\begin{aligned}
P' &= P - P H^\top (R + H P H^\top)^{-1} H P, \\
m' &= m + P H^\top (R + H P H^\top)^{-1} (y - H m).
\end{aligned}
\]
Equivalently,
\[
\begin{aligned}
P' &= (P^{-1} + H^\top R^{-1} H)^{-1}, \\
m' &= P' (P^{-1} m + H^\top R^{-1} y).
\end{aligned}
\]
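As a quick sanity check of Lemma 1, here is a minimal one-dimensional sketch (the numbers are illustrative and not from the text): both parameterizations of the posterior should give the same mean and variance.

```python
import numpy as np

# Minimal 1-D illustration of Lemma 1 (illustrative values, not from the text).
m, P = 0.0, 1.0   # prior: x ~ N(m, P)
H, R = 1.0, 0.5   # likelihood: y | x ~ N(Hx, R)
y = 1.3           # observed value

# First form (covariance form).
S = R + H * P * H
P_post = P - P * H / S * H * P
m_post = m + P * H / S * (y - H * m)

# Second form (information / precision form).
P_post2 = 1.0 / (1.0 / P + H / R * H)
m_post2 = P_post2 * (m / P + H / R * y)

assert np.allclose([m_post, P_post], [m_post2, P_post2])
print(m_post, P_post)  # posterior mean and variance, identical for both forms
```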
Lemma 1 under HMM setup#
Consider the HMM setup:
\[
x_t \mid x_{t-1} \sim \mathcal{N}(A x_{t-1}, Q), \qquad y_t \mid x_t \sim \mathcal{N}(H x_t, R).
\]
Say we are interested in \(p(x_t \mid x_{t-1}, y_t)\), which is the optimal proposal for the particle filter:
\[
p(x_t \mid x_{t-1}, y_t) = \mathcal{N}(x_t;\, \tilde{m}_t, \tilde{P}).
\]
Applying Lemma 1 with \(m = A x_{t-1}\) and \(P = Q\), we can write \(\tilde{P}\) and \(\tilde{m}_t\) in two ways.

First way (shorter):
\[
\begin{aligned}
\tilde{P} &= Q - Q H^\top (R + H Q H^\top)^{-1} H Q, \\
\tilde{m}_t &= A x_{t-1} + Q H^\top (R + H Q H^\top)^{-1} (y_t - H A x_{t-1}).
\end{aligned}
\]
Second way:
\[
\begin{aligned}
\tilde{P} &= (Q^{-1} + H^\top R^{-1} H)^{-1}, \\
\tilde{m}_t &= \tilde{P}\,(Q^{-1} A x_{t-1} + H^\top R^{-1} y_t).
\end{aligned}
\]
This is an example implementation:
import numpy as np
def method1(A, Q, H, R, X_t_minus_1, Y):
    """
    Computes tilde_P and tilde_m_t using Lemma 1 (first way):
        tilde_P   = Q - Q H^T (R + H Q H^T)^{-1} H Q
        tilde_m_t = A x_{t-1} + Q H^T (R + H Q H^T)^{-1} (y_t - H A x_{t-1})
    X_t_minus_1 has shape (N, d) (one particle per row), Y has shape (m,).
    """
    HQ = H @ Q
    S = R + HQ @ H.T
    S_inv = np.linalg.inv(S)
    tilde_P = Q - Q @ H.T @ S_inv @ HQ
    AX = X_t_minus_1 @ A.T                           # (N, d)
    HAX = AX @ H.T                                   # (N, m)
    tilde_m = AX + (Y - HAX) @ (Q @ H.T @ S_inv).T   # (N, d)
    return tilde_P, tilde_m
def method2(A, Q, H, R, X_t_minus_1, Y):
    """
    Computes tilde_P and tilde_m_t using Lemma 1 (second way):
        tilde_P   = (Q^{-1} + H^T R^{-1} H)^{-1}
        tilde_m_t = tilde_P (Q^{-1} A x_{t-1} + H^T R^{-1} y_t)
    """
    Q_inv = np.linalg.inv(Q)
    R_inv = np.linalg.inv(R)
    tilde_P_inv = Q_inv + H.T @ R_inv @ H
    tilde_P = np.linalg.inv(tilde_P_inv)
    AX = X_t_minus_1 @ A.T                                 # (N, d)
    tilde_m = (AX @ Q_inv.T + Y @ R_inv @ H) @ tilde_P.T   # (N, d)
    return tilde_P, tilde_m
The two methods should agree up to numerical error, except for ill-conditioned parameters (the second way requires explicitly inverting \(Q\) and \(R\)).
We check that they agree on the following setup:
print('checking if two methods agree on random input')
I2 = np.eye(2)
I4 = np.eye(4)
kappa = 0.1
r = 0.1
A = np.block([
    [I2, kappa * I2],
    [np.zeros((2, 2)), 0.99 * I2]
])
Q = np.block([
    [(kappa**3 / 3) * I2, (kappa**2 / 2) * I2],
    [(kappa**2 / 2) * I2, kappa * I2]
])
H = I4
R = r * I4
X_t_minus_1, Y = np.random.randn(1, 4), np.random.randn(4)
output1 = method1(A, Q, H, R, X_t_minus_1, Y)
output2 = method2(A, Q, H, R, X_t_minus_1, Y)
print(output1)
print(output2)
checking if two methods agree on random input
(array([[0.0002079 , 0. , 0.0024948 , 0. ],
[0. , 0.0002079 , 0. , 0.0024948 ],
[0.0024948 , 0. , 0.04993763, 0. ],
[0. , 0.0024948 , 0. , 0.04993763]]), array([[ 0.93561215, -2.2784819 , -0.85038407, -0.315009 ]]))
(array([[0.0002079 , 0. , 0.0024948 , 0. ],
[0. , 0.0002079 , 0. , 0.0024948 ],
[0.0024948 , 0. , 0.04993763, 0. ],
[0. , 0.0024948 , 0. , 0.04993763]]), array([[ 0.93561215, -2.2784819 , -0.85038407, -0.315009 ]]))
Lemma 2 Marginal Likelihood#
Under the same setup as Lemma 1, i.e. \(x \sim \mathcal{N}(m, P)\) and \(y \mid x \sim \mathcal{N}(Hx, R)\), we are interested in the marginal \(p(y)\). Then
\[
p(y) = \mathcal{N}(y;\; Hm,\; H P H^\top + R).
\]
Under the HMM setup this gives \(p(y_t \mid x_{t-1}) = \mathcal{N}(y_t;\; H A x_{t-1},\; H Q H^\top + R)\), which is useful for the marginal likelihood update.
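As a small sketch of Lemma 2, we can evaluate this marginal likelihood for the `A`, `Q`, `H`, `R`, `X_t_minus_1`, `Y` defined in the check above (assumed still in scope); the use of `scipy.stats.multivariate_normal` here is my own choice, not something used in the original code.

```python
from scipy.stats import multivariate_normal

# Marginal likelihood p(y_t | x_{t-1}) = N(y_t; H A x_{t-1}, H Q H^T + R),
# evaluated for the example parameters defined above (assumed in scope).
mean = H @ A @ X_t_minus_1[0]   # H A x_{t-1}
cov = H @ Q @ H.T + R           # H Q H^T + R
log_py = multivariate_normal(mean=mean, cov=cov).logpdf(Y)
print(log_py)
```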
Conclusion#
With these two lemmas, we are equipped to derive the Kalman filter. Again consider the HMM setup:
\[
x_t \mid x_{t-1} \sim \mathcal{N}(A x_{t-1}, Q), \qquad y_t \mid x_t \sim \mathcal{N}(H x_t, R).
\]
At each time step \(t\), the Kalman filter computes two distributions.
Given \(y_{1:t-1}\), one obtains the predictive distribution \(p(x_t \mid y_{1:t-1}) = \mathcal{N}(x_t; \hat{m}_t, \hat{P}_t)\); then, using the new observation \(y_t\), one performs an update and obtains the posterior \(p(x_t \mid y_{1:t}) = \mathcal{N}(x_t; m_t, P_t)\).
Assuming \(x_{t-1} \mid y_{1:t-1} \sim \mathcal{N}(m_{t-1}, P_{t-1})\), then using Lemma 2 (with \(m \to m_{t-1}\), \(P \to P_{t-1}\), \(H \to A\), \(R \to Q\)) we have
\[
p(x_t \mid y_{1:t-1}) = \mathcal{N}(x_t;\; A m_{t-1},\; A P_{t-1} A^\top + Q).
\]
So we set
\[
\hat{m}_t = A m_{t-1}, \qquad \hat{P}_t = A P_{t-1} A^\top + Q.
\]
This is the prediction step.
Now given \(x_{t} \mid y_{1:t-1} \sim \mathcal{N}(\hat{m}_t, \hat{P}_t)\), we update with \(y_t\) to obtain \(p(x_t \mid y_{1:t}) = \mathcal{N}(x_t; m_t, P_t)\).
Using Lemma 1 (with \(m \to \hat{m}_t\), \(P \to \hat{P}_t\)) we have
\[
p(x_t \mid y_{1:t}) = \mathcal{N}(x_t;\, m_t, P_t),
\]
where
\[
\begin{aligned}
P_t &= \hat{P}_t - \hat{P}_t H^\top (R + H \hat{P}_t H^\top)^{-1} H \hat{P}_t, \\
m_t &= \hat{m}_t + \hat{P}_t H^\top (R + H \hat{P}_t H^\top)^{-1} (y_t - H \hat{m}_t).
\end{aligned}
\]
Or equivalently,
\[
\begin{aligned}
P_t &= (\hat{P}_t^{-1} + H^\top R^{-1} H)^{-1}, \\
m_t &= P_t (\hat{P}_t^{-1} \hat{m}_t + H^\top R^{-1} y_t).
\end{aligned}
\]
This is the update step.
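Putting the two steps together, here is a minimal sketch of one Kalman filter step (predict, then update) using the formulas above; the function name `kalman_step` and the reuse of `A`, `Q`, `H`, `R` from the earlier example are my own choices for illustration.

```python
import numpy as np

def kalman_step(m_prev, P_prev, y, A, Q, H, R):
    """One Kalman filter step: predict via Lemma 2, update via Lemma 1."""
    # Prediction: p(x_t | y_{1:t-1}) = N(m_hat, P_hat)
    m_hat = A @ m_prev
    P_hat = A @ P_prev @ A.T + Q
    # Update: p(x_t | y_{1:t}) = N(m, P)
    S = R + H @ P_hat @ H.T              # innovation covariance
    K = P_hat @ H.T @ np.linalg.inv(S)   # gain: P_hat H^T (R + H P_hat H^T)^{-1}
    m = m_hat + K @ (y - H @ m_hat)
    P = P_hat - K @ H @ P_hat
    return m, P, m_hat, P_hat

# Example usage with the A, Q, H, R defined in the check above (assumed in scope):
# m, P = np.zeros(4), np.eye(4)
# for y in ys:                           # ys: observation sequence of shape (T, 4)
#     m, P, _, _ = kalman_step(m, P, y, A, Q, H, R)
```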