Marginal Likelihood Computation#

Recall Lemma 2#

Under the same Gaussian setup as before, we are interested in the marginal \(p(y)\):

\[\begin{align*} x &\sim \mathcal{N}(m_0, P_0) \\ y \mid x &\sim \mathcal{N}(A x + b, Q) \end{align*}\]

Then

\[\begin{align*} p(y) &= \int p(x) \; p(y|x) dx \\ &= \mathcal{N}(y; A m_0 + b, Q + A P_0 A^\top) \end{align*}\]
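Lemma 2 is easy to sanity-check numerically: sample \(x\), then \(y \mid x\), and compare the empirical moments of \(y\) against the closed form. A minimal NumPy sketch (the parameter values below are arbitrary, chosen only for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary illustrative parameters (not from the text)
m0 = np.array([1.0, -1.0])
P0 = np.array([[2.0, 0.3], [0.3, 1.0]])
A = np.array([[0.9, 0.1], [0.0, 0.8]])
b = np.array([0.5, 0.0])
Q = 0.2 * np.eye(2)

# Closed-form marginal moments from Lemma 2
mean = A @ m0 + b
cov = Q + A @ P0 @ A.T

# Monte Carlo: x ~ N(m0, P0), then y | x ~ N(Ax + b, Q)
n = 200_000
x = rng.multivariate_normal(m0, P0, size=n)
y = x @ A.T + b + rng.multivariate_normal(np.zeros(2), Q, size=n)
```

With enough samples, `y.mean(axis=0)` and `np.cov(y.T)` should agree with `mean` and `cov` up to Monte Carlo error.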

Marginal Likelihood form#

Consider the HMM:

\[\begin{align*} \pi_0(x_0) &= \mathcal{N}(x_0; m_0, P_0) \\ \tau(x_t | x_{t-1}) &= \mathcal{N}(x_t; A x_{t-1}, Q) \\ g(y_t | x_t) &= \mathcal{N}(y_t; H x_t, R) \end{align*}\]
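This HMM can be simulated by ancestral sampling: draw \(x_0 \sim \pi_0\), then alternate \(x_t \mid x_{t-1}\) and \(y_t \mid x_t\). A small NumPy sketch with made-up dimensions and parameter values (all illustrative, not from the text):

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative dimensions and parameters
dx, dy, T = 2, 1, 50
m0, P0 = np.zeros(dx), np.eye(dx)
A = np.array([[0.95, 0.1], [0.0, 0.9]])
Q = 0.1 * np.eye(dx)
H = np.array([[1.0, 0.0]])
R = 0.5 * np.eye(dy)

# Ancestral sampling: x_0 ~ pi_0, then x_t | x_{t-1} and y_t | x_t
x = rng.multivariate_normal(m0, P0)
xs, ys = [], []
for _ in range(T):
    x = A @ x + rng.multivariate_normal(np.zeros(dx), Q)
    ys.append(H @ x + rng.multivariate_normal(np.zeros(dy), R))
    xs.append(x)
```

The list `ys` then plays the role of the observation sequence \(y_{1:T}\) in everything that follows.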

We know the exact marginal likelihood \(p(y_{1:T})\) is given by

\[ p(y_{1:T}) = \int p(x_{0:T}, y_{1:T}) dx_{0:T} \]
\[ = \int \pi_0(x_0) \prod_{t=1}^T \bigg(\tau(x_t | x_{t-1}) g(y_t | x_t) \bigg) dx_{0:T} \]

Equivalently, by the chain rule of probability,

\[ p(y_{1:T}) = \prod_{t=1}^T p(y_t \mid y_{1:t-1}) \]

Where

\[ p(y_t \mid y_{1:t-1}) = \int p(y_t \mid x_t) p(x_t \mid y_{1:t-1}) \, dx_t \]

Since we have a fully Gaussian HMM setup, we can use the Kalman filter to compute the predictive distribution \(p(x_t \mid y_{1:t-1}) = \mathcal{N}(x_t; \hat{m}_t, \hat{P}_t)\), where

\[\begin{align*} \hat{m}_t &= A m_{t-1} \\ \hat{P}_t &= A P_{t-1} A^\top + Q \end{align*}\]

So that we have

\[ p(y_t \mid y_{1:t-1}) = \int \mathcal{N}(y_t; H x_t, R) \cdot \mathcal{N}(x_t; \hat{m}_t, \hat{P}_t) \, dx_t \]

By Lemma 2, this is

\[ p(y_t \mid y_{1:t-1}) = \mathcal{N}(y_t; H \hat{m}_t , H \hat{P}_t H^\top + R) = \mathcal{N}(y_t; H \hat{m}_t , S_t) \]

Where we let \(S_t = H \hat{P}_t H^\top + R\) for brevity.
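In code, it is standard to evaluate \(\log \mathcal{N}(y_t; H \hat{m}_t, S_t)\) through a Cholesky factor of \(S_t\) rather than forming \(S_t^{-1}\) explicitly, which is both cheaper and numerically more stable. A sketch (the function name and signature are my own):

```python
import numpy as np

def predictive_logpdf(y, m_pred, P_pred, H, R):
    """Evaluate log N(y; H m_pred, S) with S = H P_pred H^T + R."""
    S = H @ P_pred @ H.T + R
    r = y - H @ m_pred                       # innovation y_t - H m_hat_t
    L = np.linalg.cholesky(S)                # S = L L^T, L lower triangular
    z = np.linalg.solve(L, r)                # so that z @ z = r^T S^{-1} r
    log_det = 2.0 * np.sum(np.log(np.diag(L)))
    d = y.shape[0]
    return -0.5 * (d * np.log(2.0 * np.pi) + log_det + z @ z)
```

The same routine is reused at every filtering step when accumulating \(\log p(y_{1:T})\).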

Kalman for marginal likelihood#

The full algorithm for computing \(\log p(y_{1:T}) = \sum_{t=1}^T \log p(y_t \mid y_{1:t-1})\) is given by

  • Input: Initial moments \( m_0, P_0\) and the sequence of observations \( y_{1:T} \).

    Set \(\hat{m}_0 = m_0, \hat{P}_0 = P_0\)

  • Filtering:
    For \( t = 1, \dots, T \) do

    • Prediction step:

    \[\begin{align*} \hat{m}_t &= A m_{t-1} \\ \hat{P}_t &= A P_{t-1} A^\top + Q \end{align*}\]
    • Update step:

    \[\begin{align*} S_t &= H \hat{P}_t H^\top + R \\ K_t &= \hat{P}_t H^\top (S_t)^{-1} \\ m_t &= \hat{m}_t + K_t (y_t - H \hat{m}_t) \\ P_t &= (I - K_t H) \hat{P}_t \end{align*}\]

    End for

  • Return \( \hat{m}_{1:T}, S_{1:T}\)

And we output

\[ \log p(y_{1:T}) = \sum_{t=1}^T \log \mathcal{N}(y_t; H \hat{m}_{t}, S_t) \]
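Putting the recursions together, a minimal NumPy implementation of the algorithm above might look as follows. This is a sketch: it uses an explicit inverse in the gain for readability, where a production implementation would prefer Cholesky-based solves.

```python
import numpy as np

def kalman_loglik(y_seq, m0, P0, A, Q, H, R):
    """Log marginal likelihood log p(y_{1:T}) via the Kalman filter recursions."""
    m, P = m0, P0
    d = H.shape[0]
    ll = 0.0
    for y in y_seq:
        # Prediction step
        m_pred = A @ m
        P_pred = A @ P @ A.T + Q
        # Update step
        S = H @ P_pred @ H.T + R
        K = P_pred @ H.T @ np.linalg.inv(S)
        r = y - H @ m_pred
        m = m_pred + K @ r
        P = (np.eye(P.shape[0]) - K @ H) @ P_pred
        # Accumulate log N(y_t; H m_hat_t, S_t)
        _, logdet = np.linalg.slogdet(S)
        ll += -0.5 * (d * np.log(2 * np.pi) + logdet + r @ np.linalg.solve(S, r))
    return ll
```

For \(T = 1\) the result reduces to the closed-form \(\log \mathcal{N}(y_1; H A m_0, H (A P_0 A^\top + Q) H^\top + R)\), which gives a quick correctness check.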

Marginal Likelihood using Bootstrap Particle Filter#