Sequential Monte Carlo
A general framework for modeling nonlinear state-space models
Sequential Monte Carlo (SMC) generalizes particle filtering and provides a flexible sampling framework for approximating the posterior distributions of nonlinear state-space models.
Nonlinear state-space model
Suppose we have a nonlinear state-space model
\[\begin{align} \mathrm{\mathbf{x}_t|\mathbf{x}_{t-1}}&\sim \mathrm{p(\cdot|\mathbf{x}_{t-1}, \theta)},\notag \\ \mathrm{\mathbf{y}_t|\mathbf{x}_t}&\sim \mathrm{p(\cdot|\mathbf{x}_t, \theta)},\notag \end{align}\]where $\mathrm{\theta}$ denotes the hyperparameters of the dynamics. To infer the model and understand its dynamics, we compute the marginal likelihood
\[\begin{align} \mathrm{p(\mathbf{y}_{1:T}|\theta)=p(\mathbf{y}_{1}|\theta) \prod_{t=2}^T p(\mathbf{y}_{t}|\mathbf{y}_{1:t-1},\theta)},\label{marginal_pdf} \end{align}\]where each factor of the marginal likelihood satisfies
\[\begin{align} \mathrm{p(\mathbf{y}_{t}|\mathbf{y}_{1:t-1},\theta)}&=\mathrm{\int p(\mathbf{y}_{t}, \mathbf{x}_{t}|\mathbf{y}_{1:t-1},\theta) d \mathbf{x}_{t}},\notag \\ &=\mathrm{\int p(\mathbf{y}_{t} | \mathbf{x}_{t},\theta) p(\mathbf{x}_{t} | \mathbf{y}_{1:t-1},\theta)d \mathbf{x}_{t}},\notag \\ &\approx \mathrm{\frac{1}{N}\sum_{i=1}^N \underbrace{p(\mathbf{y}_{t} | \mathbf{x}^{(i)}_{t},\theta)}_{w_t^{(i)}}},\label{marginal_approximation} \\ \end{align}\]where \(\mathrm{\mathbf{x}^{(i)}_{t}\sim p(\mathbf{x}_{t} | \mathbf{y}_{1:t-1},\theta)}\) are samples from the predictive distribution. To simulate the desired set of particles \(\{\mathrm{\mathbf{x}_{t}^{(i)}}\}_{i=1}^N\), we observe that
\[\begin{align} \mathrm{p(\mathbf{x}_{t} | \mathbf{y}_{1:t-1},\theta)}&=\mathrm{\int p(\mathbf{x}_t|\mathbf{x}_{t-1}, \theta) p(\mathbf{x}_{t-1}|\mathbf{y}_{1:t-1}, \theta) d \mathbf{x}_{t-1}}\notag \\ &=\mathrm{\int p(\mathbf{x}_t|\mathbf{x}_{t-1}, \theta) \frac{p(\mathbf{y}_{t-1}|\mathbf{x}_{t-1}, \theta) p(\mathbf{x}_{t-1}|\mathbf{y}_{1:t-2}, \theta)}{p(\mathbf{y}_{t-1}|\mathbf{y}_{1:t-2}, \theta)} d \mathbf{x}_{t-1}}\notag \\ &\propto \mathrm{\int p(\mathbf{x}_t|\mathbf{x}_{t-1}, \theta) p(\mathbf{y}_{t-1}|\mathbf{x}_{t-1}, \theta) p(\mathbf{x}_{t-1}|\mathbf{y}_{1:t-2}, \theta) d \mathbf{x}_{t-1}}\label{latent_simulate}.\\ \end{align}\]Suppose a set of particles \(\{\mathrm{\mathbf{x}_{t-1}^{(i)}}\}_{i=1}^N\), where \(\mathrm{\mathbf{x}^{(i)}_{t-1}\sim p(\mathbf{x}_{t-1} | \mathbf{y}_{1:t-2},\theta)}\), is available. The above relation inspires us to design the following algorithm.
Algorithm
Sample weights
Given the particles \(\{\mathrm{\mathbf{x}_{t-1}^{(i)}}\}_{i=1}^N\), compute the sample weights used for resampling and likelihood estimation
\[\begin{align} \mathrm{w_{t-1}^{(i)} = p(\mathbf{y}_{t-1} | \mathbf{x}^{(i)}_{t-1},\theta)}, \text{ where } i\in\{1,2,\cdots, N\}.\notag \\ \end{align}\]
Likelihood estimation
Approximate the likelihood at time $t-1$ following \eqref{marginal_approximation}; a more refined treatment of the log-likelihood is presented in (Andrieu et al., 2004)
\[\begin{align} \mathrm{p(\mathbf{y}_{t-1}|\mathbf{y}_{1:t-2},\theta)}&\approx \mathrm{\frac{1}{N}\sum_{i=1}^N w_{t-1}^{(i)} }.\label{unbiased_conditional_Z} \\ \end{align}\]
Resampling
Normalize $\mathrm{w_{t-1}^{(i)}}$ into \(\mathrm{\overline w_{t-1}^{(i)}}\propto \mathrm{w_{t-1}^{(i)}} = \mathrm{p(\mathbf{y}_{t-1} | \mathbf{x}^{(i)}_{t-1},\theta)}\), s.t. $\mathrm{\sum_{i=1}^N\overline w_{t-1}^{(i)}=1}$.
Resample the particles \(\{\mathrm{\mathbf{x}_{t-1}^{(i)}}\}_{i=1}^N\) according to the probabilities $\mathrm{\overline w_{t-1}^{(i)}}$.
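For intuition, this step amounts to drawing ancestor indices with replacement according to the normalized weights; a minimal sketch using np.random.choice (dedicated resampling schemes are given later):
import numpy as np

# Minimal illustration with hypothetical values: resample 5 particles by their weights.
particles = np.array([-0.3, 0.1, 0.7, 1.2, -1.0])
w_bar = np.array([0.1, 0.1, 0.4, 0.3, 0.1])
ancestors = np.random.choice(len(w_bar), size=len(w_bar), replace=True, p=w_bar)
resampled = particles[ancestors]  # heavier particles are duplicated on average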
Propagation
Draw a set of new particles following the dynamics
\[\begin{align} \mathrm{\mathbf{x}_t^{(i)}}&\sim \mathrm{p(\mathbf{x}_{t}|\mathbf{x}^{(i)}_{t-1}, \theta)}, \text{ where } i\in\{1,2,\cdots, N\}. \notag \\ \end{align}\]The algorithm described above is the bootstrap particle filter, which uses the transition dynamics as its proposal distribution and performs resampling at every step. To simulate the desired particles following Eq. \eqref{latent_simulate}, we adopt a slightly different ordering:
\(\begin{align} \cdots \rightarrow \underbrace{\text{sample weight}\rightarrow \text{resampling}}_{\mathrm{p(\mathbf{y}_{t-1} | \mathbf{x}^{(i)}_{t-1},\theta)}}\rightarrow \underbrace{\text{propagation}}_{\mathrm{p(\mathbf{x}^{(i)}_t|\mathbf{x}^{(i)}_{t-1}, \theta) }} \rightarrow \cdots \end{align}\) This ordering can be naturally rearranged into the standard formulations, as studied in Section 16.3.5 of (Särkkä, 2023) and in (Alenlöv, 2019). Additionally, for simplicity, we do not explicitly introduce the concept of (sequential) importance sampling in this presentation. We will introduce proposal-optimized variants of SMC in future blog posts.
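In code, one sweep of the weight-resample-propagate ordering reads roughly as follows; smc_sweep is a hypothetical helper whose names anticipate the model and functions defined below, with numpy and scipy.special.logsumexp imported as in the later snippets:
def smc_sweep(model, resampling, x_prev, y_prev):
    # One sweep from time t-1 to t: weight -> resample -> propagate.
    log_w = model.log_g(x=x_prev, y=y_prev)      # sample weights w_{t-1}^{(i)}
    w_bar = np.exp(log_w - logsumexp(log_w))     # normalize in the log domain
    ancestors = resampling(w_bar)                # resample ancestor indices
    return model.propagate(x_prev[ancestors])    # propagate to time t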
Combining Eqs. \eqref{marginal_pdf} and \eqref{unbiased_conditional_Z}, we see that the marginal likelihood can be estimated as follows
\[\begin{align} \mathrm{p(\mathbf{y}_{1:T}|\theta)\approx \widetilde Z=\prod_{t=1}^T \frac{1}{N}\sum_{i=1}^N w_{t}^{(i)}}, \notag \\ \end{align}\]where the unbiasedness of \(\mathrm{\widetilde Z}\) has been studied in (Andrieu et al., 2010).
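In practice the product of per-step averages underflows quickly for long series, so the estimate is accumulated in the log domain; a minimal sketch, assuming log_w is a list of unnormalized log-weight arrays, one per time step (log_marginal_likelihood is a hypothetical helper):
import numpy as np
from scipy.special import logsumexp

# log Z_tilde = sum_t [ logsumexp(log_w_t) - log(N) ]
def log_marginal_likelihood(log_w, N):
    return sum(logsumexp(lw) - np.log(N) for lw in log_w)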
Stochastic volatility models in finance
Stochastic volatility (SV) models are popular in finance for modeling the dynamics of heteroskedastic asset returns and hedging the associated risks (Creal, 2012). Consider an SV model (Doucet & Johansen, 2011)
\[\begin{align} \mathrm{g(y_t|x_t)} &= \mathrm{N(y_t|0, \beta^2 e^{x_t})}, \notag \\ \mathrm{f(x_t|x_{t-1})} &= \mathrm{N(x_t|\phi x_{t-1}, \sigma^2)}, \notag \\ \mathrm{p(x_1)} &= \mathrm{N\bigg(x_1 \bigg| 0, \frac{\sigma^2}{1-\phi^2}\bigg)}, \notag \end{align}\]where $\mathrm{y_t}$ denotes the daily S&P 500 log-return data, $\mathrm{g}$ and $\mathrm{f}$ are the observation and transition densities, and $\mathrm{x_1}$ is initialized from the stationary distribution of the latent AR(1) process.
The following code snippet implements the SV model, adapted from (Kviman et al., 2024):
import numpy as np
import numpy.random as npr
from scipy import stats
from scipy.special import logsumexp

class SV:
    def __init__(self, sigma=1., beta=.1, phi=0.99):
        self.sigma = sigma  # transition noise scale
        self.beta = beta    # observation scale
        self.phi = phi      # AR(1) coefficient

    def particle_0(self, N):
        # Sample N initial particles from the stationary distribution of the latent AR(1) process
        return np.random.normal(0, self.sigma / np.sqrt(1 - self.phi ** 2), size=(N, 1))

    def propagate(self, x):
        # Sample x_t | x_{t-1} ~ N(phi * x_{t-1}, sigma^2)
        x_next = self.phi * x.squeeze() + self.sigma * npr.normal(size=x.size)
        return x_next.reshape((-1, 1))

    def log_g(self, x, y):
        # Log observation density log N(y | 0, beta^2 * exp(x))
        return stats.norm.logpdf(y, loc=0, scale=np.sqrt(self.beta ** 2 * np.exp(x))).squeeze()
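To test the filter without market data, we can also simulate a synthetic latent path and observations from the same model; simulate below is a hypothetical helper, not part of the original code:
def simulate(model, T):
    # Draw (x_{1:T}, y_{1:T}) from the SV model defined above.
    x = np.zeros(T)
    y = np.zeros(T)
    x[0] = model.particle_0(1).item()
    y[0] = model.beta * np.exp(x[0] / 2) * np.random.normal()
    for t in range(1, T):
        x[t] = model.phi * x[t - 1] + model.sigma * np.random.normal()
        y[t] = model.beta * np.exp(x[t] / 2) * np.random.normal()
    return x, y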
Common resampling schemes (Naesseth et al., 2019) include standard multinomial resampling as well as stratified (or systematic) resampling, which reduces variance.
def multinomial_resampling(ws, size=0):
    # Draw N ancestor indices i.i.d. from Categorical(ws); size is unused, kept for interface compatibility
    u = np.random.rand(*ws.shape)
    bins = np.cumsum(ws)
    return np.digitize(u, bins)

def stratified_resampling(ws, size=0):
    # Draw one uniform per stratum [i/N, (i+1)/N) to reduce the variance of the ancestor counts
    N = len(ws)
    u = (np.arange(N) + np.random.rand(N)) / N
    bins = np.cumsum(ws)
    return np.digitize(u, bins)
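As a quick sanity check, the offspring counts from either scheme should be roughly proportional to the weights, with stratified resampling typically showing less variability across repetitions (an informal illustration, not a formal test):
ws = np.array([0.1, 0.2, 0.3, 0.4])
for scheme in (multinomial_resampling, stratified_resampling):
    counts = np.bincount(scheme(ws), minlength=len(ws))
    print(scheme.__name__, counts)  # counts roughly proportional to ws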
The marginal log-likelihood estimation can be conducted as follows
def bootstrap_filter(model, y, N, T, d, resampling, update_weights):
    # Storage for particles, normalized weights, and unnormalized log-weights
    particles = np.zeros((N, d, T))
    normalized_weights = np.zeros((N, T))
    log_weights = np.zeros((N, T))
    # Initialize marginal log-likelihood
    marg_log_likelihood = 0.0
    # Initial particles and weights
    particles[..., 0] = model.particle_0(N)
    # For this one-dimensional model, the full state enters the observation density
    log_g_t = model.log_g(x=particles[:, :, 0], y=y[0])
    normalized_weights[:, 0], log_weights[:, 0] = update_weights(log_weights[:, 0], log_g_t)
    marg_log_likelihood += logsumexp(log_weights[:, 0]) - np.log(N)
    for t in range(1, T):
        # Resampling at every step
        new_ancestors = resampling(normalized_weights[:, t - 1]).astype(int)
        normalized_weights[:, t - 1] = 1 / N  # reset weights after resampling
        log_weights[:, t - 1] = 0
        # Propagate particles through the transition dynamics
        particles[:, :, t] = model.propagate(particles[new_ancestors, :, t - 1])
        # Compute weights from the observation density
        log_g_t = model.log_g(particles[:, :, t], y[t])
        normalized_weights[:, t], log_weights[:, t] = update_weights(log_weights[:, t - 1], log_g_t)
        # Update marginal log-likelihood
        marg_log_likelihood += logsumexp(log_weights[:, t]) - np.log(N)
    return particles, normalized_weights, log_weights, marg_log_likelihood
def update_weights(log_weights, log_g_t):
    # Accumulate the log observation density (without mutating the input in place)
    log_weights = log_weights + log_g_t
    # Normalize in the log domain for numerical stability
    log_w_tilde = log_weights - logsumexp(log_weights)
    normalized_weights = np.exp(log_w_tilde)
    return normalized_weights, log_weights
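Putting the pieces together, a hypothetical end-to-end run on simulated data (with real S&P 500 log-returns one would replace y accordingly; simulate is the helper sketched earlier):
np.random.seed(0)
model = SV(sigma=1., beta=.1, phi=0.99)
T, N, d = 500, 1000, 1
_, y = simulate(model, T)  # synthetic observations in place of market data
particles, w_norm, log_w, mll = bootstrap_filter(
    model, y, N=N, T=T, d=d,
    resampling=stratified_resampling, update_weights=update_weights)
print("estimated marginal log-likelihood:", mll)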
The simulations effectively capture the volatility-clustering effect in the daily log-returns of the S&P 500. The model still has limitations, such as difficulty handling long-term dependencies, heavy tails, and high-dimensional problems, but it is a good starting point for understanding nonlinear state-space models.
- Andrieu, C., Doucet, A., Singh, S. S., & Tadić, V. B. (2004). Particle Methods for Change Detection, System Identification, and Control. Proceedings of the IEEE, 92(3), 423–438.
- Särkkä, S. (2023). Bayesian Filtering and Smoothing. Cambridge University Press.
- Alenlöv, J. (2019). Sequential Monte Carlo Methods, Lecture 4.
- Andrieu, C., Doucet, A., & Holenstein, R. (2010). Particle Markov Chain Monte Carlo Methods. Journal of the Royal Statistical Society: Series B, 72(3), 269–342.
- Creal, D. (2012). A Survey of Sequential Monte Carlo Methods for Economics and Finance. Econometric Reviews.
- Doucet, A., & Johansen, A. M. (2011). A Tutorial on Particle Filtering and Smoothing: Fifteen Years Later. In D. Crisan & B. Rozovskii (Eds.), The Oxford Handbook of Nonlinear Filtering (pp. 656–704). Oxford University Press.
- Kviman, O., Branchini, N., Elvira, V., & Lagergren, J. (2024). Variational Resampling. AISTATS.
- Naesseth, C. A., Lindsten, F., & Schön, T. B. (2019). Elements of Sequential Monte Carlo. Foundations and Trends® in Machine Learning.
- Naesseth, C. A., Linderman, S. W., Ranganath, R., & Blei, D. M. (2018). Variational Sequential Monte Carlo. AISTATS.