MCMC Sampler Preview (Metropolis)
Markov Chain Monte Carlo (MCMC) is a family of methods for drawing samples from a probability distribution
when direct sampling is difficult. Instead of sampling independent points, MCMC constructs a
Markov chain whose long-run behavior matches the target distribution.
What problem does MCMC solve?
Suppose you have a target density \(f(x)\) on the real line (often a Bayesian posterior),
but you can’t easily draw i.i.d. samples from it. If you can evaluate \(f(x)\) (even up to a proportionality constant),
MCMC can still generate a sequence \(x_0, x_1, \dots\) that eventually behaves like draws from \(f\).
A key advantage is that many MCMC methods only require the unnormalized target:
if \(f(x) = c\,\tilde{f}(x)\) for unknown \(c\gt 0\), the constant cancels in the acceptance ratio.
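The cancellation can be checked numerically. The sketch below assumes an illustrative target (a standard normal, which the original text does not specify); the names `f_tilde` and `f` are hypothetical helpers for the unnormalized and normalized densities.

```python
import math

def f_tilde(x):
    """Unnormalized standard normal density (illustrative target, an assumption)."""
    return math.exp(-0.5 * x * x)

def f(x):
    """Normalized density: f = c * f_tilde with c = 1 / sqrt(2 * pi)."""
    return f_tilde(x) / math.sqrt(2.0 * math.pi)

x, y = 0.3, 1.7  # arbitrary current state and candidate

# The constant c appears in both numerator and denominator, so it cancels.
ratio_unnormalized = f_tilde(y) / f_tilde(x)
ratio_normalized = f(y) / f(x)

print(math.isclose(ratio_unnormalized, ratio_normalized))
```

Both ratios agree up to floating-point rounding, which is why the sampler never needs to know \(c\).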
Random-walk Metropolis (the algorithm used here)
This preview uses a 1D random-walk Metropolis sampler. From the current state \(x\), propose a candidate \(y\) by:
\[
y = x + sZ,\qquad Z\sim \mathcal{N}(0,1),
\]
where \(s\gt 0\) is the proposal step size. Because the proposal is symmetric
(\(q(y\mid x)=q(x\mid y)\)), the acceptance probability simplifies to:
\[
\alpha(x,y)=\min\!\left(1,\frac{f(y)}{f(x)}\right).
\]
Then:
- Accept: set \(x_{\text{new}}=y\) with probability \(\alpha(x,y)\).
- Reject: keep \(x_{\text{new}}=x\) otherwise (the chain stays put).
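The propose/accept/reject steps above can be sketched as follows. This is a minimal illustration, not the tool's actual code: the standard normal target, the step size `s=2.4`, the seed, and the names `metropolis_step` and `f_tilde` are all assumptions chosen for the example.

```python
import math
import random

def f_tilde(x):
    """Unnormalized standard normal density (illustrative target, an assumption)."""
    return math.exp(-0.5 * x * x)

def metropolis_step(x, f, s, rng):
    """One random-walk Metropolis update; returns (new_state, accepted)."""
    y = x + s * rng.gauss(0.0, 1.0)   # symmetric proposal y = x + s*Z
    # f(x) > 0 here because the chain only ever sits at states it accepted.
    alpha = min(1.0, f(y) / f(x))     # acceptance probability
    if rng.random() < alpha:
        return y, True                # accept: move to the candidate
    return x, False                   # reject: the chain stays put

rng = random.Random(0)
x = 0.0
chain = [x]
n_accepted = 0
for _ in range(5000):
    x, accepted = metropolis_step(x, f_tilde, s=2.4, rng=rng)
    chain.append(x)                   # a rejected step repeats the old state
    n_accepted += accepted

print("acceptance rate:", n_accepted / 5000)
```

Note that a rejected proposal still appends the current state to the chain; dropping repeats would bias the sample.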
Why logs are used in implementations
Densities can be small enough to underflow floating-point arithmetic, so implementations usually compare logs:
\[
\log\alpha = \min\bigl(0,\ \log f(y) - \log f(x)\bigr),\qquad
\text{accept if }\log U \lt \log\alpha,\ U\sim \text{Unif}(0,1).
\]
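A small sketch of why the log form matters, assuming an illustrative standard normal target; `log_f` and `accept_log` are hypothetical names. Far in the tail both densities underflow to zero, so the plain ratio is undefined, while the difference of logs is still well behaved.

```python
import math
import random

def log_f(x):
    """Log of an unnormalized standard normal (illustrative target, an assumption)."""
    return -0.5 * x * x

def accept_log(log_fx, log_fy, rng):
    """Metropolis accept/reject decision carried out entirely in log space."""
    log_ratio = log_fy - log_fx
    if log_ratio >= 0.0:
        return True                    # alpha = 1, always accept
    # 1 - random() lies in (0, 1], so the logarithm is always defined.
    return math.log(1.0 - rng.random()) < log_ratio

x, y = 40.0, 39.0                      # current state deep in the tail
print(math.exp(log_f(x)), math.exp(log_f(y)))   # both underflow to 0.0

rng = random.Random(0)
moved_inward = accept_log(log_f(x), log_f(y), rng)
print(moved_inward)                    # the move toward the mode is accepted
```

Because \(\log U \le 0\) always, comparing \(\log U\) against the raw difference \(\log f(y)-\log f(x)\) gives the same decision as comparing against \(\log\alpha\).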
Burn-in and Monte Carlo estimates
Early states may depend strongly on the starting value \(x_0\). A common practice is to discard
the first \(B\) states \(x_0,\dots,x_{B-1}\) as burn-in. Using the remaining \(M=N-B\) states, estimate:
\[
\widehat{\mu}=\frac{1}{M}\sum_{t=B}^{N-1} x_t,\qquad
\widehat{\sigma}^2=\frac{1}{M-1}\sum_{t=B}^{N-1}(x_t-\widehat{\mu})^2.
\]
The tool also reports quantiles (median and a 95% interval) computed from the retained samples.
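The estimates above can be sketched with Python's standard library. Everything specific here is an assumption for illustration: the standard normal target, the deliberately poor start `x0=10.0`, the values of `N` and `B`, and the quantile method (the default of `statistics.quantiles`, which may differ from the one the tool uses).

```python
import math
import random
import statistics

def log_f(x):
    """Log of an unnormalized standard normal (illustrative target, an assumption)."""
    return -0.5 * x * x

def run_chain(x0, s, n, rng):
    """Random-walk Metropolis; returns the n states x_0, ..., x_{n-1}."""
    x, chain = x0, [x0]
    for _ in range(n - 1):
        y = x + s * rng.gauss(0.0, 1.0)
        if math.log(1.0 - rng.random()) < log_f(y) - log_f(x):
            x = y
        chain.append(x)
    return chain

N, B = 20000, 2000
chain = run_chain(x0=10.0, s=2.4, n=N, rng=random.Random(1))

kept = chain[B:]                               # x_B, ..., x_{N-1}
M = len(kept)                                  # M = N - B
mu_hat = statistics.fmean(kept)
var_hat = statistics.variance(kept, mu_hat)    # divides by M - 1
median = statistics.median(kept)
cuts = statistics.quantiles(kept, n=40)        # cut points at 2.5%, 5%, ..., 97.5%
lower, upper = cuts[0], cuts[-1]               # central 95% interval

print(M, mu_hat, var_hat, median, (lower, upper))
```

Slicing with `chain[B:]` matches the summation limits \(t=B,\dots,N-1\) in the formulas.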
Acceptance rate: what it tells you
The acceptance rate is the fraction of proposals that were accepted. It is a practical tuning signal:
- Too low acceptance (near 0): steps are too large, so most proposals land in low-density regions and are rejected.
- Too high acceptance (near 1): steps are too small, so the chain moves slowly and mixes poorly.
In 1D random-walk Metropolis, many targets work well with an acceptance rate somewhere around 0.2–0.6
(a rule of thumb, not a guarantee).
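The effect of the step size on the acceptance rate can be seen in a short experiment. This sketch assumes an illustrative standard normal target, and the three step sizes are arbitrary choices meant to show the too-small, moderate, and too-large regimes; `acceptance_rate` is a hypothetical helper.

```python
import math
import random

def log_f(x):
    """Log of an unnormalized standard normal (illustrative target, an assumption)."""
    return -0.5 * x * x

def acceptance_rate(s, n_steps=20000, x0=0.0, seed=0):
    """Run random-walk Metropolis and return the fraction of accepted proposals."""
    rng = random.Random(seed)
    x, accepted = x0, 0
    for _ in range(n_steps):
        y = x + s * rng.gauss(0.0, 1.0)
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < log_f(y) - log_f(x):
            x, accepted = y, accepted + 1
    return accepted / n_steps

rates = {s: acceptance_rate(s) for s in (0.05, 2.4, 50.0)}
for s, rate in rates.items():
    print(f"s = {s:5.2f}  acceptance rate = {rate:.2f}")
```

For this target the smallest step is accepted almost every time and the largest almost never, while the middle value lands inside the rule-of-thumb range.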
How to read the plots
- Trace plot (left): shows \(x_t\) versus iteration \(t\). A trace that fluctuates around a stable level after burn-in is consistent with stationarity, though it does not prove it.
- Histogram (right): shows the empirical distribution of samples after burn-in.
- Target curve: overlaid for a visual check (scaled to match histogram height for readability).
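One way to overlay an unnormalized target on a count histogram is to rescale the curve so that its peak matches the tallest bar. The sketch below assumes that peak-matching rule, which may not be the scaling the tool actually uses; the samples are independent normal draws standing in for retained chain states, and the bin count is arbitrary.

```python
import math
import random

def f_tilde(x):
    """Unnormalized standard normal density (illustrative target, an assumption)."""
    return math.exp(-0.5 * x * x)

rng = random.Random(3)
samples = [rng.gauss(0.0, 1.0) for _ in range(5000)]  # stand-in for retained states

# Build a simple equal-width count histogram.
n_bins = 30
lo, hi = min(samples), max(samples)
width = (hi - lo) / n_bins
counts = [0] * n_bins
for v in samples:
    k = min(int((v - lo) / width), n_bins - 1)  # the maximum goes in the last bin
    counts[k] += 1

# Evaluate the target at bin centers and rescale its peak to the tallest bar.
centers = [lo + (k + 0.5) * width for k in range(n_bins)]
curve = [f_tilde(c) for c in centers]
scale = max(counts) / max(curve)
scaled_curve = [scale * c for c in curve]

print(max(counts), max(scaled_curve))
```

Because of the rescaling, only the shape of the curve is comparable to the histogram, not its absolute height.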
University notes: mixing, autocorrelation, and extensions
MCMC samples are typically correlated. Even with large \(N\), the effective number of independent draws can be much smaller.
More advanced diagnostics include autocorrelation functions, effective sample size (ESS), and multiple-chain convergence checks.
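A rough ESS estimate can be sketched from the sample autocorrelations \(\rho_k\) as \(N / (1 + 2\sum_k \rho_k)\). The version below truncates the sum at the first non-positive \(\rho_k\), which is one simple convention among several and an assumption here; the target, step sizes, and function names are likewise illustrative.

```python
import math
import random

def log_f(x):
    """Log of an unnormalized standard normal (illustrative target, an assumption)."""
    return -0.5 * x * x

def run_chain(s, n, seed):
    """Random-walk Metropolis started at 0; returns n states."""
    rng = random.Random(seed)
    x, chain = 0.0, []
    for _ in range(n):
        y = x + s * rng.gauss(0.0, 1.0)
        if math.log(1.0 - rng.random()) < log_f(y) - log_f(x):
            x = y
        chain.append(x)
    return chain

def effective_sample_size(xs, max_lag=200):
    """ESS = n / (1 + 2 * sum of rho_k), summing until rho_k first drops to <= 0."""
    n = len(xs)
    mean = sum(xs) / n
    dev = [v - mean for v in xs]
    var = sum(d * d for d in dev) / n
    tau = 1.0
    for lag in range(1, max_lag + 1):
        rho = sum(dev[t] * dev[t + lag] for t in range(n - lag)) / (n * var)
        if rho <= 0.0:
            break
        tau += 2.0 * rho
    return n / tau

n = 4000
ess_small_step = effective_sample_size(run_chain(0.3, n, seed=0))
ess_tuned_step = effective_sample_size(run_chain(2.4, n, seed=0))
print(f"s = 0.3: ESS ~ {ess_small_step:.0f} of {n}")
print(f"s = 2.4: ESS ~ {ess_tuned_step:.0f} of {n}")
```

Both chains have the same length, but the small-step chain carries far fewer effectively independent draws, which is the sense in which it mixes poorly.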
Beyond Metropolis, common advanced MCMC methods include Gibbs sampling and gradient-based samplers
such as Hamiltonian Monte Carlo (HMC), which can mix faster in higher dimensions.