SRN – Informed proposals for local MCMC in discrete spaces by Zanella (Part I)

This week I am reading ‘Informed proposals for local MCMC in discrete spaces‘ by Giacomo Zanella. This paper is about designing MCMC algorithms for discrete-values high-dimensional parameters, and the goal is similar to the papers discussed in previous posts (Hamming ball sampler & auxiliary-variable HMC). I decide to split the Sunday Reading Notes on this paper into two parts, because I find many interesting ideas in this paper.

In this paper, Zanella come up with locally-balanced proposals. Suppose \pi(x) is the target density and K_{\sigma}(x,dy) is an uninformed proposal. We assume that as \sigma \to 0 the kernel K_{\sigma}(x,dy) converges to the delta measure. Zanella seeks to modify this uninformed proposal so that it incorporates information about the target \pi and is biased towards areas with higher density. An example of locally-balanced proposals is Q_{\sqrt{\pi}} (x,dy) = \frac{\sqrt{\pi(y) }K_{\sigma}(x,dy)}{(\sqrt{\pi} * K_{\sigma})(x)}. This kernel is reversible with respect to \sqrt{\pi(x)}(\sqrt{\pi} * K_{\sigma})(x), which converges to \pi(x)dx as x \to 0. [Note the normalizing constatn is the convolution \sqrt{\pi(x)}* K_{\sigma} = \int \sqrt{\pi(y)} K_{\sigma}(x,dy)].]

More generally, Zanella considers a class of pointwise informed proposals that has the structure Q_{g,\sigma} = \frac{1}{Z_{g}}\cdot g\left(\frac{\pi(y)}{\pi(x)}\right) K_{\sigma}(x,dy). It is suggested that the function g satisfy g(t) = t g(1/t).

I will save the discussion on locally-balanced proposals and Peskun optimality to Part II. In this part, I want to discuss Section 5: Connection to MALA and gradient-based MCMC. In continuous space, the point-wise informed proposal Q_{g,\sigma} would be infeasible to sample from because of the term g\left(\frac{\pi(y)}{\pi(x)}\right) . If we take a first-order Taylor expansion, we would have Q_{g,\sigma}^{(1)} \propto g \left( \exp ( \nabla  \log \pi(x) (y-x)) \right) K_{\sigma}(x,dy). If we choose g(t) = \sqrt{t} and K_{\sigma}(x,\cdot) =N(x,\sigma^2), this is the MALA proposal.

I find this connection very interesting, although I do not have a good intuition about where this connection comes from. One way to explain it is that gradient-based MCMC in continuous space is using local information to design informed proposals. In the conclusions, the author mentions that this connection should improve robustness of gradient-based MCMC schemes and help with parameter tuning.

References:(x)

  •  Zanella, G. (2017). Informed proposals for local MCMC in discrete spaces. arXiv preprint arXiv:1711.07424.

Author: PhyllisWithData

Statistics PhD student at Harvard University.

2 thoughts on “SRN – Informed proposals for local MCMC in discrete spaces by Zanella (Part I)”

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: