Hello 2021! Here is the first post in the Sunday Reading Notes series in the year of 2021. This week, I keep reading about optimal scaling of Markov chain Monte Carlo algorithms and I move on to a new class of algorithms: MALA.

Roberts and Rosenthal (1998, JRSSB) concerns the optimal scaling of Metropolis-Adjusted Langevin algorithm (MALA) for target distributions of the form . Unlike random walk Metropolis algorithm, in MALA, the proposal distribution uses the gradient of at :

Here is the step variance and is a d-dimensional standard normal. Following this proposal, we perform an accept / reject step, so that the resulting Markov chain has the desired stationary distribution:

There are some smoothness assumptions on the log-density : absolute value of the first 8 order derivatives are bounded, is Lipschitz, and all moments exist for .

**Theorem 3 (diffusion limit of first component)**

Consider the chain starting from stationarity and following the MALA algorithm with variance for some . Let be a Poisson process with rate , and is the first component of , i.e. . We must have, as , the process , where is a Langevin diffusion defined by

Here is the speed measure and .

The skeleton of the proof is similar to that of proving diffusion limit of random walk Metropolis. We will find sets such that and on these sets the generators converges uniformly.

**Theorem 4 (convergence of acceptance rate)**

where

As a result, when is maximized, the acceptance rate . Also the step variance should be tuned to be

MALA enjoys a faster convergence rate of order compared to order of RWM. But let’s also keep in mind that MALA requires an extra step to compute the gradients , which takes steps to compute.

Reference:

Roberts, G. O., & Rosenthal, J. S. (1998). Optimal scaling of discrete approximations to Langevin diffusions. *Journal of the Royal Statistical Society: Series B (Statistical Methodology)*, *60*(1), 255-268.