[TM] Stein’s lemma

This post is about Stein’s lemma, which first appeared in the landmark paper of Stein in 1981. This lemma is leads to Stein’s unbiased risk estimator and is useful for proving central limit theorems. It has also been called `Gaussian integration by parts’, which is in fact a high level description of the proof.

Lemma 1 (Stein’s lemma) If {y} follows the standard normal distribution, then

\displaystyle  \mathbb{E}\left[x f(x) \right] = \mathbb{E}\left[f'(x)\right],

if the expectations are well-defined.

Proof: If {f} is `nice’, then

\displaystyle  \begin{array}{rcl}  \mathbb{E}\left[f'(x)\right] &=& \int_{-\infty}^{\infty} f'(x) \frac{\exp(-x^2/2)}{\sqrt{2\pi}} dx\\ &=& 0 - \int_{-\infty}^{\infty} f(x) \left(-x \frac{\exp(-x^2/2)}{\sqrt{2\pi}} \right) dx \\ &=& \mathbb{E}\left[x f(x)\right]. \end{array}


It is also convinient to denote {\phi(x)} as the standard Normal density and remember that

\displaystyle  \phi'(x) = -x \phi(x).

Stein’s lemma can be generalized to exponential family distributions. In particular, for multivariate normals, if {y \sim \mathrm{NVM}_{p}(\mu, \sigma^2 I)} and {\hat{\mu} = r(y)} is any differentiable estimator, then we have

\displaystyle  \textrm{cov}\left(\hat{\mu}_i, y_i\right) = \sigma^2 \mathbb{E}\left[\frac{\partial \hat{\mu}_i}{\partial y_i}\right].

This is Equation (12.56) in `Computer Age Statistical Inference’ by Efron and Hastie.

This post is the second in the category `trivial matters’, where I formally write down some notes to myself about identities, tricks, and facts that I repeatedly use but (unfortunately) need to re-derive everytime I use them. Although these posts are short, they discuss important topics. The term `trivial matters’ is used as a sarcasm, because my blood boils everytime I see terms like `obviously’ or `it is trivial to show that …’ when I grade students’ homeworks.