# SRN – Bayesian Calibration of Microsimulation Models by Rutter et al.

The Sunday Reading Notes paper for this week is ‘Bayesian Calibration of Microsimulation Models’ by Carolyn Rutter, Diana Miglioretti and James Savarino. This is a 2009 JASA Applications and Case Studies paper.

According to a 2012 Review paper by Rutter et al,

Microsimulation models (MSMs) for health outcomes simulate individual event histories associated with key components of a disease process; these simulated life histories can be aggregated to estimate population-level effects of treatment on disease outcomes and the comparative effectiveness of treatments. Although MSMs are used to address a wide range of research questions, methodological improvements in MSM approaches have been slowed by the lack of communication among modelers. In addition, there are few resources to guide individuals who may wish to use MSM projections to inform decisions.

In this paper, the authors propose a Bayesian method to calibrate microsimulations models, using Markov Chain Monte Carlo. The case study in this paper is the history of colorectal cancer (CRC). In this paper, the authors assume all CRCs arise from an adenoma and the history of CRC consists of four components: 1) adenoma risk, 2) adenoma growth, 3) transition from adenoma to preclinical cancer and 4) transition from preclinical cancel to clinical cancer. These four components are not observed directly and the calibration data consists of prevalence of adenomas and preclinical cancers and the size and/or number of adenomas from many (independent) studies from different years and about different subpopulations and using different colonoscopy methods.

The authors use $\theta$ to denote MSM parameters and these parameters can be separated from $K$ components with independent priors: $\pi(\theta) = \prod_{i=1}^k \pi_i(\theta_i).$ The calibration data come from $N$ independent sources and each follows some distribution  $y_j|\theta \sim f_j(g_j(\theta))$ parametrized by some unknown functions of the MSM parameters. The likelihood is therefore $f(Y | \theta) = f(Y|g(\theta) )= \prod_{j=1}^n f_j(y_j|g_j(\theta))$.

What makes the calibration problem hard is the unknown function $g(\theta)$. Suppose we want to simulate from the posterior distribution of $\theta|Y$ using a Metropolis-Hastings(MH) algorithm, we need to know $r(\theta,\theta^*) = \frac{\pi(\theta)f(y|g(\theta)}{\pi(\theta^*)f(y|g(\theta^*)}$. But because $g$ is unknown, the MH step cannot be performed.

So the authors propose an approximate MH algorithm that includes a step to estimate $g(\theta)$ for both the current value $\theta$ and the proposed value $\theta^*$. To me this feels like an ‘EM’ step: simulate $m$ copies of $\tilde{y}_{ji}$ from $f_j(y_j| g_j(\theta))$ and calculate the MLE $\hat{g}_j(\theta).$ With this approximation, the resulting transition probability function $\hat{\alpha}(\theta,\theta^*)$for the Metropolis-within-Gibbs step on $\theta_i$ is based on $\frac{\pi(\theta_i)}{\pi(\theta_i^*)}\cdot \frac{\prod_{j=1}^N f_j(y_j|\hat{g}_j(\theta^*_i, \theta_{(-i)})}{\prod_{j=1}^n f_j(y_j|\hat{g}_j(\theta))}.$ In the Appendix, the authors prove that this approximation satisfies the detailed-balanced condition in the limit of $m$ goes to infinity.

I think this paper provides an interesting example of how to incorporate data from multiple sources. As the authors point out,

how closely the model should calibrate to observed data is unclear, especially when calibration data are variable and many provide conflicting interest. […] It depends on how modelers trade-off concerns about possibly overparameterizing and overfitting calibration data relative to the importance of exactly replicating observed or expected results.

References:

• Rutter, C. M., Miglioretti, D. L., & Savarino, J. E. (2009). Bayesian calibration of microsimulation models. Journal of the American Statistical Association104(488), 1338-1350.
• Rutter, C. M., Zaslavsky, A. M., & Feuer, E. J. (2011). Dynamic microsimulation models for health outcomes: a review. Medical Decision Making31(1), 10-18.