Returning to my Sunday Reading Notes series after the Thanksgiving recess, I want to discuss the paper ‘Controlled Sequential Monte Carlo’ by Jeremy Heng, Adrian N. Bishop, George Deligiannidis and Arnaud Doucet. You can read Xi’an’s blog post discussing this paper.
As pointed out in Xi’an’s blog, this is a massive paper, and I have only managed to go through Section 3. The paper starts by introducing Feynman-Kac models and twisted Feynman-Kac models, motivated by the design of observation-dependent kernels for sequential Monte Carlo (SMC) methods. At the terminal time $T$, the twisted model and the original model are the same, and the two models have the same normalizing constant $Z$, which is the quantity we try to approximate with SMC.
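To make the twisting concrete, here is my sketch of the construction (the notation is mine, and the paper's conventions, e.g. where the constant $\mu(\psi_0)$ is absorbed, may differ slightly). Given a policy $\psi = (\psi_0, \dots, \psi_T)$ of positive functions, the initial distribution and Markov kernels are twisted as

$\mu^{\psi}(dx_0) = \frac{\psi_0(x_0)\, \mu(dx_0)}{\mu(\psi_0)}, \qquad M_t^{\psi}(x_{t-1}, dx_t) = \frac{\psi_t(x_t)\, M_t(x_{t-1}, dx_t)}{M_t(\psi_t)(x_{t-1})},$

and the potentials are rescaled to compensate,

$G_0^{\psi} = \frac{\mu(\psi_0)\, G_0\, M_1(\psi_1)}{\psi_0}, \qquad G_t^{\psi} = \frac{G_t\, M_{t+1}(\psi_{t+1})}{\psi_t} \;\; (1 \le t \le T-1), \qquad G_T^{\psi} = \frac{G_T}{\psi_T},$

so that all the introduced factors telescope and the normalizing constant $Z$ is left unchanged.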
With a current policy $\psi$, there is a policy refinement $\psi \cdot \phi^{\star}$, the componentwise product of $\psi$ with the optimal policy $\phi^{\star}$ of the $\psi$-twisted model; this refinement is an admissible policy and is referred to as the optimal policy by the authors because of its connection with a Kullback-Leibler optimal control problem.
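My reading of what this optimality means (this is a paraphrase, so take the exact constants with a grain of salt): under the optimal policy, call it $\psi^{\star}$, the twisted potentials become constant,

$G_0^{\psi^{\star}} \equiv Z, \qquad G_t^{\psi^{\star}} \equiv 1 \quad (t = 1, \dots, T),$

so the twisted SMC weights are flat and the resulting estimator of $Z$ has zero variance; equivalently, the Kullback-Leibler divergence between the proposal path measure induced by the policy and the target path measure is driven to zero, which is the optimal control formulation.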
The policy refinements can be found using the backward recursion $\phi_T^{\star} = G_T^{\psi}$ and $\phi_t^{\star} = Q_t^{\psi}(\phi_{t+1}^{\star})$ for $t = T-1, \dots, 0$, where the operators $Q_t^{\psi}$, given by $Q_t^{\psi}(\varphi)(x_t) = G_t^{\psi}(x_t) \int M_{t+1}^{\psi}(x_t, dx_{t+1})\, \varphi(x_{t+1})$, are Bellman operators.
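A short calculation (mine, using the twisted quantities sketched above) shows why this recursion earns the name optimal: unwinding $G_t^{\psi}$ and $M_{t+1}^{\psi}$ gives, for $t = 1, \dots, T-1$,

$\psi_t(x_t)\, \phi_t^{\star}(x_t) = G_t(x_t) \int M_{t+1}(x_t, dx_{t+1})\, \psi_{t+1}(x_{t+1})\, \phi_{t+1}^{\star}(x_{t+1}),$

with $\psi_T \phi_T^{\star} = G_T$ (and only a multiplicative constant appearing at $t = 0$), so the refined policy $\psi \cdot \phi^{\star}$ satisfies the Bellman recursion of the original, untwisted model: whatever admissible $\psi$ we start from, the refinement is the optimal policy of the original model.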
Because the policies $\phi_t^{\star}$ are intractable, the authors introduce a logarithmic projection, a least-squares approximation on the log scale within a chosen function class, as a means of approximation. This is the Approximate Dynamic Programming (ADP) algorithm that the authors outline in Algorithm 2. The final algorithm (Algorithm 3) alternates between performing a twisted SMC and ADP to find the next policy refinement.
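To see what one backward pass of the logarithmic projection could look like, here is a minimal schematic sketch in Python. This is not the paper's Algorithm 2: I assume a one-dimensional state, log-quadratic policies fitted by ordinary least squares on the features $(x^2, x, 1)$, and I crudely replace the conditional expectation $M_{t+1}^{\psi}(\phi_{t+1})(x_t^n)$ by averaging the fitted $\phi_{t+1}$ over the offspring of particle $n$; all names and signatures below are mine.

import numpy as np

def adp_log_projection(particles, ancestors, log_G_twisted):
    """One schematic backward ADP pass ('logarithmic projection').

    particles[t]     : (N,) array of 1-D particle positions at time t
    ancestors[t]     : (N,) indices of the time-(t-1) ancestors of particles[t]
    log_G_twisted[t] : (N,) values of log G_t^psi at particles[t]

    Returns a list of (a_t, b_t, c_t) so that the fitted policy increment
    is phi_t(x) ~ exp(a_t * x**2 + b_t * x + c_t).
    """
    T = len(particles) - 1
    N = particles[0].shape[0]
    coeffs = [None] * (T + 1)
    log_phi_next = np.zeros(N)  # log phi_{t+1} at the time-(t+1) particles

    for t in range(T, -1, -1):
        x = particles[t]
        if t == T:
            # terminal condition: phi_T^* = G_T^psi
            log_target = log_G_twisted[T]
        else:
            # crude stand-in for M_{t+1}^psi(phi_{t+1})(x_t^n): average the
            # fitted phi_{t+1} over the offspring of particle n (if any)
            sums = np.zeros(N)
            np.add.at(sums, ancestors[t + 1], np.exp(log_phi_next))
            counts = np.bincount(ancestors[t + 1], minlength=N).astype(float)
            mixed = np.where(counts > 0, sums / np.maximum(counts, 1.0), 1e-300)
            log_target = log_G_twisted[t] + np.log(mixed)

        # logarithmic projection: least squares of the log target on (x^2, x, 1)
        design = np.column_stack([x ** 2, x, np.ones(N)])
        beta, _, _, _ = np.linalg.lstsq(design, log_target, rcond=None)
        coeffs[t] = tuple(beta)
        log_phi_next = design @ beta

    return coeffs

The fitted increments would then be multiplied into the current policy, $\psi_t \leftarrow \psi_t \cdot \exp(a_t x^2 + b_t x + c_t)$, the twisted SMC rerun with the refined policy, and the two steps alternated; that alternation is the structure of Algorithm 3.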
References:
Heng, J., Bishop, A. N., Deligiannidis, G., & Doucet, A. (2017). Controlled Sequential Monte Carlo. arXiv preprint arXiv:1708.08396.