Approximate gradients for inference of partially-observed stochastic processes


There will be a drinks reception after the seminar in the ground floor social area.

Bayesian computation remains onerous at scale for many discrete-valued stochastic process models, even though these models are ubiquitous across biology and public health. In this talk, we will explore how one can construct computationally efficient approximations to the gradient of the data likelihood under continuous-time Markov chain (CTMC) models with respect to their high-dimensional parameters. CTMCs underpin the most popular models for learning about how rapidly evolving pathogens change over time and space to give rise to human infection, and the dimensionality of these problems is daunting. With these approximations in hand, a new variant of Hamiltonian Monte Carlo (HMC) becomes tractable for exploring the parameter posterior, and we bound the approximation error using several small tricks from matrix analysis. This new sampling approach enables the introduction of a novel random-effects CTMC model that captures biological realism previously missing. Applied to the analysis of early SARS-CoV-2 genomes, the random effects remove bias in inference of the location and timing of the pathogen’s spill-over into humans, while the approximate-gradient-based machinery is over an order of magnitude more time-efficient than conventional sampling approaches.
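
As a rough sketch of the quantity at stake (illustrative notation only, not the specific construction presented in the talk): for a CTMC with generator matrix Q(\theta) observed at times t_0 < t_1 < \dots < t_n in states x_0, x_1, \dots, x_n, the transition probabilities are matrix exponentials of Q, so the gradient that HMC needs requires differentiating through those exponentials,

    \log L(\theta) = \sum_{i=1}^{n} \log \left[ e^{(t_i - t_{i-1})\, Q(\theta)} \right]_{x_{i-1},\, x_i},
    \qquad
    \nabla_\theta \log L(\theta) = \sum_{i=1}^{n}
    \frac{\nabla_\theta \left[ e^{(t_i - t_{i-1})\, Q(\theta)} \right]_{x_{i-1},\, x_i}}
         {\left[ e^{(t_i - t_{i-1})\, Q(\theta)} \right]_{x_{i-1},\, x_i}}.

The partially-observed case additionally marginalizes over latent states, and each term involves the derivative of a matrix exponential with respect to every parameter entering Q(\theta); this is the expensive step that cheap gradient approximations target.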