Scaling Changepoint Detection to Big Data

Changepoint detection is an increasingly important problem across a range of applications, for example in detecting copy number variants. A common approach to inferring the number and positions of the changepoints is to introduce a model for the data within each segment, and then maximise a penalised likelihood function. This maximisation can often be done exactly using dynamic programming, but the resulting algorithm has a computational cost that is quadratic, or even cubic, in the number of data points.
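To make the quadratic cost concrete, here is a minimal sketch of an exact dynamic programme of this kind. It assumes one particular segment model (Gaussian data with a change in mean, so the segment cost is a residual sum of squares) and a fixed penalty per changepoint; minimising this penalised cost plays the role of maximising the penalised likelihood. The function names and the choice of model are illustrative assumptions, not taken from the talk.

```python
import numpy as np


def optimal_partitioning(data, penalty):
    """Exact penalised-cost segmentation by dynamic programming (O(n^2)).

    Assumed model (illustrative only): each segment has its own mean, and the
    segment cost is the sum of squared deviations from that mean.
    Returns (sorted changepoint positions, minimal penalised cost).
    """
    data = np.asarray(data, dtype=float)
    n = len(data)
    # Prefix sums let any segment cost be evaluated in constant time.
    cumsum = np.concatenate(([0.0], np.cumsum(data)))
    cumsum_sq = np.concatenate(([0.0], np.cumsum(data ** 2)))

    def cost(start, end):
        # Residual sum of squares for data[start:end] (end exclusive).
        total = cumsum[end] - cumsum[start]
        total_sq = cumsum_sq[end] - cumsum_sq[start]
        return total_sq - total * total / (end - start)

    # best[t] is the minimal penalised cost of segmenting data[:t].
    best = np.full(n + 1, np.inf)
    best[0] = -penalty  # so that the first segment carries no penalty
    last = np.zeros(n + 1, dtype=int)

    for t in range(1, n + 1):
        # Try every possible position s for the most recent changepoint.
        for s in range(t):
            candidate = best[s] + cost(s, t) + penalty
            if candidate < best[t]:
                best[t] = candidate
                last[t] = s

    # Walk back through the stored positions to recover the changepoints.
    changepoints = []
    t = n
    while t > 0:
        s = int(last[t])
        if s > 0:
            changepoints.append(s)
        t = s
    return sorted(changepoints), float(best[n])
```

The two nested loops over t and s are the source of the quadratic cost: every time point considers every earlier position as the most recent changepoint.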

This talk will cover some recent algorithms that maximise the penalised likelihood function exactly, but at a much lower computational cost. These include the first such algorithm that can be shown, for certain models, to have an expected computational cost that is linear in the amount of data.
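One known way to reduce the cost while keeping the answer exact is to prune candidate changepoint positions that can never be optimal again, as in the PELT method of Killick, Fearnhead and Eckley. The sketch below illustrates that pruning idea only; it is an assumption that this is among the algorithms the talk covers. It again assumes a Gaussian change-in-mean model with a residual-sum-of-squares segment cost, and the function name is hypothetical.

```python
import numpy as np


def pruned_partitioning(data, penalty):
    """Exact penalised-cost segmentation with inequality-based pruning.

    Assumed model (illustrative only): change in mean, with a residual sum of
    squares as the segment cost. For this cost, splitting a segment never
    increases the total cost, which is what justifies the pruning rule below.
    Returns (sorted changepoint positions, minimal penalised cost).
    """
    data = np.asarray(data, dtype=float)
    n = len(data)
    cumsum = np.concatenate(([0.0], np.cumsum(data)))
    cumsum_sq = np.concatenate(([0.0], np.cumsum(data ** 2)))

    def cost(start, end):
        # Residual sum of squares for data[start:end] (end exclusive).
        total = cumsum[end] - cumsum[start]
        total_sq = cumsum_sq[end] - cumsum_sq[start]
        return total_sq - total * total / (end - start)

    best = np.full(n + 1, np.inf)
    best[0] = -penalty
    last = np.zeros(n + 1, dtype=int)
    candidates = [0]  # positions still eligible as the most recent changepoint

    for t in range(1, n + 1):
        # Cost of each candidate before adding the penalty for a new segment.
        values = [best[s] + cost(s, t) for s in candidates]
        idx = int(np.argmin(values))
        best[t] = values[idx] + penalty
        last[t] = candidates[idx]
        # Pruning: a candidate whose value already exceeds best[t] cannot be
        # the optimal most recent changepoint at any later time, so drop it.
        candidates = [s for s, v in zip(candidates, values) if v <= best[t]]
        candidates.append(t)

    changepoints = []
    t = n
    while t > 0:
        s = int(last[t])
        if s > 0:
            changepoints.append(s)
        t = s
    return sorted(changepoints), float(best[n])
```

The inner loop now runs only over the surviving candidates. When changepoints occur regularly through the data, that set tends to stay small, which is the intuition behind an expected cost that grows linearly rather than quadratically; the worst case remains quadratic.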