Discovery of Parkinson's disease states and disease progression modelling: a longitudinal data study using machine learning

Parkinson’s disease is heterogeneous in symptom presentation and progression. A better understanding of both aspects can enable better patient management and improve clinical trial design. Previous approaches to modelling Parkinson’s disease progression assumed static progression trajectories within subgroups and did not adequately account for complex medication effects. Our objective was to develop a statistical progression model of Parkinson’s disease that accounts for intra-individual and inter-individual variability and medication effects.

In this longitudinal data study, data were collected for up to 7 years on 423 patients with early Parkinson’s disease and 196 healthy controls from the Parkinson’s Progression Markers Initiative (PPMI) longitudinal observational study. A contrastive latent variable model was applied, followed by a novel personalised input-output hidden Markov model, to define disease states. The clinical significance of the states was assessed using statistical tests on seven key motor or cognitive outcomes (mild cognitive impairment, dementia, dyskinesia, presence of motor fluctuations, functional impairment from motor fluctuations, Hoehn and Yahr score, and death) that were not used in the learning phase. The results were validated in an independent sample of 610 patients with Parkinson’s disease from the National Institute of Neurological Disorders and Stroke Parkinson’s Disease Biomarker Program (PDBP).
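
To illustrate the general idea of an input-output hidden Markov model, the sketch below runs a scaled forward pass in which both the state transitions and the emission means depend on an observed input, such as medication status. It is a toy illustration under stated assumptions, not the model used in the study: the function name `iohmm_forward`, the scalar Gaussian emissions, the binary input, and all parameter values are hypothetical, and the patient-specific ("personalised") parameters and the contrastive latent variable step are omitted.

```python
import numpy as np

def iohmm_forward(obs, inputs, init, trans, means, var):
    """Scaled forward pass for a toy input-output HMM (illustrative only).

    obs:    (T,) scalar observations
    inputs: (T,) integer input labels (assumed: 0 = off medication, 1 = on)
    init:   (K,) initial state distribution
    trans:  (U, K, K) one row-stochastic transition matrix per input value
    means:  (U, K) emission mean for each input value and state
    var:    shared emission variance
    Returns the log-likelihood and the filtered state probabilities (T, K).
    """
    T, K = len(obs), len(init)
    alpha = np.zeros((T, K))
    loglik = 0.0
    for t in range(T):
        u = inputs[t]
        # Gaussian emission density of obs[t] under each state, given input u
        emit = np.exp(-0.5 * (obs[t] - means[u]) ** 2 / var) / np.sqrt(2 * np.pi * var)
        # Predicted state distribution: initial at t = 0, else propagate with trans[u]
        prior = init if t == 0 else alpha[t - 1] @ trans[u]
        unnorm = prior * emit
        norm = unnorm.sum()
        alpha[t] = unnorm / norm   # filtered P(state_t | obs up to t, inputs up to t)
        loglik += np.log(norm)     # accumulate log P(obs_t | earlier observations)
    return loglik, alpha

# Hypothetical two-state example with made-up parameters.
init = np.array([0.8, 0.2])
trans = np.array([[[0.90, 0.10], [0.20, 0.80]],    # transitions when input = 0
                  [[0.95, 0.05], [0.30, 0.70]]])   # transitions when input = 1
means = np.array([[10.0, 30.0],                    # state means when input = 0
                  [5.0, 20.0]])                    # state means when input = 1
obs = np.array([11.0, 9.0, 21.0, 31.0])
inputs = np.array([0, 0, 1, 0])

loglik, alpha = iohmm_forward(obs, inputs, init, trans, means, var=16.0)
most_likely_state = alpha.argmax(axis=1)
```

Because the states are not constrained to a fixed order here, a trajectory can move between states in either direction, which is the kind of non-sequential behaviour a non-deterministic progression model can represent.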

We developed a statistical progression model of early Parkinson’s disease that accounts for intra-individual and inter-individual variability and medication effects. Our predictive model discovered non-sequential, overlapping disease progression trajectories, supporting the use of non-deterministic disease progression models and suggesting that static subtype assignment might be ineffective at capturing the full spectrum of Parkinson’s disease progression.