Accurate proteome-wide missense variant effect prediction with AlphaMissense

The vast majority of missense variants observed in the human genome are of unknown clinical significance. Machine learning approaches could close this variant interpretation gap by exploiting patterns in biological data to predict the pathogenicity of unannotated variants. I will discuss AlphaMissense, which combines advances from the highly accurate structure prediction model AlphaFold with population variant data to predict missense variant pathogenicity. We demonstrate state-of-the-art predictions on clinically ascertained labels and experimental benchmarks, without explicitly training on such data. Owing to this higher predictive performance, the fraction of ClinVar test variants that we can confidently classify at 90% precision has increased by 25.8 percentage points (from 67.1% to 92.9%) compared with EVE, a recent well-performing unsupervised model. I will also cover aspects of model evaluation, interpretation and utility. For instance, we find that gene-level AlphaMissense scores are predictive of genes essential to cell survival, and that this property holds among the ~22% of smaller genes for which methods based only on population cohort data lack the statistical power to detect essentiality reliably.

Speaker
Dr. Clare Bycroft is a research scientist at Google DeepMind with a background in human genetics. She has a particular focus on ensuring the utility of deep learning models in real-world settings. Previously, Clare worked with Genomics PLC, an Oxford-based biotech using human genetics data to propose new therapeutic targets, and during her DPhil (Wellcome Centre for Human Genetics) she curated the first tranche of the UK Biobank genotyping data.

Date: 15 November 2023, 16:00
Venue:
Big Data Institute
Old Road Campus OX3 7LF

Details: Seminar Room 0
Speaker: Clare Bycroft (DeepMind)
Organising department: Big Data Institute (NDPH)
Organisers: Prof Christopher Yau (University of Oxford), Sumeeta Maheshwari (University of Oxford)
Organiser contact email address: sumeeta.maheshwari@ndph.ox.ac.uk
Host: Prof Christopher Yau (University of Oxford)
Part of: Machine Learning@BDI Seminar Series
Booking required?: Required
Booking url: https://forms.office.com/e/QUM6xRmirx
Booking email: sumeeta.maheshwari@ndph.ox.ac.uk
Cost: free
Audience: Members of the University only
This talk features in the following public collections:
- Talks of Interest to Medical Sciences
Editor: Sumeeta Maheshwari