Data science topics related to neurogenomics
My seminar will discuss various data-science issues related to
neurogenomics. First, I will focus on classic disorders of the brain,
which affect nearly a fifth of the world’s population. Robust
phenotype-genotype associations have been established for several
psychiatric diseases (e.g., schizophrenia, bipolar disorder). However,
understanding their molecular causes is still a challenge. To address
this, the PsychENCODE consortium generated thousands of transcriptome
(bulk and single-cell) datasets from 1,866 individuals. Using these
data, we have developed interpretable machine learning approaches for
deciphering functional genomic elements and linkages in the brain and
in psychiatric disorders. Specifically, we developed a deep-learning
model embedding the physical regulatory network to predict phenotype
from genotype. Our model uses a conditional Deep Boltzmann Machine
architecture and introduces lateral connectivity at the visible layer
to embed the biological structure learned from the regulatory network
and QTL linkages. Our model improves disease prediction (6X compared
to additive polygenic risk scores), highlights key genes for
disorders, and imputes missing transcriptome information from genotype
data alone. Next, I will look at the “data exhaust” from this activity,
that is, how genomic analyses can reveal more than was originally
intended. I will focus on genomic privacy,
which is a major stumbling block in tackling problems in large-scale
neurogenomics. In particular, I will look at how quantified
expression levels can reveal information about the subjects studied
and how one can take steps to sanitize the data and protect patient
anonymity. Finally, another stumbling block in neurogenomics is
phenotyping individuals more accurately and precisely. I will discuss
some preliminary work we’ve done in digital phenotyping.
Date: 11 February 2022, 14:00
Venue: Mathematical Institute, Woodstock Road OX2 6GG
Venue Details: L3
Speaker: Prof Mark Gerstein (Yale University)
Organising department: Mathematical Institute
Organiser: Sara Jolliffe (University of Oxford)
Organiser contact email address: sara.jolliffe@maths.ox.ac.uk
Host: Dr Peter Minary (University of Oxford)
Part of: Mathematical Biology and Ecology
Booking required?: Not required
Audience: Members of the University only
Editor: Sara Jolliffe