Dirichlet Mixtures, the Dirichlet Process, and the Topography of Amino Acid Multinomial Space
The Dirichlet Process is used to estimate probability distributions that are mixtures of an unknown and unbounded number of components. Amino acid frequencies at homologous positions within related proteins have been fruitfully modeled by Dirichlet mixtures, and we have used the Dirichlet Process to construct such distributions. The resulting mixtures describe multiple alignment data substantially better than do those previously derived. They consist of over 500 components, in contrast to fewer than 40 previously, and provide a novel perspective on protein structure. Individual protein positions should be seen not as falling into one of several categories, but rather as arrayed near probability ridges winding through amino-acid multinomial space.
Date: 23 May 2017, 15:30 (Tuesday, 5th week, Trinity 2017)
Venue: 24-29 St Giles', OX1 3LB
Venue Details: Large Lecture Theatre, Department of Statistics
Speaker: Dr Stephen Altschul (National Center for Biotechnology Information)
Organiser: Prof Jotun Hein (University of Oxford)
Organiser contact email address: events@stats.ox.ac.uk
Booking required?: Not required
Audience: Members of the University only
Editor: Beverley Lane