Bayesian multivariate re-analysis of large genetic studies identifies many novel associations

Genome-wide association studies (GWAS) are now a common tool for identifying genetic variants that affect traits of interest. To date, the NHGRI GWAS Catalog lists over 24,000 SNP-phenotype associations. However, the vast majority of these GWAS are conducted in univariate frameworks, i.e., each genetic variant is tested against a single phenotype at a time. This is in contrast to multivariate frameworks, in which genetic variants are tested against different combinations of traits simultaneously. Under many biological scenarios, taking multiple phenotypes into account drastically increases statistical power. Additionally, by testing combinations of traits, multivariate frameworks allow researchers to investigate a greater level of biological complexity. Despite these clear advantages, multivariate analyses are seldom implemented. Univariate GWAS already involve a large computational and statistical burden, and performing an additional, exponentially larger number of tests is a strong deterrent. Furthermore, it is often unclear how to properly compare different multivariate models, even when they can be fitted efficiently.

Here, we present a framework and R package that alleviate these obstacles: Bayesian multivariate analysis of summary statistics, or bmass. bmass requires only univariate GWAS summary statistics as input. It can quickly conduct all possible multivariate analyses for up to 8 phenotypes. It also provides a Bayes factor for each multivariate analysis, allowing models to be compared directly. Running bmass on various publicly available GWAS datasets consistently shows an increase in power of up to 40% over univariate approaches while keeping false discovery rates (FDRs) as low as 15%. bmass identifies many new significant associations as well as the phenotypic combinations driving them, thus providing novel levels of biological insight. Overall, bmass is a powerful tool that should further enable researchers to perform multivariate analyses of GWAS.
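
To give a sense of the scale behind "all possible multivariate analyses", the following minimal Python sketch enumerates candidate models and shows how Bayes factors let them be compared on a common scale. It is an illustration only and does not use the bmass package. It assumes, as something not stated in the text above, that each phenotype is assigned one of three labels per variant (unassociated, directly associated, indirectly associated), so that d phenotypes give 3^d models. The function names, the labels "U", "D" and "I", and the prior weights are all hypothetical.

```python
from itertools import product

# Assumed labels (not taken from the text above): a phenotype is either
# unassociated (U), directly associated (D), or indirectly associated (I)
# with a given variant.
CATEGORIES = ("U", "D", "I")


def enumerate_models(n_phenotypes):
    """Return every assignment of one category to each phenotype."""
    return list(product(CATEGORIES, repeat=n_phenotypes))


def posterior_model_probs(bayes_factors, priors):
    """Turn per-model Bayes factors into posterior model probabilities.

    All Bayes factors must be measured against the same reference model;
    each posterior weight is proportional to prior * Bayes factor.
    """
    weights = [bf * p for bf, p in zip(bayes_factors, priors)]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # The model count grows exponentially with the number of phenotypes.
    for d in (2, 4, 8):
        print(d, "phenotypes:", len(enumerate_models(d)), "models")
```

Under this assumption the model space grows as 3^d, which is one way to read the "exponentially greater number of tests" mentioned earlier. Because every Bayes factor is measured against the same reference model, multiplying by prior weights and normalising yields directly comparable model probabilities.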