Bayesian Covariance Estimation with Applications in High-throughput Biology

Seminar | February 6 | 4-5 p.m. | 1011 Evans Hall

 Alexander Franks, University of Washington

 Department of Statistics

Understanding the function of biological molecules requires statistical methods that assess covariability across multiple dimensions and account for complex measurement error and missing data. In this talk, I will discuss two models for covariance estimation that have applications in molecular biology. In the first half of the talk, I will describe the role of covariance estimation in quantifying how cells regulate protein levels. Specifically, estimates of the correlation between steady-state levels of mRNA and protein are used to assess the degree to which protein levels are determined by post-transcriptional processes. Differences in cell preparation, measurement technology, and protocol, as well as the pervasiveness of missing data, complicate accurate estimation of this correlation. To address these issues, I fit a Bayesian hierarchical model to a compendium of 58 data sets from multiple labs to infer a structured covariance matrix of measurements. I contextualize the results and contrast them with conclusions drawn in previous studies. In the second half of the talk, I will describe a model-based method for evaluating heterogeneity among several p × p covariance matrices in the large-p, small-n setting and will illustrate the utility of the method for exploratory analyses of high-dimensional multivariate gene expression data.