Title: Statistical algorithms in the study of mammalian DNA methylation
Speaker: Meromit Singer
Advisor: Lior Pachter
Date: Wednesday, November 28, 2012
Time: 1pm - 2pm
Room: 606 Soda Hall
DNA methylation is a dynamic chemical modification that is abundant on DNA sequences and plays a central role in the regulatory mechanisms of cells. This modification can be inherited across cell divisions and generations, providing a "memory mechanism" for regulatory programs that is more flexible than that coded in the DNA sequence. In recent years, high-throughput sequencing technologies have enabled genome-wide annotation of DNA methylation. Coupled with novel computational machinery, these developments have provided unprecedented insight into the characteristics, biological function and disease association of this phenomenon.
In this talk I will present contributions to the field of high-throughput DNA methylation analysis. We will first discuss a comparative study of genome-wide DNA methylation in three primate species (human, chimpanzee and orangutan), revealing that these species can be distinguished based on differences in DNA methylation that are independent of the underlying DNA sequence. This result is based on a novel algorithm that infers corrected site-specific methylation states given data from a cost-effective, but biased, experimental method. The ability to annotate DNA methylation at a genome-wide scale leads to questions about the nature of methylation signatures in DNA, and to an interesting computational question about how to recognize such signatures. We will discuss this question and a proposed solution that is optimal given biologically motivated assumptions. In the last part of the talk we will describe a systematic sequencing error we discovered during the analysis of a specialized methylation dataset. We will discuss why this error introduces false positives into a broad range of high-throughput sequencing studies and will present a classifier to correct for such errors, showing that it performs very well with respect to both sensitivity and specificity.