
Thursday, November 12, 2009




Statmap: A utility for the principled mapping of short reads to a reference genome

Seminar | November 12 | 4-5 p.m. | 1011 Evans Hall


Nathan Boley, UC Berkeley, Department of Statistics

Statistics, Department of


Next-generation sequencing technologies have given rise to a host of assays that can quickly answer a diverse set of biological questions. These assays, which include RNA-seq, ChIP-seq, methyl-seq, Hi-C-seq, and DNase-seq, are similar in that a "*-seq" experiment ends with a set of sequences, or 'reads', generated by the sequencing platform, and it is from these that we draw our conclusions. Hence, the first key task in the analysis of these assays is to "map" the reads into the space from which they came (e.g. the genome or the transcriptome). As the assays have developed, the biological questions they attempt to answer have become more subtle, and the downstream analyses into which they are integrated have become increasingly complex. The need for reliable measures of statistical confidence in biological interpretations has become apparent, and so has the need for tools that map the results of an experiment in a way that passes information about mapping uncertainty on to downstream analysis.

Statmap is one such tool. It is exceptionally fast, and it produces every possible mapping from which a read could have come, up to a threshold in sequencing error and/or alternate genome probability. In addition, Statmap can map paired-end reads, junction reads, and poly(A) tails, and can update mapping probabilities under an assay-specific model. The probability model and the architecture that underlie Statmap, as well as its application to downstream analysis and the generation of confidence bounds, will be discussed. This software is currently available at encodestatistics.org.
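
To make the idea of "every possible mapping up to a threshold" concrete, here is a minimal toy sketch in Python. It is not Statmap's code or its probability model: the function name `candidate_mappings`, the single uniform per-base `error_rate`, the ungapped alignment, the `min_ratio` cutoff relative to the best placement, and the uniform prior over positions are all assumptions made for illustration.

```python
from typing import List, Tuple


def candidate_mappings(
    read: str,
    reference: str,
    error_rate: float = 0.01,   # assumed uniform per-base sequencing error
    min_ratio: float = 1e-3,    # assumed cutoff relative to the best placement
) -> List[Tuple[int, float]]:
    """Return (position, probability) for every plausible ungapped placement.

    Toy model only: each base is read correctly with probability
    1 - error_rate, and is misread as one specific other base with
    probability error_rate / 3.
    """
    scored = []
    for pos in range(len(reference) - len(read) + 1):
        window = reference[pos:pos + len(read)]
        mismatches = sum(a != b for a, b in zip(read, window))
        matches = len(read) - mismatches
        # Likelihood of observing the read if it truly came from this window.
        likelihood = (error_rate / 3) ** mismatches * (1 - error_rate) ** matches
        scored.append((pos, likelihood))
    if not scored:
        return []

    # Keep every placement whose likelihood is within min_ratio of the best,
    # instead of reporting only the single best hit.
    best = max(lik for _, lik in scored)
    kept = [(pos, lik) for pos, lik in scored if lik >= min_ratio * best]

    # Normalise under a uniform prior over positions, so each retained
    # placement carries a mapping probability for downstream analysis.
    total = sum(lik for _, lik in kept)
    return [(pos, lik / total) for pos, lik in kept]


if __name__ == "__main__":
    for pos, prob in candidate_mappings("ACGT", "ACGTACGTTTACGA"):
        print(pos, round(prob, 4))
```

In this toy reference the read matches exactly at two positions and with one mismatch at a third, so all three placements are reported, with the two exact matches sharing almost all of the probability. A downstream analysis can then weight each placement by its probability rather than discarding the ambiguous read.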


510-642-2781