Statistical and Computational Challenges in Conformational Biology

Seminar | April 4 | 4-5 p.m. | 1011 Evans Hall

 Professor Mark Segal, Department of Epidemiology and Biostatistics, UC San Francisco

 Department of Statistics

Chromatin architecture is critical to numerous cellular processes, including gene regulation, while conformational disruption can be oncogenic. Accordingly, discerning chromatin configuration is of basic importance. However, this task is complicated by a number of factors, including scale, compaction, dynamics, and inter-cellular variation.

The recent emergence of a suite of proximity ligation-based assays, notably Hi-C, has transformed conformational biology: for example, the elicitation of topological and contact domains provides a high-resolution view of genome organization. Such conformation capture assays provide proxies for pairwise distances between genomic loci, which can be used to infer 3D coordinates, although much downstream analysis bypasses this reconstruction step.

After demonstrating the advantages of obtaining 3D genome reconstructions, in particular those gained by superposing genomic attributes on a reconstruction and identifying their extrema ('3D hotspots'), we showcase methodological challenges surrounding such analyses and advance a novel reconstruction approach based on principal curves. Open issues highlighted include (i) performing and synthesizing reconstructions from single-cell assays, (ii) devising rotation-invariant methods for 3D hotspot detection, (iii) assessing genome reconstruction accuracy, and (iv) averting reconstruction uncertainty by direct integration of Hi-C data and genomic features. By using p-values from (epi)genome-wide association studies (GWAS) as the feature, the last of these approaches provides a conformational lens for viewing GWAS findings.