Department of Computer Science

A 3D Code in the Human Genome

Colloquium

Computer Science

By: Erez Lieberman Aiden
Assistant Professor
From: Baylor College of Medicine & Rice University
When: Thursday, September 28, 2017
4:00 PM - 5:00 PM
Where: Duncan Hall 1070
Abstract: Stretched out from end to end, the human genome – a sequence of 3 billion chemical letters inscribed in a molecule called DNA – is over 2 meters long. Famously, short stretches of DNA fold into a double helix, which winds around histone proteins to form the 10 nm fiber. But what about longer pieces? Does the genome’s fold influence function? How does the information contained in such an ultra-dense packing even remain accessible? In this talk, I will describe our work developing ‘Hi-C’ (Lieberman-Aiden et al., Science, 2009; Aiden, Science, 2011) and more recently ‘in situ Hi-C’ (Rao & Huntley et al., Cell, 2014), which use proximity ligation to transform pairs of physically adjacent DNA loci into chimeric DNA sequences. Sequencing a library of such chimeras makes it possible to create genome-wide maps of physical contacts between pairs of loci, revealing features of genome folding in 3D. Next, I will describe recent work using in situ Hi-C to construct haploid and diploid maps of nine cell types. The densest, in human lymphoblastoid cells, contains 4.9 billion contacts, achieving 1 kb resolution. We find that genomes are partitioned into contact domains (median length, 185 kb), which are associated with distinct patterns of histone marks and segregate into six subcompartments. We identify ~10,000 loops. These loops frequently link promoters and enhancers, correlate with gene activation, and show conservation across cell types and species. Loop anchors typically occur at domain boundaries and bind the protein CTCF. The CTCF motifs at loop anchors occur predominantly (>90%) in a convergent orientation, with the asymmetric motifs “facing” one another. Next, I will discuss the biophysical mechanism that underlies chromatin looping. Specifically, our data are consistent with the formation of loops by extrusion (Sanborn & Rao et al., PNAS, 2015).
In fact, in many cases, the local structure of Hi-C maps may be predicted in silico based on patterns of CTCF binding and an extrusion-based model. Finally, I will show that by modifying CTCF motifs using CRISPR, we can reliably add, move, and delete
Erez Lieberman Aiden
Bio:
Erez Lieberman Aiden received his PhD from Harvard and MIT in 2010. After several years at Harvard's Society of Fellows and at Google as Visiting Faculty, he became Assistant Professor of Genetics at Baylor College of Medicine and of Computer Science and Applied Mathematics at Rice University.

Dr. Aiden's inventions include the Hi-C method for three-dimensional DNA sequencing, which enables scientists to examine how the two-meter-long human genome folds up inside the tiny space of the cell nucleus (Lieberman-Aiden & Van Berkum et al., Science, 2009). In 2014, his laboratory reported the first comprehensive map of loops across the human genome, mapping their anchors with single-base-pair resolution (Rao & Huntley et al., Cell, 2014). In 2015, his lab showed that these loops form by extrusion, and that it is possible to add and remove loops and domains in a predictable fashion using targeted mutations as short as a single base pair (Sanborn & Rao et al., PNAS, 2015). In 2017, his lab showed that it is possible to use 3D maps, generated using Hi-C, to assemble mammalian genomes entirely from scratch, from short reads alone, at a total cost of under $10,000 (Dudchenko et al., Science, 2017). Using this methodology, the Aiden lab reported the first end-to-end genome assembly of Aedes aegypti, the mosquito that carries the Zika virus. Assembling the Aedes aegypti genome from end to end had been highlighted as essential to the worldwide Zika response by a front-page article in the New York Times.

Together with Jean-Baptiste Michel, Dr. Aiden also developed the Google Ngram Viewer, a tool for probing cultural change by exploring the frequency of words and phrases in books over the centuries. Now a product at Google, the Ngram Viewer is used every day by millions of people worldwide.

Dr. Aiden's research has won numerous awards, including recognition as one of the top 20 "Biotech Breakthroughs that will Change Medicine" by Popular Mechanics; membership in Technology Review's 2009 TR35, recognizing the top 35 innovators under 35; and membership in Cell's 2014 40 Under 40. His work has been featured on the front pages of the New York Times, the Boston Globe, the Wall Street Journal, and the Houston Chronicle. One of his talks has been viewed over 1 million times at TED.com. Three of his research papers have appeared on the covers of Nature and Science. In 2012, he received the Presidential Early Career Award for Scientists and Engineers, the highest government honor for young scientists, from Barack Obama. In 2014, Fast Company called him "America's brightest young academic." In 2015, his laboratory was recognized on the floor of the US House of Representatives for its discoveries about the structure of DNA.