Strength in numbers: the evolution of gene regulation in mammals
Evolution is thought to preserve regulatory DNA elements that are critical mediators of gene expression. By assessing the evolution of regulatory regions and their associated gene expression across placental mammals, we reveal a role for both complexity and constraint in maintaining gene expression levels within evolutionarily dynamic regulatory landscapes.
The paper in Nature Ecology & Evolution is here: http://go.nature.com/2iW24JP
Despite large advances in the last decade, our understanding of how gene expression is controlled remains patchy. Large fractions of non-coding DNA carry biochemical hallmarks of regulation in any given tissue; however, we still don't understand whether only some or possibly all of these regions contribute to gene regulation and by how much.
To address this question, our groups at Cancer Research UK – Cambridge Institute (CRUK-CI) and European Molecular Biology Laboratory’s European Bioinformatics Institute (EMBL-EBI) took a comparative approach. Evolutionary theory predicts that changes to essential DNA regions will, most of the time, be deleterious and filtered by natural selection. Under this model, regulatory elements important for gene expression should remain conserved in evolution. This expectation is reinforced by the fact that tissue-level gene expression is largely maintained between different species.
When this project started, however, a body of evidence seemed to contradict this expectation. Transcription factors, which control the gene expression programmes specific to each tissue in a mammalian body, were found to display surprisingly high divergence in their binding locations between vertebrates. The DNA motifs they recognised were the same, but the exact locations of these bound motifs in the genomes of human, mouse or chicken were mostly different. How, then, does gene expression remain so stable? And what about whole regulatory elements: larger regions such as promoters and enhancers, able to bind collections of transcription factors combinatorially? Were these larger regions stable, despite displaying plasticity in their regulatory content, or were they evolving under a different regime altogether?
Figure 1. One of the cetaceans sampled for this study - a Sei whale which stranded at Druridge Bay, Northumberland in 2012. Image courtesy of CSIP-ZSL.
To answer these questions, Diego Villar in Duncan Odom's group at CRUK-CI and Camille Berthelot from Paul Flicek's group at EMBL-EBI worked together to build a large collection of liver tissue samples from different species, conduct a large set of functional genomics experiments and devise the appropriate analysis. Sample acquisition for comparative regulatory genomics—in this case from more than twenty species spanning much of the mammalian tree—is particularly challenging for non-model animals. Tissues have to be processed quickly after the animal's death in order to retain their integrity and so we worked in close collaboration with a diverse array of conservation and research programmes (such as the UK Cetacean Stranding Investigation Programme, see Figure 1). These connections allowed us to obtain post-mortem tissues from under-studied species alongside more common laboratory and livestock animals.
Analysing and comparing data from so many species is both challenging and computationally intensive (as well as fun). From the beginning of the project, we worked as a tight-knit unit, carefully designing the experimental work and analyses together in order to answer a fundamental question: how do regulatory elements evolve in mammalian genomes?
Figure 2. Empirical rates of promoter and enhancer evolution in mammalian liver. Adapted from Villar et al. 2015.
Three years, scores of meetings, and thousands of emails later, we published the first part of our work, which compared the evolution of promoters and enhancers in liver across twenty mammals (Figure 2). We found that they evolve in strikingly different ways: promoters, sitting close to genes, are relatively well-conserved; but enhancers, which often lie far from their target genes, evolve rapidly. In fact, up to 40% of the enhancers we detected showed activity in only a single species out of our twenty-species set. The natural next question was: do these pervasive changes affect gene expression, and if so, by how much?
In this issue of Nature Ecology & Evolution, we report the results of combining the evolutionary analysis of promoters and enhancers with quantitative transcriptional readouts from the same samples. Jointly analysing these two sources of data proved far from trivial, and involved many hours of exploratory analyses on both sides of our collaboration, as well as the design of appropriate normalisation and analysis methods. After much work, considerable hair-pulling and (a few) dead ends, we found that genes tend to retain the overall complexity of their surrounding regulatory landscape, where we define complexity as the number of regulatory elements associated with a gene. This suggests that the strength of regulatory elements is in their numbers, rather than in their exact locations or content (Figure 3). Nevertheless, regulatory elements active in many species did have stronger effects on gene expression, consistent with core regulatory regions being preferentially maintained by natural selection. Altogether, these results illuminate the gene expression contributions of regulatory DNA elements evolving under regimes of low selection and high plasticity.
Figure 3. Maintenance of regulatory complexity around the APOB gene for three representative species (human, mouse and dog). Blue and orange peaks correspond to promoters and enhancers active in each species’ liver, and green tracks to raw gene expression data. Adapted from Berthelot et al. 2017 (this issue).
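For readers who think in code, the notion of regulatory complexity used in this post (the number of regulatory elements associated with a gene) can be illustrated as a simple count per gene and species. The sketch below is only a toy illustration, not the analysis pipeline from the paper: the record layout, the `regulatory_complexity` function name and the element counts are all hypothetical and invented for the example.

```python
from collections import defaultdict

# Hypothetical toy records: (species, gene, element_type).
# These values are made up for illustration; they are NOT data from the study.
ELEMENTS = [
    ("human", "APOB", "promoter"),
    ("human", "APOB", "enhancer"),
    ("human", "APOB", "enhancer"),
    ("mouse", "APOB", "promoter"),
    ("mouse", "APOB", "enhancer"),
    ("mouse", "APOB", "enhancer"),
    ("dog", "APOB", "promoter"),
    ("dog", "APOB", "enhancer"),
]


def regulatory_complexity(elements):
    """Count the regulatory elements associated with each (gene, species) pair."""
    counts = defaultdict(int)
    for species, gene, _element_type in elements:
        counts[(gene, species)] += 1
    return dict(counts)


complexity = regulatory_complexity(ELEMENTS)

# Print one line per gene and species so the counts can be compared across species.
for (gene, species), n_elements in sorted(complexity.items()):
    print(f"{gene}\t{species}\t{n_elements}")
```

Note that the count ignores where each element sits in the genome. That is the point of the definition: two species can have similar complexity around a gene even when the individual enhancers are not the same ones.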