Identifying recently evolved genes in yeast

Genes born de novo from previously non-genic sequences are important for explaining recent adaptations, but they are often missing from genome annotations. By comparing the transcriptomes of 11 different species, we identified more than 200 transcripts of de novo origin in baker’s yeast.

Our group has been investigating de novo gene birth for over 15 years. We have been focusing on humans and mice, but this time was going to be different. I had just given an internal seminar on de novo genes at the Barcelona Biomedical Research Park (PRBB) when Lucas Carey, one of the new PIs in the building, approached me to ask if I was interested in doing something related to baker’s yeast. I had not considered it before, since several groups were already working with yeast, but it seemed like a good opportunity to start a new collaboration and explore a system that we could easily manipulate in vivo.

Terrace of the PRBB next to the Auditorium. A nice place to sit and discuss science after seminars.

It was 2015 and a new MSc student in the group, Will Blevins, was looking for a research project related to de novo genes, which could potentially extend into a PhD. I proposed that he could start to work with yeast and he liked the idea. Lucas and I became his co-supervisors.

After finishing his MSc, Will decided to stay for a PhD. During his time at the PRBB he engaged in a variety of science outreach activities. Here he is introducing the concepts behind the research in our lab at the 2018 Youth Mobile Festival (YoMo).

The plan was to sequence the transcriptomes of different species and compare the sequences and genomic positions of the expressed transcripts; in other words, to characterize de novo gene birth without having to rely on the existing annotations. As Lucas’ group was working on gene regulatory networks, we also thought about using reporter constructs to measure the promoter activity of new genes and compare it with that of the same region in species that did not express the gene. These experiments never came to fruition, so instead we did some ribosome profiling experiments with Juana Diez’s group, a third collaborating lab in the same building.

Toy example to explain our research into de novo genes using colored wooden shapes, which represent a set of genes from several closely related species. Over evolutionary time, some genes accumulate subtle changes, modifying their shape. In other cases, shapes appear or disappear altogether, making them unique to one species.

In organisms with large genomes, such as mammals, there is a lot of non-coding sequence that can be used to build new genes. However, in species with very compact genomes, like yeast, there is not much “spare” genomic sequence. For example, in baker’s yeast (Saccharomyces cerevisiae), our reference species, about 70% of the genome is occupied by coding sequences. That doesn’t leave much room for the trial-and-error process of de novo gene birth. What we discovered in our study was that many new transcripts originated in regions that already had genes, but in the opposite orientation. So, in this highly efficient genome, the “spare” DNA strand was being used to harbor new transcripts. Not only that, but a significant fraction of the new antisense transcripts were being translated into proteins.
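To make the idea of an antisense transcript concrete, here is a minimal Python sketch of how one might flag a transcript that overlaps an annotated gene on the opposite strand. It is only an illustration, not the pipeline used in the study: the `Feature` class, the helper functions, and all names and coordinates are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """A genomic interval; all example values below are made up."""
    name: str
    chrom: str
    start: int   # 1-based, inclusive
    end: int     # inclusive
    strand: str  # "+" or "-"


def overlaps_antisense(transcript: Feature, gene: Feature) -> bool:
    """True if the two features share positions but lie on opposite strands."""
    same_chrom = transcript.chrom == gene.chrom
    opposite_strand = transcript.strand != gene.strand
    share_positions = transcript.start <= gene.end and gene.start <= transcript.end
    return same_chrom and opposite_strand and share_positions


def antisense_hosts(transcript: Feature, genes: list[Feature]) -> list[str]:
    """Names of annotated genes that the transcript overlaps in antisense."""
    return [g.name for g in genes if overlaps_antisense(transcript, g)]


# Hypothetical annotation and a hypothetical newly assembled transcript.
genes = [
    Feature("geneA", "chrI", 1000, 2500, "+"),
    Feature("geneB", "chrI", 4000, 5200, "-"),
]
novel = Feature("novel1", "chrI", 1800, 2300, "-")

print(antisense_hosts(novel, genes))  # ['geneA']
```

In this toy case the new transcript sits inside geneA but on the other strand, so geneA would be its “host” gene.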

We estimated that more than 200 transcripts are likely to have originated de novo in the recent evolution of the baker’s yeast lineage, which represents about 5% of all the transcripts that we detected. It appears that new genes in antisense orientation tend to show an expression pattern similar to that of the gene they overlap. We did not expect this, and the implications are quite profound: it means that many of the new genes in yeast do not begin as independent entities, as is generally assumed, but are likely to be involved in cellular processes similar to those of the “host” gene. The coding sequences frequently overlap and so they must co-evolve too, but this is something that we will leave for the future.

Mar Albà

ICREA Research Professor, Hospital del Mar Research Institute (IMIM)

I am interested in the evolution of genomes and proteins and in using computational tools to analyze large amounts of data. Our group has worked on a range of subjects over the past 15 years, including the evolution of new genes, the impact of repetitive sequences on protein evolution, the detection of natural selection signatures and the identification of changes in gene expression programs. Our webpage is evolutionarygenomics.imim.es