Domestication and genome evolution under the light of 1,011 yeast natural isolates
Elucidating the origins of the astonishing phenotypic diversity observed in natural populations is a major challenge in biology. In all species, genetic diversity is the raw material for phenotypic diversity. Genome-wide investigation of the patterns of polymorphism in a large number of individuals is a first essential step to dissect the relationship between genotype and phenotype.
At the beginning of 2013, we initiated a comprehensive survey of polymorphisms and phenotypes in more than 1,000 isolates of the model yeast Saccharomyces cerevisiae (http://1002genomes.u-strasbg.fr/). At that time, yeast population genomic studies were limited to a small number of isolates (fewer than 100 strains), in stark contrast to the large-scale projects conducted in other species, including Arabidopsis thaliana (http://1001genomes.org/) and humans (http://www.1000genomes.org/). Consequently, our understanding of the landscape of genetic variation in yeast was very limited. Our objective was to obtain a nearly exhaustive view of the genome architecture and genetic variants of the species, in order to enable genome-wide association studies.
This large-scale study was conducted by my team (Université de Strasbourg / CNRS), the team of Gianni Liti (Université Côte d'Azur, CNRS, INSERM, IRCAN) and the Genoscope (Institut de biologie François Jacob du CEA, CNRS, Université d’Evry, Université Paris-Saclay) within the framework of a flagship project selected by the France Génomique program, with the goal of generating a detailed map of genetic variation in the classic model yeast Saccharomyces cerevisiae. The whole-genome sequencing of 1,011 natural isolates, together with the accompanying phenotyping effort, has produced one of the most complete pictures of population-level natural genetic and phenotypic diversity in any eukaryotic model system. This study was published in the journal Nature (https://www.nature.com/articles/s41586-018-0030-5).
The sequenced strains were collected worldwide to sample as much diversity as possible, both in terms of geographical origin (covering all continents) and ecological source (human-related sources such as dairy products, wine, sake and bread, as well as wild niches such as trees, insects, flowers and soil). These isolates were also phenotyped: their growth fitness was measured under different conditions that affect various physiological and cellular responses, giving a global view of the phenotypic landscape of the species.
Altogether, the datasets generated made it possible to highlight key points of the evolutionary history of the species, of its genome evolution and of the impact of that evolution on the genotype-phenotype relationship. First, the exquisite detail with which the pattern of polymorphism was examined made it possible to dissect the history of the species. The study provided novel and clear evidence of an East Asian origin and strongly suggests that S. cerevisiae began to disperse throughout the world from a single out-of-China event. As a result of human activity, S. cerevisiae then underwent substantial genomic and phenotypic changes during multiple, independent domestication events associated with specific human processes (e.g. wine, sake and beer fermentations). Interestingly, these domestication events affected genome evolution in different ways. Whereas the sake and wine populations are characterized by low genetic diversity, beer populations show higher genetic diversity as well as more complex genomic diversity. Furthermore, human-related environments foster the expansion and loss of genes, resulting in rampant variation in genome content. By contrast, wild isolates share a similar genome content, and their genetic diversity is mainly generated by the accumulation of mutations.
Second, the study also provided an overview of the respective importance of the various genomic features (e.g. ploidy, aneuploidy, introgressions, genetic variants) that shape genome evolution and, consequently, the species-wide phenotypic landscape. As an example, it was possible to define the core genome (i.e. 4,940 genes present in all 1,011 sequenced strains) and the variable genome (2,908 variable genes present in only a fraction of the population). Gene content varies among isolates, with a set of dispensable genes subject to segregation, introgression (from closely related species) and horizontal gene transfer. For instance, horizontal gene transfer events are mostly restricted to S. cerevisiae isolates from domestic fermentative environments. In addition, ploidy and aneuploidy levels vary between subpopulations and depend on their ecological origin.
Finally, this study shed new light on the genotype-phenotype relationship in a natural population. The S. cerevisiae species presents a high level of genetic diversity, much greater than that found in humans. Among the 1,011 genomes, most of the detected genetic polymorphisms are very low-frequency variants, a trend similar to the one observed in human populations, which raises questions about the impact and importance of rare variants on the phenotypic landscape within a population. Genome-wide association analyses highlighted the importance of copy number variants, which explain a larger proportion of the phenotypic variance and have greater effects on phenotype than single nucleotide polymorphisms. Beyond the analyses reported in the paper, this resource will enable powerful genetic and genomic studies in a key model system.
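To make the distinction between core and variable genome concrete, here is a minimal Python sketch of the definition given above: a gene is core if it is present in every strain, and variable if it is present in only some of them. This is an illustration of the definition only, not the pipeline used in the study; the function name `split_pangenome`, the gene names and the strain names are all hypothetical, and the step of calling gene presence or absence from sequencing reads is assumed to have been done already.

```python
from typing import Dict, Set, Tuple


def split_pangenome(
    presence: Dict[str, Set[str]], n_strains: int
) -> Tuple[Set[str], Set[str]]:
    """Split genes into core and variable sets.

    presence maps each gene to the set of strains in which it was detected.
    A gene is core if it is found in all n_strains strains, and variable if
    it is found in at least one strain but not in all of them.
    """
    core = {g for g, s in presence.items() if len(s) == n_strains}
    variable = {g for g, s in presence.items() if 0 < len(s) < n_strains}
    return core, variable


# Toy presence/absence data (hypothetical genes and strains).
strains = {"S1", "S2", "S3"}
presence = {
    "geneA": {"S1", "S2", "S3"},  # present everywhere -> core
    "geneB": {"S1", "S3"},        # present in a subset -> variable
    "geneC": {"S2"},              # present in one strain -> variable
    "geneD": {"S1", "S2", "S3"},  # present everywhere -> core
}

core, variable = split_pangenome(presence, len(strains))
print(sorted(core))      # ['geneA', 'geneD']
print(sorted(variable))  # ['geneB', 'geneC']
```

With real data, the same split applied to the 1,011 strains would be expected to yield the core and variable gene counts quoted above, provided the presence/absence calls match those of the study.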