Pedigree-based and phylogenetic methods support surprising patterns of mutation rate and spectrum in the gray mouse lemur
Using a "trio" pedigree approach for comparing whole genome sequences in a family of mouse lemurs, we discovered a surprisingly high de novo mutation rate and unexpected patterns in the mutational spectrum.
From the time that Zuckerkandl and Pauling (1965) published their groundbreaking paper on molecular clocks, genetic sequence divergence has been used to estimate speciation dates across the Tree of Life. Though it is now well understood that molecular clocks tick at different rates among lineages, sophisticated methods have been developed to model this rate variation. Using phylogenetic methods, investigators can use the branch lengths of evolutionary trees (measured as numbers of substitutions) to estimate divergence times and, accordingly, situate speciation events in their proper temporal and geological context. But there is a fundamental challenge to the phylogenetic approach. Branch lengths are the product of time and molecular rate, and to uncouple the two, one must have some prior knowledge about one or the other. The most common approach for disentangling rate from time is to use fossil dates to "calibrate" the molecular clock. But what if there are no relevant fossils available? Such, alas, is the case for any biologist wishing to estimate divergence times for Madagascar's mammals, as their post-K-Pg fossil record is nonexistent ... and this is hardly a unique phenomenon given the many species throughout the Tree of Life that have scant or entirely absent fossil records. In such cases, we need to find a way to "free ourselves from fossils" when estimating the age of speciation events.
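The rate-versus-time problem described above comes down to one line of arithmetic, sketched here in Python. This is a minimal illustration, not the dating method used in the paper: it assumes a strict clock along a single lineage, and every number in it is a hypothetical placeholder rather than an estimate from our data.

```python
def rate_per_year(rate_per_generation: float, generation_time_years: float) -> float:
    """Convert a per-site, per-generation rate into a per-site, per-year rate."""
    if generation_time_years <= 0:
        raise ValueError("generation time must be positive")
    return rate_per_generation / generation_time_years


def divergence_time_years(branch_length: float, rate_per_site_per_year: float) -> float:
    """Branch length = rate x time, so time = branch length / rate.

    branch_length is in substitutions per site along one lineage.
    """
    if rate_per_site_per_year <= 0:
        raise ValueError("rate must be positive")
    return branch_length / rate_per_site_per_year


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
mu_per_generation = 1.0e-8   # mutations per site per generation (assumed)
generation_time = 4.0        # years per generation (assumed)
branch_length = 0.01         # substitutions per site (assumed)

yearly_rate = rate_per_year(mu_per_generation, generation_time)
age = divergence_time_years(branch_length, yearly_rate)
print(f"Estimated age: {age:.3g} years")
```

Doubling the assumed rate halves the estimated age, which is why an independent handle on the rate can stand in for a fossil calibration.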
Our project began innocently in the summer of 2016, around a dining room table, at a beach-side lab retreat. Visiting colleague Mario dos Reis enthused that by estimating de novo mutation rates across multiple species of lemurs, we could finally put a definitive age estimate on their origins in Madagascar. The number of germline mutations that occur in a single generation can be counted from the sequenced genomes of parents and their offspring (so-called "trio" studies), thus offering the means to estimate the mutation rate, and ultimately, divergence times (Tiley et al., 2020). The rest of us gathered there, including first author Ryan Campbell (a PhD student at the time, now postdoc), Peter Larsen (then a postdoc, now assistant professor), and Kelsie Hunnicutt (then lab technician, now PhD student) immediately jumped on board. We decided to start by sequencing and counting mutations in the genomes of mouse lemur families. It all sounded so easy.
Counting de novo mutations is far from trivial, however. Even the most accurate sequencing platforms available, which yield over 99.999% sequencing accuracy, can introduce as many as 28,000 sequencing errors in a typical 2.8 Gb primate genome. That is more than two orders of magnitude higher than the roughly 100 de novo mutations expected in a single individual. Errors can also arise from library preparation, sequence alignment, and genotyping, and somatic mutations can masquerade as germline ones, all conspiring to create a cruel reality wherein we are looking for needles in a haystack. Much of the work of detecting de novo mutations therefore requires computational approaches for avoiding false positives. Ryan was the resident computer whiz, so we promptly turned to him to lead the charge by incorporating the project into his PhD thesis on speciation genomics. From the beginning, our plan was to sequence a four-generation mouse lemur pedigree descending from great-grandparents. As it turned out, however, sequencing led to the startling discovery that the great-grandfather (Pesto) was not the great-grandfather at all!
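For readers who like to check the needle-in-a-haystack arithmetic, here is a back-of-the-envelope sketch in Python. It uses only the figures quoted above (a 2.8 Gb genome, 99.999% accuracy, roughly 100 expected de novo mutations) and deliberately treats each base as read once; real pipelines sequence each site many times and call genotypes from the combined reads, so this is an illustration of scale rather than a model of our workflow.

```python
def expected_sequencing_errors(genome_size_bp: float, per_base_accuracy: float) -> float:
    """Expected number of erroneous bases if every base is read exactly once."""
    if not 0.0 <= per_base_accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    return genome_size_bp * (1.0 - per_base_accuracy)


genome_size_bp = 2.8e9      # typical primate genome size quoted in the text
accuracy = 0.99999          # 99.999% per-base accuracy quoted in the text
expected_dnms = 100         # rough number of de novo mutations per individual

errors = expected_sequencing_errors(genome_size_bp, accuracy)
print(f"Expected sequencing errors: about {round(errors):,}")
print(f"Errors per true de novo mutation: about {round(errors / expected_dnms)}")
```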
Figure 1: Mouse lemurs are the world's smallest primates and usually give birth to twins or triplets (photo credit David Haring, ©Duke Lemur Center).
And here, we need to take a small detour to talk about the sex lives of mouse lemurs. In nature, mouse lemurs live in multi-male multi-female groups with male-male conflict playing a critical role in successful reproduction. In captive populations, such as that of the Duke Lemur Center, where our individuals were sampled, it is common practice for two males to be housed together in a cage that is next to, but not connected with, the receptive female's cage. As she comes closer to estrus, the males fight for her favors, with the crescendo being the moment that the intervening barrier is removed so that males and female can "interact," hopefully leading to a successful breeding. The husbandry staff carefully monitor the preceding battle with the presumption being that the victorious male must certainly be the successful sire. In Pesto's case, however, though he may have been victorious in battle, he was (apparently) unsuccessful in love, a discovery made only years later when his genome told the tale of his disappointment. Meanwhile, the clock was ticking on Ryan's dissertation, so we moved ahead with the pedigree in hand: a father, mother, daughter, and son.
Though we were prepared for the possibility that both the de novo mutation rate and spectrum in mouse lemurs might be different from those observed in humans and the few other anthropoid primates to have been tested thus far, we were more than a little surprised by the results: the calculated rate is the second highest yet recorded for a primate (after orangutan), the expected bias in paternally inherited mutations is virtually non-existent, and — most surprising of all — we observed only a modest overrepresentation of mutations at CpG-sites. Indeed, the results were so surprising that we urgently began consulting colleagues (lots of them!) with expertise greater than ours in the nuances of de novo mutation rate estimation. Their consensus: our results must be polluted with false positives. One colleague, however, offered an excellent idea for verifying our results, suggesting, "Why not double-check them in comparison to substitution rates?"
And so, we entered Phase II of the study by adding co-lead author George Tiley, as well as colleagues Jeff Thorne and Hui-Jie Lee, who could bring additional phylogenetic chops to the team. By conducting an independent analysis of context-dependent substitution types for mouse lemur and five additional primate species – for which de novo mutation rates were previously estimated – we were gratified to find that the substitution rate analyses were consistent with the de novo mutation spectrum. In other words, though our results were unexpected and novel, they not only held up well in comparison to substitution rates, but showed the power of implementing the dual approaches of phylogenetics and genomics in rate comparisons. Frankly, we felt satisfied that we could stop there. But thanks to constructive reviews, we further explored uncertainty in our pedigree-based results and brought them into phase with the rapidly evolving standards of de novo rate estimation. The most useful outcome of the study, we believe, is that it clearly demonstrates the extreme sensitivity of rate measurement to bioinformatic filtering.
Figure 2: Table from the paper showing response of rate estimate to different levels of bioinformatic filtering.
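To see why the estimate is so sensitive to filtering, it helps to write the rate out explicitly. The Python sketch below uses one common formulation for trio studies, in which the count of candidate mutations is divided by twice the number of callable sites (the offspring carries two haploid genomes), with optional corrections for false positives and false negatives. It is an assumption-laden illustration, not necessarily the exact calculation in our paper, and the filter scenarios and their numbers are invented for the example.

```python
def estimate_rate(n_candidates: int,
                  callable_sites: float,
                  false_positive_rate: float = 0.0,
                  false_negative_rate: float = 0.0) -> float:
    """Per-site, per-generation mutation rate from a single trio.

    rate = n * (1 - FPR) / (2 * callable_sites * (1 - FNR))
    """
    if callable_sites <= 0:
        raise ValueError("callable_sites must be positive")
    if not 0.0 <= false_positive_rate <= 1.0:
        raise ValueError("false_positive_rate must be between 0 and 1")
    if not 0.0 <= false_negative_rate < 1.0:
        raise ValueError("false_negative_rate must be in [0, 1)")
    true_mutations = n_candidates * (1.0 - false_positive_rate)
    effective_sites = 2.0 * callable_sites * (1.0 - false_negative_rate)
    return true_mutations / effective_sites


# Hypothetical filter settings: (candidate mutations kept, callable sites kept).
# These numbers are made up and are NOT taken from the table in the paper.
scenarios = {
    "lenient filters": (150, 2.6e9),
    "strict filters": (60, 1.8e9),
}

for name, (n_candidates, callable_sites) in scenarios.items():
    rate = estimate_rate(n_candidates, callable_sites)
    print(f"{name}: {rate:.2e} mutations per site per generation")
```

Stricter filters shrink both the numerator and the denominator, but rarely in proportion, which is one way a single family can yield quite different rates depending on the choices made.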
Finally, you might ask, was all of this worth it with regard to divergence-time estimation? The answer is a resounding "yes!" When using our directly estimated mutation rate, we recovered divergence dates that are considerably younger than those calculated in a previous study (Yoder et al., 2016) wherein the rate was treated as an average of mouse (the rodent) and human rates. The younger dates certainly make more sense than those previously calculated in that they place the mouse lemur radiation squarely within the timeframe of Pleistocene climate cycles — a pattern that is seen broadly across taxa and geographic regions around the globe. Moreover, given the relative precision of the directly estimated rate, there is much less statistical uncertainty around the age estimates.
In summary, getting this paper into press has been quite a journey, but one that we are well satisfied with, having learned that surprising results can ultimately be the best kind.
You can have a look at our paper here!
Figure 3: Using the de novo rate estimated directly from mouse lemurs has a significant impact on divergence time estimates as shown in this figure from the paper.
Other papers referred to in this post:
Tiley GP, Poelstra JW, dos Reis M, Yang Z, Yoder AD. 2020. Molecular clocks without rocks: new solutions for old problems. Trends in Genetics 36: 845-856.
Yoder AD, Campbell CR, Blanco MB, Dos Reis M, Ganzhorn JU, Goodman SM, Hunnicutt KE, Larsen PA, Kappeler PM, Rasoloarison RM, Ralison JM, Swofford DL, Weisrock DW. 2016. Geogenetic patterns in mouse lemurs (genus Microcebus) reveal the ghosts of Madagascar's forests past. Proc Natl Acad Sci U S A 113: 8049-8056.
Zuckerkandl E, Pauling L. 1965. Molecules as documents of evolutionary history. J Theor Biol 8: 357-366.