Reconstructed timescales for the very earliest stages of life's evolutionary history have traditionally been highly unstable. This was a consequence of a purely fossil-based approach to the study of early life, in which each new discovery or reappraisal shifted the timeline. Unfortunately, the fossil record becomes ever sparser as we move back through Earth’s history. The problem is most acute in the Hadean and Archaean (>2.5 billion years ago), for which there is a dearth of outcrops to study. Within these outcrops there are few things that we would even consider to be fossils, let alone a concrete record of any particular lineage. These difficulties contrast with the increasing wealth of molecular data from extant organisms across the tree of life. Accordingly, we set out to provide a new outlook on the evolution of the earliest branches of life by integrating genomic and fossil data using a “relaxed molecular clock” approach. This is loosely based on the idea that the number of differences between the genomes of two living species (say, a human and a bacterium) is proportional to the time since they shared a common ancestor, with the "relaxed" part allowing the rate of change to vary among lineages. The result is what is called a timetree of life: a dated tree of life.
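To make the clock idea concrete, here is a minimal Python sketch of the simplest possible version, a strict clock with a single constant rate. It is an illustration only, not the relaxed Bayesian method used in the study. The function names, the toy sequences and the rate value are all hypothetical.

```python
def p_distance(seq_a, seq_b):
    """Proportion of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    differences = sum(a != b for a, b in zip(seq_a, seq_b))
    return differences / len(seq_a)


def divergence_time(distance, rate):
    """Strict-clock age of the common ancestor of two living species.

    distance: differences per site between the two sequences
    rate: changes per site per unit time along ONE lineage (assumed constant)

    Both lineages accumulate changes independently after the split,
    so the distance grows at twice the per-lineage rate.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    return distance / (2.0 * rate)


# Toy, made-up data: two aligned sequences differing at 2 of 8 sites.
d = p_distance("ACGTACGT", "ACGTTCGA")
# Hypothetical rate of 0.125 changes per site per billion years.
t = divergence_time(d, rate=0.125)
print(d, t)
```

A relaxed clock replaces the single `rate` with rates that can differ from branch to branch, and real analyses also correct for multiple changes at the same site, which this raw proportion of differences ignores.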
In this study we combined protein sequence data from 102 living species spanning the three major lineages of life (Eubacteria, Archaebacteria and Eukaryota) with information from 9 fossils, which were used to calibrate 11 nodes in the tree of life. Calibrations anchor the analysis in absolute time. They specify that a node must be older than a certain fossil because, if that fossil exists, the lineage it belongs to must already have existed as well. For the same reason, the fossil record, though obviously indispensable, can only give a minimum age for a clade, never its true age. Molecular clocks help us to see into the time before the oldest fossils. They matter most for the very earliest parts of the history of life, where there is the least fossil evidence to use, but this is precisely where they have been applied the least. As you might expect, the credibility intervals (the uncertainty) of the age estimates from our combined molecular and fossil analysis are in general rather wide, especially for the parts of the tree for which we have the least data, whether fossil or molecular. To me, this is encouraging because it suggests that the clocks are not providing a false degree of confidence. It also means that the timescale we propose can be refined and updated as new living lineages and, especially, new early fossils are discovered.
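The logic of a fossil calibration can be sketched in a few lines of Python. This is a deliberately simplified illustration with made-up ages and hypothetical function names. It treats the fossil as a hard minimum, whereas Bayesian dating software generally expresses calibrations as prior probability distributions rather than a simple filter.

```python
def apply_minimum_bound(candidate_ages, fossil_age):
    """Keep only node ages that are at least as old as the fossil.

    A fossil assigned to a lineage shows the lineage already existed,
    so the node cannot be younger than the fossil.
    """
    return [age for age in candidate_ages if age >= fossil_age]


def age_range(ages):
    """Youngest and oldest ages remaining after calibration."""
    if not ages:
        raise ValueError("no candidate ages are compatible with the fossil")
    return (min(ages), max(ages))


# Made-up candidate ages for one node, in billions of years.
candidates = [3.1, 3.6, 2.9, 3.4]
# Hypothetical fossil dated to 3.0 billion years.
compatible = apply_minimum_bound(candidates, fossil_age=3.0)
print(compatible, age_range(compatible))
```

Note that the fossil removes only the ages that are too young. Nothing in the fossil itself limits how much older the node could be, which is why the molecular data are needed to constrain the older end of the range.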
Our results include the output of analyses run under a variety of different parameters, and they show that the last universal common ancestor (LUCA) of life existed before an event known as the late heavy bombardment (>3.9 billion years ago). Crown-group Eubacteria and Archaebacteria then appear almost one billion years later, in the Palaeoarchaean. Our timescale agrees with those previously published for the eukaryotes, and it yields some interesting results on the origin of two bacterial lineages that are important for understanding the evolution of complex eukaryotic cells. First, crown Cyanobacteria radiate after the Great Oxidation Event (GOE), suggesting that photosynthesis evolved along the stem-cyanobacterial lineage. Second, the credibility interval for total-group Alphaproteobacteria, to which the free-living ancestor of the mitochondrion belonged, overlaps with that for the origin of crown eukaryotes, suggesting a fast radiation of the eukaryotic lineages in the wake of the mitochondrial symbiosis. The hope is that this project will act as a starting point for other research that uses integrative methods to study the very earliest lineages of life.
Our paper in Nature Ecology & Evolution can be found here.