Duplicate divergent mitochondrial genomes in the Tuatara with complex organization
This study, spanning two decades (March 2002 to January 2021), involved several layers of investigation that led to the discovery of duplicate divergent mitochondrial genomes in the Tuatara. The Tuatara represents an ancient lineage of reptiles about 250 million years old and has survived on 40 small islands in New Zealand.
Being large-bodied and living in cool environments sets the Tuatara apart from other reptiles in its life history. The cell's mitochondria are responsible for metabolism and contain their own small circular genome, which is typically inherited only maternally. The animal mitochondrial genome normally consists of 37 genes (13 protein-coding, 2 rRNA, and 22 tRNA), with a relatively stable arrangement among vertebrates. The deep lineage of the Tuatara posed an intriguing question: what might have happened to its mitochondrial genome (mt-genome) along the way, in isolation and in a cool environment?
While a Research Scientist at the DOE Joint Genome Institute (JGI) in 2002, I amplified a section of the mt-genome from the COIII gene to the 12S rRNA gene and obtained two PCR fragments of different sizes. Unable to separate them, I shotgun-sequenced them together, producing 768 Sanger sequencing reads. The reads assembled into two contigs containing different genes; combined, they included the genes for ND5, tRNA-Ser(AGY), and tRNA-Thr, with divergent sequence in the second half of the ND5 gene. This led me to think there may be two mt-genomes in the Tuatara. Shortly after, in 2003, Rest et al. published the complete mt-genome of the Tuatara, reporting two copies of the replication origin, known as the Control Region, but missing the genes listed above that I had sequenced.
In 2012, Neil Gemmell of the University of Otago in New Zealand contacted me regarding work on the Tuatara Genome Project; the whole genome has since been published in Nature (2020). As part of this project I explored detailed aspects of the mt-genome using large genome-scale Illumina shotgun data. Stephan Pabinger, now in Austria, assembled the reads, but the first assemblies did not include the “missing genes”. After six rounds of assembly, with me providing template sequences of “the missing genes”, Stephan was able to assemble the complete mt-genome, but it had three duplicated, nearly identical Control Regions. I was excited but concerned about whether it was correct, as this assembly was built from small pieces of DNA put back together on a computer.
In 2014 an unprecedented opportunity arose when the Peralta Community College Genomics Laboratory was selected as the only institution at the 4-year level or below to get early access to the then-unmarketed long-read Oxford Nanopore DNA sequencing technology. Leading this effort was Peralta student Charles Barbieri, who sequenced the entire mt-genome, with both DNA strands of the double helix connected at one end, in a single reaction. This unprecedented result confirmed the mt-genomic architecture with “the missing genes” and three Control Regions. At this time another 34 Tuatara mt-genomes were reported that did not include the “missing genes” and had only two Control Regions. They therefore matched the Rest et al. mt-genome, which gave us great concern that an incorrect template mt-genome was still being copied more than a decade later.
In 2018, Neil Gemmell became aware that Dan Mulcahy and associates at the Smithsonian Institution had found the missing ND5 gene in a published transcriptome library. Dan, who had been my student at JGI doing complete mt-genomic work at the time of the first work on this project, was a natural collaborator. My Peralta students Charles Barbieri, Dustin Demeo, and Aaron Elliott and I were still wondering whether there might actually be two mt-genomes in the data we were evaluating and databasing (Illumina and Oxford Nanopore).
Working with Dan Mulcahy, Vanessa Gonzalez, and their high school student Ella Buring, I sent tissue samples to the Smithsonian Institution from animals at the St. Louis Zoo and Living Earth Collaborative for long-read PacBio sequencing in search of the second mt-genome. While this effort failed to find the second mt-genome, it did provide a complete mt-genome with “the missing genes” and three Control Regions from a southern population on Stephens Island, thereby covering the geographic ends of the Tuatara distribution.
In the search for a second mt-genome, Dan Mulcahy and I were reassembling the mt-genome from Stephan Pabinger’s reads when Dan found a divergent assembled section of 1.1 kb in the rRNA genes and said, “Bob, I think there are two mitochondrial genomes.” I looked at the 1.1 kb section and recognized the sequence as mitochondrial rather than from the nuclear genome. This is when I realized we were in business: we saw 10% sequence divergence with strong strand bias, so I told Dan, “This really is the second mt-genome.”
I was ecstatic as Dan and I started putting sections of the mt-genome together and sorting them between the two mt-genome copies. We needed more short Illumina reads, so we had Stephan Pabinger mine every Illumina sequencing run conducted in the Tuatara Genome Project for mitochondrial DNA. This allowed Dan and me to put together the entire second mt-genome, which turned out to be present at one-seventh the concentration of the dominant first molecule. We repeated the whole process a second time to check everything and obtained the same result.
To further document the second mt-genome, we evaluated long-read Oxford Nanopore data using a 5% divergence criterion, because those data are prone to skipped bases during reading. After Charles Barbieri did two additional Oxford Nanopore sequencing runs, we were able to cover 96% of the second mt-genome, with the first mt-genome 100% covered in a single sequencing reaction.
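A divergence criterion of this kind can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the function names, the choice to ignore gapped and ambiguous columns (so that skipped bases are not counted as differences), and the assumption that the read has already been aligned to the reference are all illustrative.

```python
def divergence(aligned_read: str, aligned_ref: str) -> float:
    """Fraction of mismatched bases between two pre-aligned sequences.

    Assumption: columns with a gap ('-') or an ambiguous base ('N') in
    either sequence are ignored, so skipped bases do not count as mismatches.
    """
    if len(aligned_read) != len(aligned_ref):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    mismatches = 0
    for a, b in zip(aligned_read.upper(), aligned_ref.upper()):
        if a in "-N" or b in "-N":
            continue  # skip gapped or ambiguous columns
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return mismatches / compared


def within_criterion(aligned_read: str, aligned_ref: str,
                     max_divergence: float = 0.05) -> bool:
    """True if the read diverges from the reference by no more than the
    threshold (0.05 mirrors the 5% criterion mentioned in the text)."""
    return divergence(aligned_read, aligned_ref) <= max_divergence
```

Under these assumptions, a read with one mismatch in 25 comparable positions (4%) would be accepted as matching a reference, while a read at 10% divergence, the level seen between the two mt-genomes, would not.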
Confident of the second mt-genome, and on the advice of a Nature editor, we explored selection on the encoded mitochondrial membrane proteins. Lara Urban (New Zealand) conducted a detailed analysis that found positive selection on amino acids in transmembrane regions, pointing to a functional basis for the changes in the second mt-genome. This finding suggests that having two mt-genomes may provide an adaptive advantage for a large-bodied, cold-blooded reptile living in a cool environment.
Macey, J.R. et al. Evidence of two deeply divergent co-existing mitochondrial genomes in the Tuatara reveals an extremely complex genomic organization. Commun. Biol. 4, 116 (2021). https://doi.org/10.1038/s42003-020-01639-0
Rest, J.S. et al. Molecular systematics of primary reptilian lineages and the Tuatara mitochondrial genome. Mol. Phylogenet. Evol. 29, 289–297 (2003). https://doi.org/10.1016/S1055-7903(03)00108-8
Gemmell, N.J. et al. The tuatara genome reveals ancient features of amniote evolution. Nature 584, 403–409 (2020). https://www.nature.com/articles/s41586-020-2561-9
Mohandesan, E., Subramanian, S., Millar, C.D. and Lambert, D.M. Complete mitochondrial genomes of Tuatara endemic to different islands of New Zealand. Mitochondrial DNA 26, 25–26 (2015; online 2013). https://doi.org/10.3109/19401736.2013.840613
Subramanian, S., Mohandesan, E., Millar, C.D. and Lambert, D.M. Distance-dependent patterns of molecular divergences in Tuatara mitogenomes. Sci. Rep. 5, 8703 (2015). https://doi.org/10.1038/srep08703
Miller, H.C., Biggs, P.J., Voelckel, C. and Nelson, N.J. De novo sequence assembly and characterisation of a partial transcriptome for an evolutionarily distinct reptile, the tuatara (Sphenodon punctatus). BMC Genomics 13, 439 (2012). https://doi.org/10.1186/1471-2164-13-439