Evolutionary trade-offs shaping the immune system of vertebrates
The extreme polymorphism of MHC immune genes is one of the textbook cases of balancing selection and a favourite example of many evolutionary biologists. The system, however, holds an often neglected paradox: why are there so many variants in populations, and so few in particular individuals?
The major histocompatibility complex (MHC) plays a key role in distinguishing healthy molecules of ‘self’ from potentially harmful ‘non-self’ or ‘altered-self’ molecules that signal an infection or a cancerous transformation. MHC molecules specifically bind antigens and present them to T cells, which in turn initiate an adaptive immune response: against viruses or bacteria, but also against transplanted organs (hence the ‘histocompatibility’ in the name), as these are likewise recognized as ‘non-self’.
Finding a suitable organ donor is so difficult because the MHC is extremely polymorphic: numerous variants of these molecules are found in populations, and thousands of them have been described in humans. Of course, the remarkable variability of the MHC did not evolve to make the life of transplantologists difficult – it is the result of an evolutionary arms race between hosts and pathogens. Novel or rare MHC variants are favoured because pathogens rapidly adapt to evade recognition by the MHC alleles prevailing in a population. Another mechanism maintaining polymorphism stems from the advantage of being a heterozygote – each MHC variant can bind only a limited spectrum of antigens, so having two different alleles instead of just one is a clear gain. It seems only logical that higher within-individual MHC diversity (which could be achieved by gene duplication and diversification) should be similarly advantageous: each individual could then recognize antigens from most possible pathogens. Surprisingly, though, individuals usually possess just a few functional loci, which can accommodate only a tiny fraction of the presumably adaptive diversity present at the population level.
An explanation of this apparent paradox was proposed over 30 years ago: an evolutionary trade-off between an increased potential for pathogen recognition and mechanisms preventing autoimmune disease. During maturation, T cells go through a process of negative selection, which removes cells that strongly recognize self-antigens bound to MHC. The more MHC variants an individual has, the more antigens (both self and non-self) it can present. As a result, more T cells could turn out to be self-reactive and would have to be deleted to prevent autoimmunity. This is a dire scenario, as holes in the T cell repertoire can be very dangerous to one’s health – a striking example is AIDS, a deadly disease characterized by the loss of large portions of the T cell pool.
This elegant explanation – later dubbed the T-cell receptor (TCR) depletion hypothesis – was widely accepted, even though some mathematical models disputed it and indirect tests of the hypothesis (correlating the number of MHC genes with immunocompetence and/or fitness) yielded mixed results. What was conspicuously missing was a direct test that could answer the most basic question: do individuals with more MHC variants have smaller TCR repertoires? Only recently, with the advent of next-generation, high-throughput sequencing, did the problem become technically tractable – and it turned into the subject of my PhD thesis. My work was part of a large project on the evolution of MHC gene number, led by Prof. Jacek Radwan at Adam Mickiewicz University in Poznan, Poland. I was thrilled by the idea of testing this long-standing evolutionary hypothesis – yet it soon turned out to be easier said than done.
We needed to look outside the realm of model organisms to find a species characterized by high between-individual variation in the number of MHC genes – an ideal feature for testing the hypothesis in question. We found it in a small rodent, the bank vole (Myodes glareolus). Yet using a non-model species meant that I needed to adapt and optimize molecular protocols available for mice and humans so that they would work with our voles.
During multi-locus MHC typing, we had to tell genuine variants from sequencing artefacts, and there was no reference database to guide us. Our bioinformatician, Alvaro Sebastian, did a great job developing the AmpliSAT software suite, which can handle amplicon sequencing data from non-model species and sieve out artefacts. It worked nicely for MHC, but the complexity of TCR repertoires called for a novel way to deal with sequencing errors – unique molecular identifiers (UMIs). Yet at first they only led to failed sequencing runs and even more troubleshooting... A lengthy investigation revealed the source of our troubles: the addition of UMIs somehow changed a previously well-optimized reaction and generated an abundance of short, non-specific products, entangled in our sequencing library in a way that made them invisible in gel electrophoresis. I learned the hard way that while wit and acumen are useful when doing research, grit and perseverance are obligatory.
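The core idea behind UMI-based error correction can be sketched in a few lines of Python. This is a minimal illustration of the general principle, not our actual analysis pipeline; the function name, UMI length, and toy reads are all hypothetical. Each DNA molecule is tagged with a random UMI before amplification, so reads sharing a UMI should come from the same original molecule – any position where they disagree is likely a sequencing or PCR error, which a per-position majority vote removes:

```python
from collections import Counter, defaultdict

def umi_consensus(reads, umi_len=10):
    """Group reads by their UMI prefix and collapse each group into a
    per-position majority-vote consensus. Random errors (rare within a
    UMI group) are voted out; true variants (shared by the group) survive.
    Illustrative sketch only -- real pipelines also handle UMI errors,
    indels, and quality scores."""
    groups = defaultdict(list)
    for read in reads:
        umi, seq = read[:umi_len], read[umi_len:]
        groups[umi].append(seq)
    consensus = {}
    for umi, seqs in groups.items():
        length = min(len(s) for s in seqs)  # trim to shortest read
        consensus[umi] = "".join(
            Counter(s[i] for s in seqs).most_common(1)[0][0]
            for i in range(length)
        )
    return consensus

# Three reads tagged with the same (hypothetical) UMI "ACGTACGTAC";
# the second carries a sequencing error (G -> T at position 3):
reads = [
    "ACGTACGTAC" + "TTGGCCAA",
    "ACGTACGTAC" + "TTGTCCAA",  # error
    "ACGTACGTAC" + "TTGGCCAA",
]
print(umi_consensus(reads))  # {'ACGTACGTAC': 'TTGGCCAA'}
```

Because every consensus sequence corresponds to one original molecule, this approach also removes PCR amplification bias, which matters when estimating repertoire sizes rather than just detecting variants.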
Finally, we got our answer – and it was not exactly what the TCR depletion hypothesis had predicted! We showed that more variants of MHC class I (which presents antigens from intracellular pathogens such as viruses), but not of MHC class II (which presents antigens from extracellular pathogens: most bacteria, fungi, and parasitic worms), correlated with smaller TCR repertoires. While we were glad to provide partial support for the tested hypothesis, it was even more exciting to discover something that could not have been predicted by the mathematical models and was missed by the indirect tests. We still don’t know what caused the disparity, but maybe this is the most inspiring part – opening up new directions for research. To check out the details of our work and our suggestions for why the two MHC classes might differ, look up our new PNAS article!