Since the very beginning of life, organisms have needed a way to store and pass on information. Current knowledge suggests that the first form of genetic information was stored in ribonucleic acid (RNA). Over time, life invented DNA, a chemically more stable molecule, for long-term information storage and faithful inheritance across generations. This genetic information encodes the instructions for all essential cellular processes. To preserve and transmit this information, life evolved a replication machinery to copy DNA. But surprisingly, although DNA replication performs the same function in all organisms, the machinery itself does not originate from a single ancestral source. Instead, DNA replication appears to be a striking case of convergent evolution, where life independently evolved similar solutions more than once.
What is DNA replication?
A meme I often see on social media says: “Cells multiply by dividing.” This is scientifically accurate but incomplete: before a cell can divide, it must ensure that each daughter cell receives a complete copy of its DNA.
To achieve this, cells must replicate their DNA before division. This is a complex process involving many proteins, including helicases, which unwind the DNA double helix to expose the template strands; primases, which lay down short RNA primers to give DNA polymerases a starting point; and DNA polymerases, which extend those primers to synthesize the new DNA strands.
By copying the genetic code, cells ensure that biological information is passed from one generation to the next. At the same time, mutations that occur during replication provide the raw material for evolution. These changes introduce variability on which natural selection can act, driving the long-term evolution of life.
DNA Replication: A Tale of Two Worlds
All known cellular life is thought to descend from a Last Universal Common Ancestor (LUCA). After LUCA, life diverged into two major branches: Bacteria and Archaea. Later, a symbiotic event occurred in which an archaeal cell is thought to have engulfed a bacterium. The bacterium became an endosymbiont, the ancestor of mitochondria, and over time this relationship gave rise to a third branch: the Eukaryotes. Unlike their prokaryotic relatives, eukaryotes evolved compartmentalized cells, with membrane-bound organelles that separate different biological processes.
Given this shared ancestry, one might expect that all core processes – DNA replication, transcription, and translation – were already present in LUCA and simply diverged afterward. This is true for transcription and translation: their molecular machinery is deeply conserved across all domains of life. But DNA replication is the exception.
A Case of Convergent Evolution
Evidence from comparative genomics strongly suggests that DNA replication evolved independently at least twice. The key enzymes differ fundamentally between Bacteria on one side and Archaea and Eukaryotes on the other. In Bacteria, the main replicative DNA polymerase is Pol III, a C-family enzyme, while in Archaea and Eukaryotes the replicative polymerases belong mainly to the B-family (many archaea also use a D-family polymerase). The replicative helicase is DnaB in Bacteria and the MCM complex in Archaea and Eukaryotes; the two are not considered homologous as helicases. The primases that initiate replication, DnaG in Bacteria and the archaeo-eukaryotic primase in the other two domains, also differ in structure and evolutionary origin.
This divergence points to a remarkable conclusion: even the most fundamental biological processes can exhibit evolutionary flexibility.
How Is This Even Possible?
There are several hypotheses for how DNA replication could have evolved more than once.
1. Independent Invention After the Origin of DNA Genomes
One hypothesis proposes that after LUCA, the DNA replication machinery was invented independently in Bacteria and in the common ancestor of Archaea and Eukaryotes. This explains why the core proteins differ so dramatically.
This idea builds on the RNA World Hypothesis, which proposes that early life used RNA as both genetic material and catalyst. Later, DNA evolved as a more stable alternative. If the switch to DNA happened independently in different lineages, it makes sense that each would evolve its own replication machinery to handle the new molecule.
2. The Viral Origin of DNA Hypothesis
Another proposal, known as the Viral Origin of DNA Hypothesis (Forterre, 2006; Villarreal, 2006), holds that DNA replication machinery first evolved in early DNA viruses. These viruses, which used DNA as their own genome, infected cells whose genomes were still made of RNA.
Host cells may later have adopted viral DNA and its replication enzymes, switching from RNA-based to DNA-based genomes. If different viral lineages infected the ancestors of Bacteria, Archaea, and Eukaryotes, this could explain the distinct DNA replication systems observed across the domains.
Conclusion: Functional Convergence at the Molecular Level
DNA replication is a stunning example of functional convergence, where molecular machinery of different evolutionary origins arrives at similar solutions to the same problem. But unlike wings, eyes, or fins, this convergence happened at the molecular level, deep in the cell’s most fundamental machinery.
It challenges the idea that all essential life processes must be inherited from a common ancestor. In this case, life reinvented one of its most vital processes – twice.
References:
Koonin, E. V., Wolf, Y. I. & Aravind, L. Comparative genomics of eukaryotes and prokaryotes: functional and evolutionary implications. Nucleic Acids Res. 28, 3117–3130 (2000).
Forterre, P. DNA before proteins? The viral hypothesis and the origin of DNA. Cell. Mol. Life Sci. 63, 365–381 (2006).
Image from iStock by Getty Images