Scientists are probably centuries away from drawing the full tree of life. For one thing, they have only discovered a small fraction of the species on Earth–perhaps only ten percent. They are also grappling with the relationships between the species they have discovered. Systematists (scientists who study the tree of life) rely mainly on DNA these days to figure out how species are related to one another. They compare the similarities and differences in a given gene in several different species to figure out which ones share the closest kinship. But they have actually sequenced DNA from relatively few species. And in many cases, that DNA may come from a single gene.
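To make the gene-comparison idea concrete, here is a minimal sketch in Python. It assumes the sequences have already been aligned to the same length, and it uses the simplest possible measure, the fraction of positions at which two sequences differ. The species labels and the ten-letter "gene" are made up for illustration. Real analyses use far longer sequences and statistical models of how DNA changes.

```python
from itertools import combinations

def p_distance(a, b):
    """Fraction of aligned positions at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)

def closest_pair(seqs):
    """Return the pair of names whose sequences differ the least."""
    return min(
        combinations(sorted(seqs), 2),
        key=lambda pair: p_distance(seqs[pair[0]], seqs[pair[1]]),
    )

# Hypothetical toy data: one short, pre-aligned gene in four species.
toy_gene = {
    "human": "ATGCCGTAAC",
    "chimp": "ATGCCGTAAT",
    "mouse": "ATGACGTTAT",
    "yeast": "TTGACCTTGT",
}

pair = closest_pair(toy_gene)
```

On this toy data the closest pair is the chimp and the human, which differ at a single position. A raw count like this is only a starting point, since it treats every difference as equally informative.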
Systematists have made good use of this scanty data. They’ve been able to sort out relationships of many big groups of species, from mammals to plants–groups that sparked debate among systematists for decades. But these are just small tufts on the complete tree of life. The big picture has proven harder to pull into sharp focus.
When systematists try to make sense of billions of years of evolution, they must struggle with many foes. The DNA they study, for example, may send them a misleading signal. Some of the most common mutations are known as point mutations, because they change DNA at one point–changing a single base of DNA to another. Since there are only four letters in the alphabet of DNA, it’s not surprising that over billions of years two lineages may acquire the same letter at the same position in the same gene. These two independent mutations may well give the illusion that two lineages share a close common ancestry.
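A rough way to see why a four-letter alphabet invites coincidences: suppose the same position mutates independently in two lineages, and suppose (a simplifying assumption, not something real DNA obeys) that each of the three other bases is equally likely. The short Python sketch below just counts the possible outcomes.

```python
from fractions import Fraction
from itertools import product

BASES = "ACGT"

def chance_of_matching_mutation(ancestral):
    """Chance that two independent point mutations at one site land on
    the same new base, assuming all three alternatives are equally likely."""
    others = [b for b in BASES if b != ancestral]
    outcomes = list(product(others, repeat=2))  # (lineage 1, lineage 2)
    matches = sum(1 for x, y in outcomes if x == y)
    return Fraction(matches, len(outcomes))

chance = chance_of_matching_mutation("A")
```

Under that assumption, three of the nine possible outcomes match, so the chance is one in three. Given that a site has changed in both lineages, a false resemblance is far from rare.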
Genes create another challenge when they jump from one species to another. For decades scientists have known that microbes can swap genes, primarily with the help of viruses that sometimes move between species, carrying some host DNA with their own. At first this hopping seemed like a rare fluke. Then some systematists argued that gene swapping was so common that life’s history might be better represented by a web than a tree. Most experts disagree: they argue that this gene-swapping does not destroy the quest for the tree of life. It creates vines draped between the branches of the tree of life, but the branches of the tree are still visible.
It’s been hard to resolve this debate because until now most scientists have analyzed the tree of life by looking at just one gene in a number of species, or, in rare cases, a few genes. Fortunately, scientists now have entire genomes of a couple hundred species to analyze. In the new issue of Science, biologists at the European Molecular Biology Laboratory published the latest, most thorough glimpse at the tree of life.
It’s quite something to behold. I’ve posted a reduced version of the tree on this page, and you can get a closer look here, at a site dedicated to the project. To orient yourself, our species is at about two o’clock, next to the chimp, rat, and mouse.
The scientists took advantage of the fact that so many genomes have been sequenced over the past decade, and that it’s now possible to compare the DNA in different genomes relatively quickly (if you have a supercomputer, of course). Their strategy was to search for all the genes that could offer the clearest clues to the tree of life–genes that had not been swapped too much between species, for example. They searched the genomes of 191 species of animals, plants, fungi, protozoans, bacteria, and archaea (microbes that look superficially a lot like bacteria). They selected 36 universal genes, but then tossed out five of them because they appear to have been swapped.
This tree emerged from their analysis of the remaining 31 genes. The scientists kicked the tires, as it were, by running the tree through a series of statistical tests. Did the same pattern of branches emerge if they left out some species? What happened if they left out one gene or another from the analysis? Two-thirds of the branches turn out to have 100% support from these tests, and many of the others, while not so perfect, are still statistically robust. So this study suggests that gene-swapping does not end the quest for the true tree of life. (Other scientists came to a similar conclusion last year, which I wrote about here.)
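The "leave one gene out" idea can be sketched in a few lines of Python. This is a toy stand-in, not the study's actual procedure: instead of rebuilding a whole tree, it asks only which two species look most alike when differences are summed across genes, then repeats the question with each gene removed in turn. The three species (A, B, C) and their sequences are invented, and the third gene is deliberately misleading.

```python
from itertools import combinations

def differences(a, b):
    """Number of aligned positions at which two sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def closest_pair(genes, species):
    """Pair of species with the fewest differences summed over all genes."""
    def total(pair):
        return sum(differences(g[pair[0]], g[pair[1]]) for g in genes)
    return min(combinations(sorted(species), 2), key=total)

def leave_one_out_support(genes, species):
    """Return the full-data answer, how many leave-one-gene-out
    replicates agree with it, and the number of replicates."""
    full = closest_pair(genes, species)
    agree = sum(
        closest_pair(genes[:i] + genes[i + 1:], species) == full
        for i in range(len(genes))
    )
    return full, agree, len(genes)

# Hypothetical toy data: three pre-aligned genes in three species.
toy_genes = [
    {"A": "ACGTAC", "B": "ACGTAC", "C": "TGGTCA"},
    {"A": "GGCATT", "B": "GGCATT", "C": "GACTTT"},
    {"A": "TTAGCA", "B": "CTTGGA", "C": "CTTGGA"},  # misleading gene
]

result = leave_one_out_support(toy_genes, ["A", "B", "C"])
```

With all three genes, A and B come out as the closest pair, and two of the three replicates agree. Dropping the first gene lets the misleading one tip the answer toward B and C. A grouping that survives every such replicate is the kind that earns 100% support.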
Here’s a quick tour of the tree. Start at the middle of the circle. The central point represents the last common ancestor of all living things on Earth. The tree sprouts three deep branches, which between them contain all the species the scientists studied. These deep branches first came to light in the 1970s, and are known as domains. We belong to the red domain of Eukaryota, along with plants, fungi, and protozoans. Bacteria (blue) and Archaea (green) make up the other two domains.
These lineages probably split very early in the history of life. Fossils of bacteria that look much like living bacteria turn up at least 3.4 billion years ago. Just a few lineages became multicellular much later, with some algae getting macroscopic about two billion years ago.
The length of the branches on this tree represents so-called genetic distance. The longer the branch, the more substitutions have accumulated in its genes. Since these genomes all come from living species, the branches all span the same period of time. The fact that some branches are long and some are short means that some lineages have evolved more than others. Many forces can stretch out genetic distance. A species may reproduce fast, or it may have a life that makes it prone to acquiring more mutations. The slash in the Bacteria branch represents a segment that the scientists left out to make the full tree easier to see.
The long length of the Bacteria branch underscores one of the big messages of this tree: the diversity we can see with the naked eye reflects a pretty paltry snippet of life’s genetic diversity. Humans and mushrooms are tucked into a small part of the tree. Meanwhile, bacteria such as the ones that cause strep throat (Streptococcus) and the ones that cause food poisoning (Salmonella) are divided by vast evolutionary gulfs. The diversity of microbes did not stop evolving billions of years ago. Escherichia coli, for example, emerged relatively recently, specializing on the warm guts of mammals and birds.
As the scientists point out, this tree challenges the traditional way that biologists classify living things into species, genus, family, class, phylum, and kingdom. Scientists named many of these groups in the eighteenth and nineteenth centuries, when they could only sort species by what they could see with the naked eye or a crude microscope. But there’s a vast amount of hidden biochemical diversity in living things, and that diversity is reflected in this new tree. The scientists compared the genetic distance among different groups. Animals in different phyla are separated by much less genetic distance than bacteria that are in the same phylum. If scientists were classifying life from scratch today based on genetic distance, they’d probably downgrade animal phyla to classes.
This discovery does not sit well with claims from creationists that evolution cannot account for the emergence of animal phyla. It is certainly true that the earliest fossils of several animal phyla emerge over a span of perhaps thirty million years around the beginning of the Cambrian period, 540 million years ago. But animal phyla are, in a sense, overrated. This new tree of life supports a growing consensus that relatively small genetic changes in animal evolution led to big changes in their bodies. (If you want to read a whole book on this, check out Endless Forms Most Beautiful.)
This tree supports some findings from other recent studies. Mushrooms are more closely related to us than they are to plants, for example. But it will also make some scientists unhappy in other ways. There’s a big debate going on these days about the animal kingdom. Some researchers think that arthropods and nematodes belong to a “moulting” group. But this tree suggests that arthropods are more closely related to us vertebrates. The authors of the new study acknowledge that their tree may be unreliable in this respect. That’s because the animal genomes that have been sequenced may not belong to the best species to include in this sort of study. Neither the fruit fly Drosophila melanogaster nor the mouse Mus musculus had its genome sequenced so that it could be put in the tree of life. These animals were chosen because scientists had studied their genes and physiology for decades and would be able to make good use of their genomes in their research. It would help enormously if scientists could get the genomes of other animals that belong to different branches of the animal kingdom, such as ragworms and other obscure critters.
Fortunately, scientists should be able to add these species to this tree very quickly. In the past, scientists have had to do a lot of their tree-building by hand, lining up genes, identifying cases of gene-swapping, and so on. But as the European scientists built this new tree, they were able to set up an automated pipeline. As new genomes are published, it will be possible to let a computer automatically compare them to older sequences and generate a new tree that does a better job of explaining all the evidence. None of us may live to see the full tree of life emerge, but at least we may be able to savor a better sneak preview.
Update, 3/5 9:45 am: Rhasgobel has put together a useful list of translations of the Latin names on the tree.
Originally published March 3, 2006. Copyright 2006 Carl Zimmer.