Trace your genealogy back 25 million years, and you’ll meet long-tailed monkey-like primates living in trees. Those primates were not just the ancestors of ourselves, but of all the other apes–chimpanzees, bonobos, gorillas, orangutans, and gibbons–along with the monkeys of the Eastern Hemisphere, such as baboons and langurs. By comparing ourselves to these other primates, scientists can get clues to our evolution over the past 25 million years. Until now, most of those clues have come from fossils and studies on the behavior and physiology of apes and monkeys. But in the past few years scientists have begun to pore over a new record: the one that is inscribed in our genome and the genomes of other apes and monkeys.

The first draft of the human genome was published in 2000, and in 2005 came the genome of the chimpanzee–our closest living relative. Scientists compared the two genomes to get a sense of what the genome of our common ancestor looked like, and how the genomes of both species have changed over the past few million years. (I wrote about the first wave of chimp/human studies here). One of the biggest surprises came when one team of researchers concluded that the ancestors of chimpanzees and humans interbred for over a million years, producing hybrid humanzees.

But there’s a limit to how much you can learn from just two genomes. If you find two versions of a gene that are nearly–but not quite–identical in humans and chimpanzees, it’s hard to know for sure how that difference evolved. Imagine, for the sake of brevity, that the human version of a gene is AAAT, and the chimpanzee version is AAAC. (Real genes are hundreds or thousands of nucleotides long.) It’s possible that the ancestor of humans and chimps had the AAAC version of the gene, and in humans the C mutated to T. But it’s also possible that humans have the ancestral version, and in chimps the T flipped to C. It’s even possible that the ancestral version was neither. It might have started out as AAAA, and in humans the final A became T and in chimps A became C.

The way through this impasse is to compare chimpanzees and humans to a third species–ideally another primate. But it was not until today that scientists had a third primate genome to study. Now they have all the DNA from a macaque.

There are 22 species of macaque in the world, their natural ranges reaching as far west as Gibraltar and as far east as Japan. They’re tough, adaptable monkeys that can be found living in cities and temples. Scientists have long studied macaques to learn things about ourselves. The Rh factor in blood types is short for Rhesus factor, discovered in rhesus macaques. It was the macaque’s special role in science that put its genome near the top of the list to be sequenced. An inventory of every gene in the macaque genome would make the monkeys even more useful models for human biology.

At the same time, the macaque genome promised to bring human evolution into sharper focus. Humans, chimpanzees, and macaques share an ancestor that lived 25 million years ago. Imagine that you discover that the gene I just mentioned is AAAC in macaques. The simplest explanation for the three versions of the gene is that the ancestor had AAAC, which macaques and chimpanzees inherited. Only in humans did it flip to AAAT. Now imagine that you can make this sort of judgment on all of the roughly 18,000 genes in the human genome.
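
The outgroup reasoning above can be sketched in a few lines of Python. This is only a toy illustration under simplifying assumptions: the sequences are already aligned, each site is judged on its own, and the simplest explanation (the fewest changes) is taken to be the right one. The function names `infer_ancestor` and `changes_by_lineage` are made up for this example and do not come from the macaque papers.

```python
def infer_ancestor(human: str, chimp: str, outgroup: str) -> str:
    """Guess the human-chimp ancestral sequence one site at a time.

    Simple parsimony rule: if human and chimp agree, keep that base;
    if they differ, trust whichever one matches the outgroup; if all
    three differ, the site stays unresolved and is marked '?'.
    """
    if not (len(human) == len(chimp) == len(outgroup)):
        raise ValueError("sequences must be aligned to the same length")
    ancestor = []
    for h, c, o in zip(human, chimp, outgroup):
        if h == c:
            ancestor.append(h)      # no disagreement to resolve
        elif o in (h, c):
            ancestor.append(o)      # the outgroup breaks the tie
        else:
            ancestor.append("?")    # three different bases: cannot tell
    return "".join(ancestor)


def changes_by_lineage(human: str, chimp: str, outgroup: str) -> dict:
    """List (position, ancestral base, new base) for each lineage."""
    ancestor = infer_ancestor(human, chimp, outgroup)
    changes = {"human": [], "chimp": []}
    for pos, (a, h, c) in enumerate(zip(ancestor, human, chimp)):
        if a == "?":
            continue                # skip unresolved sites
        if h != a:
            changes["human"].append((pos, a, h))
        if c != a:
            changes["chimp"].append((pos, a, c))
    return changes


# The example from the text: human AAAT, chimp AAAC, macaque AAAC.
print(infer_ancestor("AAAT", "AAAC", "AAAC"))
print(changes_by_lineage("AAAT", "AAAC", "AAAC"))
```

With the sequences from the text, the sketch reconstructs AAAC as the ancestor and reports a single C-to-T change on the human branch. If the macaque instead carried AAAA, the last site would come back as "?", because a third genome cannot settle a three-way disagreement.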

The macaque genome team has published three papers in the journal Science, along with a dedicated web site. The papers are the latest in a long series of papers that show how intimately intertwined evolutionary biology and medical research have become (despite unfounded claims to the contrary). You just need to look at the title of the lead paper: “Evolutionary and biomedical insights from the rhesus macaque genome.”

The scientists examine a lot of biology in the papers, but four topics really jump out: ancestral genes involved in diseases, fast-evolving genes, the origin of new genes, and the spread of genome parasites. Below the fold, I’ll hit them one at a time…

1. Ancestral genes and diseases. Some people carry versions of genes (known as alleles) that either cause diseases or predispose them to getting sick. When scientists studied the chimpanzee genome, they discovered that in some cases chimpanzees carry the human disease allele without suffering the human disease. The macaque genome team expanded this search, looking for matches in chimp and macaque genomes to every known human disease allele. They discovered 229 cases in which the disease allele turns out to be the ancestral version. Some of these alleles are quite nasty. Some cause severe mental retardation. Others cause potentially fatal defects in metabolism. Healthy people, for example, convert a chemical called phenylalanine into another one called tyrosine in the process of building proteins. But a genetic defect will stop this transformation, causing phenylalanine to build up to dangerous levels–a condition called phenylketonuria. Macaques have the phenylketonuria gene. And yet they are not poisoned by phenylalanine.

This paradox is not so paradoxical if you bear in mind that no gene works alone. Genes work in networks with dozens of other genes. These entire networks evolve over time, as natural selection fine-tunes the genes to work together successfully. But if conditions change, alleles that worked well in those networks may start to work badly, and new versions of the genes will be favored by natural selection. In the case of 229 genes, it appears, we are in that awkward intermediate stage, with obsolete alleles still in circulation.

2. Fast-evolving genes. The genomes of humans and macaques are, on average, 93 percent identical. About eleven percent of their genes produce identical proteins. Others differ by just a few amino acids, and others by dozens. Natural selection could have driven some of those differences, but so could luck. Even neutral mutations sometimes become widespread through nothing more than chance. Comparing genomes helps scientists tell these changes apart.

By reconstructing the ancestral genome of macaques, chimpanzees, and humans, scientists can tally up the changes along each branch. They can identify mutations that cause significant changes to the structure of proteins, as well as mutations that leave the protein unchanged. These techniques even let scientists distinguish genes that have experienced strong selection from ones that have experienced weak selection. Previous studies on chimpanzees and humans identified 35 genes that experienced strong selection. Adding the macaque genome into the analysis sharpens up the picture considerably. The macaque genome team has identified 178 genes.
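
To make the distinction between the two kinds of mutation concrete, here is a rough Python sketch. It assumes the standard genetic code, and it assumes the ancestral and present-day sequences are aligned, in frame, and free of gaps. It counts changed codons rather than individual mutations. The function name `classify_codon_changes` is invented for this example. Real analyses of selection use statistical models of sequence evolution, so treat this as an illustration of the idea, not the macaque team's method.

```python
from itertools import product

# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"   # codons starting with T
    "LLLLPPPPHHQQRRRR"   # codons starting with C
    "IIIMTTTTNNKKSSRR"   # codons starting with A
    "VVVVAAAADDEEGGGG"   # codons starting with G
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_changes(ancestral: str, derived: str) -> dict:
    """Count codons that changed with and without altering the protein.

    'synonymous' codons differ in DNA but encode the same amino acid;
    'nonsynonymous' codons encode a different amino acid (or a stop).
    """
    if len(ancestral) != len(derived):
        raise ValueError("sequences must be aligned to the same length")
    counts = {"synonymous": 0, "nonsynonymous": 0}
    usable = len(ancestral) - len(ancestral) % 3   # ignore a trailing partial codon
    for i in range(0, usable, 3):
        old, new = ancestral[i:i + 3], derived[i:i + 3]
        if old == new:
            continue
        if CODON_TABLE[old] == CODON_TABLE[new]:
            counts["synonymous"] += 1
        else:
            counts["nonsynonymous"] += 1
    return counts


# Made-up sequences: CTT->CTC and AAA->AAG keep the amino acid, GGT->GAT does not.
print(classify_codon_changes("CTTAAAGGT", "CTCAAGGAT"))
```

A gene in which protein-altering changes pile up much faster than silent ones along a branch is a candidate for strong selection. Pinning that down takes more than a raw count, which is where the statistical models come in.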

Many of these fast-evolving genes help build the immune system. That’s not too surprising, given the grave threat of disease and the swift evolution of parasites. Among the surprise entries are two genes that help build hair. Apes and Old World monkeys might have undergone strong selection on these genes as they adapted to changes in climate or to attract mates with a handsome coat. And some of these genes are best known for making proteins in cancer cells. In order to explain that particularly weird finding, I have to jump into the next topic…

3. The origin of new genes. Comparing genes from one species to another can cause serious headaches. That’s because it’s hard to find a clean, one-to-one correspondence. Many genes in the human genome belong to gene families–groups of dozens, even hundreds of genes that have very similar sequences. Other primates have gene families as well, but some of their families have more genes than ours, and some have fewer. The new macaque genome study does a great job of demonstrating how this confusing situation came to be. Over the course of millions of years, genes get accidentally duplicated. Some copies are later lost, and some evolve into new forms.

The macaque genome team took a close look at a particularly interesting family of genes called PRAME genes. Humans have dozens of PRAME genes, with some people having more copies than others. Scientists aren’t exactly sure what PRAME genes do, but it seems they have a role in producing sperm, judging from the fact that they normally only make proteins in testes. But PRAME genes also have a dark side: they often switch on in cancer cells. There are actually many genes that play this dual role, so many in fact that they have their own name: cancer-testis genes. There’s something very useful to cancer cells in the genes used to build sperm–possibly their ability to grow and divide quickly. (For more on this connection, see my article in the January issue of Scientific American on cancer and evolution.)

The macaque team used the monkey’s genome to trace the evolution of PRAME genes. Their results are summed up nicely in this picture. Each line represents a gene, and the three clusters of genes belong, from top to bottom, to macaques, chimps, and humans.

The PRAME gene family started with a single gene which was accidentally duplicated. The two new genes then became four, and four became eight. These three rounds of duplication had already taken place before the ancestors of macaques, humans, and chimps diverged 25 million years ago. All eight kinds of PRAME genes can be found in their genomes. But evolution did not then grind to a halt. In the new primate lineages, some PRAME genes disappeared thanks to mutations that snipped them out of genomes. In other cases, PRAME genes were duplicated yet again, expanding the family.

But these new genes were not just extra copies of old ones. They acquired mutations to their sequences that changed the shape of the proteins they made. In some cases, the selection acting on these genes was intense. It’s possible that as these genes evolved for their function in making sperm, they became more effective as cancer genes. Understanding that evolution may help scientists understand how cancer cells benefit from them.

4. Genomic parasites. Comparing primate genomes shows that primates have been particularly prone to picking up duplicated genes. One reason for this may be that our genomes are overrun with self-replicating segments of DNA called mobile elements. When they copy themselves, they sometimes copy ordinary genes as well.

Mobile elements are a motley collection of weird chunks of DNA that take up about half of the human genome. Many of them got their start from viruses that invaded ancient primates. Known as endogenous retroviruses, they are related to HIV. But while HIV jumps from one host to another, endogenous retroviruses fused with a host genome and were passed down from one generation to the next. Their DNA mutated over time, but in some cases they still retained the ability to make copies of themselves that were then inserted back into the genome. Other mobile elements cannot replicate themselves but depend instead on these more viral elements to do the work for them. Parasites of parasites, as it were.

Mobile elements are particularly tough to survey. Protein-coding genes usually have some very clear markers–instructions that tell the cell when to stop copying them, for example–that computers can search for. Mobile elements, on the other hand, can become obscured by mutations. By comparing several different versions of the same mobile element, scientists can recognize them more easily. The macaque genome allows scientists to do exactly that. It turns out that the ancestor of macaques, chimps, and humans had already been infected by seven different endogenous retroviruses. After the three branches split, the viruses made new copies of themselves. In the human lineage, only two viruses at most have infected us, and both have become trapped in our genome. (Last fall I wrote about how researchers resurrected one of them.) Macaques, by contrast, have been infected by eight viruses since their ancestors split off from our own. I wonder if that difference reflects a difference in how many viruses each species faces. And given that we got HIV from chimpanzees and monkeys, it would help to understand the relationship between primates and their viruses better.

Mobile elements are also important medically because when they make new copies of themselves, they can wreak havoc in the genome. They can pick up parts of genes or entire genes and move them to other places in the genome. Genes that are normally kept silent can become active; they can respond to new signals. Scientists have linked 118 genetic disorders to mobile elements, including hemophilia, breast cancer, and muscular dystrophy. The macaque genome team points out that understanding their evolution in humans and macaques will make the monkeys better models for these diseases. But they also observe that mobile elements can be a creative force as well. They create new copies of genes, which can take on new functions. They can insert parts of genes into other genes, in a kind of natural genetic engineering that can generate new kinds of proteins. Once again, medicine and evolution prove intertwined. Thus the macaque genome helps us better understand the mysterious encyclopedia of DNA we carry inside us.

(For more information check out an interactive site Science has set up here.)

[Macaque image courtesy of the Southwest National Primate Research Center at Southwest Foundation for Biomedical Research in San Antonio] 

Originally published April 12, 2007. Copyright 2007 Carl Zimmer.