The New York Times, January 18, 2021 (with Jonathan Corum)


At the heart of each coronavirus is its genome, a twisted strand of nearly 30,000 “letters” of RNA. These genetic instructions force infected human cells to assemble up to 29 kinds of proteins that help the coronavirus multiply and spread.

As viruses replicate, small copying errors known as mutations naturally arise in their genomes. A lineage of coronaviruses will typically accumulate one or two random mutations each month.

Some mutations have no effect on the coronavirus proteins made by the infected cell. Other mutations might alter a protein’s shape by changing or deleting one of its amino acids, the building blocks that link together to form the protein.

Through the process of natural selection, neutral or slightly beneficial mutations may be passed down from generation to generation, while harmful mutations are more likely to die out.

Mutations in the B.1.1.7 Lineage

A coronavirus variant first reported in Britain has 17 recent mutations that change or delete amino acids in viral proteins.

The variant was named Variant of Concern 202012/01 by Public Health England, and is part of the B.1.1.7 lineage of coronaviruses.

Notable mutations in the B.1.1.7 lineage are listed below. Six other mutations, not shown in the diagram above, do not change an amino acid.

Eight Spike Mutations

Researchers are most concerned about the eight B.1.1.7 mutations that change the shape of the coronavirus spike, which the virus uses to attach to cells and slip inside.

Each spike is a group of three intertwined proteins.

Building one of these spike proteins typically takes 1,273 amino acids, which can be written as letters:


Spike proteins in the B.1.1.7 lineage have two deletions and six substitutions in this sequence of amino acids.

Written as letters, a B.1.1.7 spike protein looks like this:


These mutations alter the shape of the spike protein by changing how the amino acids fold together into a complex shape.

The Spike N501Y Mutation

Scientists suspect that one mutation, called N501Y, is very important in making B.1.1.7 coronaviruses more contagious. The mutation’s name refers to the nature of its change: the 501st amino acid in the spike protein switched from N (asparagine) to Y (tyrosine).
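The naming convention can be made concrete with a short Python sketch. It is only an illustration: the function names and the toy protein are invented here, and the placeholder sequence is not the real spike protein.

```python
import re
from typing import NamedTuple


class Substitution(NamedTuple):
    original: str     # amino acid letter before the change, e.g. "N"
    position: int     # 1-based position in the protein, e.g. 501
    replacement: str  # amino acid letter after the change, e.g. "Y"


# One letter, a run of digits, one letter: the shape of names like N501Y.
_NAME_PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(name: str) -> Substitution:
    """Split a name such as 'N501Y' into its three parts."""
    match = _NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a simple substitution name: {name!r}")
    original, position, replacement = match.groups()
    return Substitution(original, int(position), replacement)


def apply_substitution(protein: str, sub: Substitution) -> str:
    """Return a copy of the protein with one amino acid swapped."""
    index = sub.position - 1  # names count from 1, Python counts from 0
    if protein[index] != sub.original:
        raise ValueError(
            f"expected {sub.original} at position {sub.position}, "
            f"found {protein[index]}"
        )
    return protein[:index] + sub.replacement + protein[index + 1:]


# Toy 1,273-letter protein: "A" everywhere except an "N" at position 501.
# This is a placeholder, NOT the real spike sequence.
toy_spike = "A" * 500 + "N" + "A" * 772
n501y = parse_substitution("N501Y")
mutated_spike = apply_substitution(toy_spike, n501y)
```

The same parser would read P681H as a switch from P to H at position 681. Deletions and stop mutations follow different naming patterns and are not covered by this sketch.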

The N501Y mutation changes an amino acid near the top of each spike protein, where it makes contact with a special receptor on human cells.

Because spike proteins form sets of three, the mutation appears in three places on the spike tip.

In a typical coronavirus, the tip of the spike protein is like an ill-fitting puzzle piece. It can latch onto human cells, but the fit is so loose that the virus often falls away and fails to infect the cell.

The N501Y mutation seems to refine the shape of the puzzle piece, allowing a tighter fit and increasing the chance of a successful infection.

Researchers think the N501Y mutation has evolved independently in many different coronavirus lineages. In addition to the B.1.1.7 lineage, it has been identified in variants from Australia, Brazil, Denmark, Japan, the Netherlands, South Africa, Wales, Illinois, Louisiana, Ohio and Texas.

In addition to N501Y, the B.1.1.7 lineage has 16 other mutations that might benefit the virus in other ways. It’s also possible that they might be neutral mutations, which have no effect one way or the other. They may simply be passed down from generation to generation like old baggage. Scientists are running experiments to find out which is the case for each mutation.

The Spike H69–V70 Deletion

One mysterious mutation in the B.1.1.7 lineage deletes the 69th and 70th amino acids in the spike protein. Experiments have shown that this deletion enables the coronavirus to infect cells more successfully. It’s possible that it changes the shape of the spike protein in a way that makes it harder for antibodies to attach.

Researchers call this a recurrent deletion region because the same part of the genome has been repeatedly deleted in different lineages of coronaviruses. The H69–V70 deletion also occurred in a variant that infected millions of mink in Denmark and other countries. Scientists are beginning to identify a number of these regions, which may play an important role in the virus’s future evolution.

The Spike Y144/145 Deletion

In another recurrent deletion region, a number of coronavirus lineages are missing either the 144th or 145th amino acid in the spike protein. The name of the mutation comes from the two tyrosines (Y) that are normally in those positions in the protein.

Like the H69–V70 deletion, Y144/145 occurs on the edge of the spike tip. It may also make it harder for antibodies to stick to the coronavirus.
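A small Python sketch shows one way to see why the deletion is named for both positions. Because positions 144 and 145 hold the same amino acid, removing either one produces an identical protein. The sequence below is a placeholder of "X" letters with only the four relevant positions filled in. It is not the real spike sequence, and the function name is invented for this example.

```python
from typing import Iterable

SPIKE_LENGTH = 1273  # typical spike protein length, in amino acids


def delete_positions(protein: str, positions: Iterable[int]) -> str:
    """Return the protein with the given 1-based positions removed."""
    to_remove = set(positions)
    return "".join(
        amino_acid
        for position, amino_acid in enumerate(protein, start=1)
        if position not in to_remove
    )


# Placeholder spike: "X" everywhere except the positions discussed above.
chain = ["X"] * SPIKE_LENGTH
chain[69 - 1] = "H"   # H at position 69
chain[70 - 1] = "V"   # V at position 70
chain[144 - 1] = "Y"  # tyrosine at position 144
chain[145 - 1] = "Y"  # tyrosine at position 145
reference_spike = "".join(chain)

# Remove H69 and V70, then one of the two tyrosines.
missing_144 = delete_positions(reference_spike, [69, 70, 144])
missing_145 = delete_positions(reference_spike, [69, 70, 145])

same_result = missing_144 == missing_145
shortened_length = len(missing_144)  # three amino acids fewer than 1,273
```

In this sketch the two deletions together remove three amino acids, so the shortened protein is 1,270 letters long whichever tyrosine is dropped.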

The Spike P681H Mutation

This mutation changes an amino acid from P to H on the stem of the coronavirus spike.

When spike proteins are assembled on the surface of a coronavirus, they’re not yet ready to attach to a cell. A human enzyme must first cut apart a section of the spike stem. The P681H mutation may make it easier for the enzyme to reach the site where it needs to make its cut.

Like N501Y, the P681H mutation has arisen in other coronavirus lineages besides B.1.1.7. But it’s rare for one lineage to carry both mutations.

The ORF8 Q27stop Mutation

ORF8 is a small protein whose function remains mysterious. In one experiment, scientists deleted the protein and found that the coronavirus could still spread. That suggests that ORF8 is not essential to replication, but it might still give some competitive edge over mutants that have lost the protein.

ORF8 is typically only 121 amino acids long.


But a B.1.1.7 mutation changes the 27th amino acid from Q to a genetic Stop sign.


When the infected cell builds the ORF8 protein, it stops at this mutation and leaves a stump only 26 amino acids long.

Researchers assume that this ORF8 stump cannot function. But if losing the protein leaves B.1.1.7 at a disadvantage, it’s possible that the advantages of another mutation like N501Y might make up for the loss.

Two other B.1.1.7 mutations appear in ORF8 after the stop point, changing R to I and Y to C.


Because the ORF8 protein is cut short, these two mutations may do nothing.
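The effect of the Stop sign can be sketched in a few lines of Python. The sequence is a placeholder of "X" letters, not the real ORF8 protein, and the function names are invented for this example. The article does not give the positions of the two later mutations, so the positions 52 and 73 used here should be read as illustrative.

```python
from typing import Dict, Optional

ORF8_LENGTH = 121   # typical ORF8 length, in amino acids
STOP_POSITION = 27  # Q27stop: a Stop sign replaces the 27th amino acid


def build_placeholder(length: int, residues: Dict[int, str]) -> str:
    """Make an 'X'-filled chain with chosen letters at 1-based positions."""
    chain = ["X"] * length
    for position, amino_acid in residues.items():
        chain[position - 1] = amino_acid
    return "".join(chain)


def substitute(protein: str, position: int, replacement: str) -> str:
    """Swap the amino acid at a 1-based position."""
    return protein[:position - 1] + replacement + protein[position:]


def build_until_stop(protein: str, stop_position: Optional[int]) -> str:
    """Return what the cell builds: everything before the Stop sign."""
    if stop_position is None:
        return protein
    return protein[:stop_position - 1]


# Placeholder ORF8 with Q at 27 and, as an assumption, R at 52 and Y at 73.
full_orf8 = build_placeholder(ORF8_LENGTH, {27: "Q", 52: "R", 73: "Y"})

# The stump built when only the Stop mutation is present.
stump = build_until_stop(full_orf8, STOP_POSITION)

# Add the two later substitutions (R to I, Y to C), then apply the Stop.
with_later_changes = substitute(substitute(full_orf8, 52, "I"), 73, "C")
stump_with_later_changes = build_until_stop(with_later_changes, STOP_POSITION)

later_changes_invisible = stump == stump_with_later_changes
```

In this toy model the stump is 26 letters long either way, because the cell never reaches the positions where the two later mutations sit.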

Detection and Spread

B.1.1.7 first came to light in the United Kingdom in late November. Researchers looked back at earlier samples and found that the first evidence dates back to Sept. 20, in a sample taken from a patient near London.

The B.1.1.7 lineage has now been detected in over 50 countries, including the United States. Britain has responded to the surge of B.1.1.7 with stringent lockdowns, and other countries have tried to prevent its spread with travel restrictions.

B.1.1.7 is estimated to be roughly 50 percent more transmissible than other variants. Federal health officials warn that it may become the dominant variant in the United States by March. It is no more deadly than other forms of the coronavirus. But because it can cause so many more infections, it may lead to many more deaths.
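A rough Python sketch illustrates why a more transmissible variant can lead to many more infections even if each infection is no deadlier. Every number in it is hypothetical: it assumes 100 starting cases, that each case infects 1.1 others on average, that "50 percent more transmissible" means multiplying that figure by 1.5, and that nothing else changes over 10 rounds of spread. It is not a forecast.

```python
def cases_after(initial_cases: float, infections_per_case: float,
                rounds: int) -> float:
    """New cases in the final round of simple, unchecked exponential spread."""
    return initial_cases * infections_per_case ** rounds


# All values below are illustrative assumptions, not estimates from the article.
INITIAL_CASES = 100
BASELINE_INFECTIONS_PER_CASE = 1.1
TRANSMISSIBILITY_BOOST = 1.5  # "roughly 50 percent more transmissible"
ROUNDS = 10

baseline_cases = cases_after(
    INITIAL_CASES, BASELINE_INFECTIONS_PER_CASE, ROUNDS
)
variant_cases = cases_after(
    INITIAL_CASES, BASELINE_INFECTIONS_PER_CASE * TRANSMISSIBILITY_BOOST, ROUNDS
)

# How many times more cases the variant produces in the final round.
ratio = variant_cases / baseline_cases

print(f"baseline: {baseline_cases:.0f} cases")
print(f"variant:  {variant_cases:.0f} cases")
print(f"ratio:    {ratio:.1f}x")
```

Under these assumptions the gap compounds with every round of spread, so the ratio between the two scenarios keeps widening the longer the variant circulates.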

B.1.1.7 has been detected in at least 14 states, but the United States has no national surveillance program for determining the full extent of its spread.

How Did the Variant Evolve?

A number of researchers suspect that B.1.1.7 gained many of its mutations within a single person. People with weakened immune systems can remain infected with replicating coronaviruses for several months, allowing the virus to accumulate many extra mutations.

When these patients are treated with convalescent plasma, which contains coronavirus antibodies, natural selection may favor viruses with mutations that let them escape the attack. Once the B.1.1.7 lineage evolved its battery of mutations, it may have been able to spread faster from person to person.

Other Mutations in Circulation

One of the first mutations that raised concerns among scientists is known as D614G. It emerged in China early in the pandemic and may have helped the virus spread more easily. In many countries, the D614G lineage came to dominate the population of coronaviruses. B.1.1.7 descends from the D614G lineage.

A more recent variant detected in South Africa quickly spread to several other countries. It is known as 501Y.V2 and is part of the B.1.351 lineage. This variant has eight mutations that change amino acids in the spike protein. Among these mutations is N501Y, which helps the spike latch on more tightly to human cells.

None of these variants are expected to help the coronavirus evade the many coronavirus vaccines in clinical trials around the world. Antibodies generated by the Pfizer-BioNTech vaccine were able to lock on to coronavirus spikes that have the N501Y spike mutation, preventing the virus from infecting cells in the lab.

Experts stress that it would likely take many years, and many more mutations, for the virus to evolve enough to avoid current vaccines.

Copyright 2021 The New York Times Company. Reprinted with permission.