Wired, January 17, 2013
On September 19, 2011, Evan Snitkin sat staring at a computer monitor, its screen cluttered with Perl script and row after row of 0s sprinkled with the occasional 1. To Snitkin, a bioinformatician at the National Institutes of Health, it read like a medical thriller. In this raw genetic-sequencing data, he could see the hidden history of a deadly outbreak that was raging just a few hundred yards from where he sat.
Snitkin was, in a sense, a medical historian: a genetic epidemiologist who traced the paths of disease outbreaks. But now, for the first time in history, he was trying to use his genetic toolkit to reroute an outbreak while it was in progress—and before it turned disastrous. A few weeks earlier, a handful of patients at the NIH Clinical Center, a 243-bed research hospital on the NIH campus in Bethesda, Maryland, had been hit by a vicious strain of bacteria known as KPC. Shorthand for carbapenem-resistant Klebsiella pneumoniae, KPC can hitch a ride on healthy people, setting up residence on their skin. From them it can spread to people with weak defenses—like hospital patients—and bloom into an overwhelming infection that spreads via the bloodstream into the whole body, swiftly shutting down one organ after another. In the past decade, KPC has evolved the ability to withstand every known antibiotic. As a result, roughly half of people who develop an active infection of KPC will die.
This nasty bacterium had arrived in the Clinical Center for the first time in June 2011. A woman was being transferred from a New York hospital, and her admitting nurse noted in her medical history that she was colonized with the bacteria. On her arrival, doctors kept her isolated from other patients to prevent the KPC from spreading, and on July 15 she was discharged. But on August 5 another patient—a man who had been in the hospital for many weeks—tested positive for KPC. And then, roughly every week after that, another new infection cropped up.
As the cases mounted, the Clinical Center gathered as much epidemiological data as it could to figure out the nature of the outbreak. But there wasn’t enough information to create a clear picture. The most befuddling part was the second patient, who became ill long after the first patient had left the hospital—an uncommon onset for this particular bacterium. This second patient died under the ravages of KPC, and there was little that hospital staff could do for the sick.
Such illnesses are becoming more common worldwide. With each passing year, the problem of superbugs—bacteria such as Klebsiella that have evolved resistance to all, or nearly all, antibiotics available—has grown progressively more dire. Gone are the days when pharmaceutical companies could roll out generation after generation of new medications to replace those that bacteria had already surmounted. Such drugs have become much harder to find; and even when they are found, the market for them is far less lucrative than for molecules that combat such high-profile killers as cancer or AIDS. As a result, the flow through the antibiotics pipeline has slowed to a trickle. From 1983 through 1987, the FDA approved 16 new systemic antibiotics; from 2008 through 2011, it approved just two. Rather than administer some new wonder drug, then, the Clinical Center could only quarantine these KPC-positive patients and give them harsh drugs like colistin, an antibiotic so toxic it was all but abandoned in the 1970s. An estimated 90,000 people die every year from infections they acquire in US hospitals—more than the number that die from Alzheimer’s, diabetes, or influenza.
In late August, as word of the outbreak circulated among the NIH staff, Snitkin and his boss, Julie Segre, approached the Clinical Center with an unusual offer. In their jobs at the NIH’s National Human Genome Research Institute, the two scientists had previously sequenced genomes from a bacterial outbreak long after it had died out. But sequencing technology had since become fast and cheap. Why not analyze the bacteria in the middle of an outbreak? By tracking the bug’s transmission route through the hospital, they might be able to isolate it and stop its lethal spread. They put this question to the center’s top brass, who immediately accepted their offer. “It was a no-brainer,” says Tara Palmore, the center’s deputy epidemiologist, who headed up its fight against KPC.
It took nearly a month to retrieve the first results, and now Snitkin was finally navigating through millions of base pairs on his computer screen. The 0s he saw were bases of DNA that were identical in every KPC microbe he was studying. The 1s represented mutations in each microbe not found in the others. By comparing the mutations, Snitkin could see how the bacteria were related to one another. If the history of public health has until now been embodied by the map—as in British physician John Snow’s famous map, which allowed him to curb the London cholera outbreak of 1854 and to found, in doing so, the modern field of epidemiology—Snitkin was embarking on a new kind of epidemiology: one founded on the phylogenetic tree.
And the tree that Snitkin drew was profoundly different from what Palmore had expected. “They showed me the results,” she says, “and I was speechless.”
Evan Snitkin, 31, is a tall, soft-spoken New Yorker with close-cropped hair. When he talks about his stint fighting the hospital outbreak, he strikes the slightly disoriented tone of a country doctor describing his deployment to the front lines of a war. Snitkin, after all, isn’t a physician; he isn’t even a microbiologist. Fundamentally he’s a computer programmer, not just by profession but by training. After finishing his undergraduate degree in computer science at SUNY Binghamton, Snitkin went to Boston University for a PhD in bioinformatics, the science of writing algorithms that parse biological data sets. In his graduate work, he toiled over puzzles far from the hospital bedside. His dissertation, for example, examined how genes work together in metabolic processes; he developed algorithms that could analyze experiments on yeast genes and transform their results into maps of their Facebook-like networks.
Although Snitkin found this research intellectually satisfying, he wanted to apply his programming skills to more viscerally important questions. And so he left Boston University for Maryland, where he took a job with Julie Segre at the National Human Genome Research Institute. While finishing a PhD in genetics in the 1990s, Segre worked on the earliest stages of the Human Genome Project before moving to NHGRI. At first her research there focused on the genes that help build skin; after a few years, her interests shifted to the bacteria that live on human skin, both the species that promote health and the invaders that make us sick.
In 2008 Segre reached out to the staff at the Clinical Center, which sits just a stone’s throw from NHGRI at the heart of the NIH campus just inside the Beltway, to find out whether any gene-sequencing projects might be useful to them. They replied that they wanted to see the DNA of the microbes causing hospital outbreaks. In 2007, for example, a resistant strain of a bacterium called Acinetobacter baumannii had swept into the Clinical Center, infecting 18 people and killing five. Yet the doctors at the hospital knew very little about the genetic makeup of the microbe or how it had gotten into the hospital and spread among the wards.
So when Snitkin arrived at the institute, Segre appointed him as a kind of genomic historian: She asked him to sequence Acinetobacter genomes from the outbreak to see what they could learn from them. Snitkin threw himself into the job with an enthusiasm that surprised even himself. He began to spend much of his time in the clinical microbiology department of the hospital, learning how to thaw out Acinetobacter and cultivate it in a flask—and how to be sure he wasn’t growing bacteria that had come from his own fingers.
Three years later, as KPC gained its foothold in the Clinical Center, Snitkin and Segre saw their chance to act. “We already had all the relationships and the technology in place,” Segre says. But they needed to learn how to sequence with speed. As medical historians, Segre and Snitkin could experiment with new techniques for analyzing genomes, even if those techniques didn’t end up yielding much useful information. Now they had to retool their whole approach. No longer were they sequencing just for intellectual interest; it had become a question of life and death.
The first step was to pick the right machine. The National Institutes of Health has an in-house facility for decoding genomes: the NIH Intramural Sequencing Center, or NISC for short. A few miles from the NIH campus, occupying most of a floor in a serenely bland office building, NISC encompasses decades of gene-sequencing technology, from the old-school Sanger sequencers that helped decode the human genome in the 1990s to an Illumina HiSeq 2000, which is the size of a refrigerator and can sequence 2 billion bases per hour.
Snitkin and Segre needed speed, but they couldn’t sacrifice accuracy. Since the bacteria they were going to sequence came from four different patients, the two geneticists could, in theory, chart the path of contagion based on subtle mutations in the microbes. But to see those mutations, Snitkin and Segre would need impeccable accuracy. They couldn’t tolerate even a few mistakes in their data. The best machine for that job, they decided, was a device from 454 Life Sciences, a subsidiary of Roche. NISC’s scientists had been using their 454 machine for three years; while the newer machines might eventually be faster or cheaper, they weren’t yet as reliable.
The 454 machine works by sequencing fragments of DNA, which can then be pieced together to form an entire genome. Scientists first pull apart the twin strands of a DNA fragment and allow the machine to use each as a template for building a new strand, one nucleotide at a time. The 454 machine plugs one type of base—say, guanine—into the template and waits to see if it attaches to a specific position on the DNA. If it does, the reaction produces a flash of energy. If the DNA doesn’t give off a flash, the machine knows that guanine is the wrong base for that position. The 454 machine switches from base to base until it finally finds one that delivers a flash. Then it moves on to the next position in the DNA. Recording the flashes from hundreds of thousands of fragments at a time, a 454 machine can sequence millions of bases in an hour.
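The base-by-base cycle described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the instrument's actual software: the flow order `TACG` and the function name `read_by_flashes` are invented for illustration, and the sketch follows the article's simplified one-position-at-a-time account (a real instrument is understood to read runs of identical bases in a single flow).

```python
# Toy model of flash-based sequencing by synthesis.
# Assumptions: FLOW_ORDER and read_by_flashes are hypothetical names, and each
# template position is resolved separately, as in the simplified description.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
FLOW_ORDER = "TACG"  # assumed cyclic order in which bases are offered


def read_by_flashes(template):
    """Build the complementary strand, counting how many bases were offered."""
    new_strand = []
    flows = 0
    for base in template:
        needed = COMPLEMENT[base]
        while True:
            candidate = FLOW_ORDER[flows % len(FLOW_ORDER)]
            flows += 1
            if candidate == needed:
                # The offered base attaches: this is the "flash".
                new_strand.append(candidate)
                break
            # No flash: the offered base was wrong, so try the next one.
    return "".join(new_strand), flows


if __name__ == "__main__":
    strand, flows = read_by_flashes("ATGC")
    print(strand, flows)
```

The flow count shows why read speed depends on the sequence itself: a template that happens to match the flow order needs one flow per base, while a repetitive one can need several.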
Once the 454 machine has determined the sequence of all the fragments, an assembler—software running on a desktop computer—has to work a kind of biological jigsaw puzzle to reconstruct longer sequences. The fragments come from overlapping regions of the genome, so the sequence at the end of one fragment matches the beginning of another. Using statistical methods, the assembler can link these two fragments into a longer stretch of DNA and then look for a third sequence that overlaps either end.
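The jigsaw idea can be illustrated with a very small greedy assembler. This is a sketch only: the article says the real assembler uses statistical methods, whereas the code below swaps in exact-match overlaps and a greedy merge, and the names `overlap_len`, `greedy_assemble`, and the `min_len` parameter are hypothetical.

```python
# Greedy overlap assembly on error-free toy fragments.
# Assumption: exact suffix/prefix matches stand in for the statistical
# matching a production assembler would use.


def overlap_len(left, right, min_len=3):
    """Length of the longest suffix of `left` that is a prefix of `right`."""
    max_len = min(len(left), len(right))
    for k in range(max_len, min_len - 1, -1):
        if left.endswith(right[:k]):
            return k
    return 0


def greedy_assemble(fragments, min_len=3):
    """Repeatedly merge the pair of fragments with the largest overlap."""
    frags = list(fragments)
    while len(frags) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap_len(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing left that overlaps; return the pieces as they are
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags) if idx not in (best_i, best_j)]
        frags.append(merged)
    return frags


if __name__ == "__main__":
    print(greedy_assemble(["ATGGCGT", "GCGTACC", "TACCTGA"]))
```

Greedy merging is easy to follow but can be fooled by repeated stretches of DNA, which is one reason real assemblers are more elaborate.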
Stalking a Superbug
To track the deadly outbreak, NIH’s Evan Snitkin sequenced the bacteria from each patient, using genetics to figure out who had infected whom—and when. How did he do it? With a chart like this, which highlights the spots in each bacterial genome that differ among patients’ bacterial strains. (The filled dots are mutations.) Looking at the patterns of change, he could read the outbreak like a family tree.
When Snitkin started looking for clues, he had data from just four patients—labeled Patients 1 through 4 above. In Patient 1, for example, the four red dots represent four mutations not shared among all the patients.
Snitkin’s first question: Did all the cases stem from Patient 1 (who was known to be colonized with the bug upon arrival at the hospital), or were there outside sources? The genomes from Patients 2 through 4 varied from 1’s by only a few nucleotides, meaning Patient 1 must have been the source.
Patient 2 was the first to test positive after Patient 1. But the genetic information told a different story: His bug had an extra mutation that could only have appeared if he’d been infected after Patient 3.
Early in the outbreak, as more DNA came in, Snitkin mulled over an explanation for why mutations in Patients 2, 3, and 5 didn’t match those in Patient 1. Only later, when he sequenced other samples—from Patient 1’s lungs and throat—did the true reason become clear: Patient 1 harbored multiple strains of bacteria. Patient 3 had caught the strain from Patient 1’s throat.
As cases continued to crop up, Snitkin added more branches to his phylogenetic tree. Because Patients 6, 7, and 10 all showed identical patterns of mutation, genetics alone weren’t enough to determine the path of transmission. So he relied on other epidemiological data. The hospital told him that Patient 10 had been in the ICU long before Patients 6 or 7, and her stay had overlapped with Patient 4’s. So Snitkin knew that Patient 4 had passed his bacteria to Patient 10, where it lay dormant until infecting Patient 6 and then 7. Patient 7 went on to spread bacteria directly or indirectly to seven more patients.
But Snitkin first had to come up with some DNA. Even that simple step can prove tricky. First, you need to prepare batches of the bacteria in flasks. Then, you need to add enzymes to the colonies to tear open their membranes. This reduces the bacteria to a stew of proteins and other molecules, one of which is their DNA. At this point, you add special chemicals that latch on to all the non-DNA molecules—chemicals that can then be washed away, functioning as a kind of biologist’s soap.
It all sounds straightforward, but there’s actually no manual for how to extract DNA from every species a microbiologist may encounter. When Segre’s lab members tried this process with Acinetobacter, they failed to obtain DNA from it: The chemicals they used were too gentle to pull apart the bacteria. So they tried again using a harsher recipe. Now they ended up with lots of DNA, but shredded into tiny bits too small for the 454 machine to read. It took weeks to finally get it right. “We made some pretty crappy DNA,” Segre says with a rueful laugh.
This time, thankfully, the lab knew what it was doing. Starting with the protocol from the Acinetobacter work, Snitkin played with the enzyme levels to increase his yield of DNA. He prepared batches of KPC from the first four patients, and over the course of a few days he extracted the DNA from pellets of cellular material. On September 9, as the outbreak at the Clinical Center was worsening—one of the infected patients had just died, while five patients had tested positive in total—Snitkin put his molecules in plastic tubes, packed them in dry ice, and handed them over to a courier, who drove them 6 miles to NISC. The 454 machine needed a day or so to read the fragments: A KPC genome is made of 5.7 million base pairs, roughly 500 times smaller than a human genome. Then it took several more hours for NISC’s computers to piece the fragments together into longer sequences. After that, the “finishing group,” a special team at NISC, inspected the data to make sure it hadn’t been spoiled—that the fragments had been large enough and abundant enough to ensure an accurate sequence, for example.
As the finishing group gave each genome its stamp of approval, Snitkin could then download it from NISC’s server. Now, finally, he could read the outbreak like a book.
Each time a microbe divides, it makes a copy of its DNA. In most cases, the duplicate is a perfect match to the original. Every now and then, though, the microbe makes a typographical blunder of sorts, and a mutation is born. Every descendant of that mutant microbe will carry that difference as a genealogical marker. Over generations, more and more markers will accumulate in a lineage of bacteria, distinguishing it from distant relatives.
Snitkin wrote a program to analyze KPC genomes for these markers. The sequences from the four patients were nearly identical, but he did find some tiny differences. For example, KPC has a gene for a protein thought to help build fimbriae, tiny appendages that help bacteria latch on to things. The gene is 2,556 base pairs long, and the version carried by the KPC in Patient 2’s trachea was identical to Patient 3’s sample, with one exception: The 1,511th base had changed from adenine to guanine.
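A marker search of this kind can be sketched as a column-by-column comparison of aligned sequences. The code is an illustration, not Snitkin's program: the sample names and six-base sequences are invented, the sequences are assumed to be already aligned and of equal length, and coding each sample against one reference sample is an assumption about how rows of 0s and 1s might be produced.

```python
# Find positions where aligned sequences differ and encode each sample
# as a row of 0s (matches the reference) and 1s (differs from it).
# Assumption: inputs are pre-aligned, equal-length strings; the data is invented.


def variable_sites(genomes):
    """Return 0-based positions where at least two samples disagree."""
    sequences = list(genomes.values())
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("sequences must be aligned to the same length")
    return [pos for pos in range(length)
            if len({seq[pos] for seq in sequences}) > 1]


def marker_rows(genomes, reference):
    """Encode every sample as a 0/1 string relative to the reference sample."""
    ref = genomes[reference]
    return {name: "".join("0" if base == ref_base else "1"
                          for base, ref_base in zip(seq, ref))
            for name, seq in genomes.items()}


if __name__ == "__main__":
    toy = {"sample_a": "ACGTAC", "sample_b": "ACGTGC", "sample_c": "ACGTAC"}
    print(variable_sites(toy))
    print(marker_rows(toy, "sample_a"))
```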
Now Snitkin could hazard some answers to the questions that Palmore had posed about the outbreak. Perhaps the most urgent mystery was how it had started. Was it with Patient 1, the woman from New York, or had there been two different incursions—or perhaps even more? The standard hospital tests didn’t provide a clear answer. “We couldn’t tell whether we had two separate introductions or if it was all transmitted down the line,” Palmore says. “We only had the order in which we identified it.”
The odds that any given base of DNA mutates when a microbe divides are tiny. But as the generations pass, mutations will accumulate in bacteria at a roughly steady rate. Two microbes close together on the tree will be genetically identical or distinguished by just a handful of mutations. Distantly related microbes, on the other hand, will have hundreds or thousands of distinct mutations, because they have diverged from a much older ancestor. Thus, Snitkin expected that if the KPC outbreak were the result of several invasions, the genomes of bacteria from different patients would be markedly different.
But that’s not what he saw when he stared at the 0s and 1s on his screen. Base after base, for millions of bases at a stretch, the genomes were a nearly perfect match, with just a few mutations breaking up the monotony. Here was Snitkin’s first disturbing discovery: Patient 1 alone had seeded the Clinical Center with her bacteria. The evidence was conclusive but also puzzling. Patient 2 arrived at the ICU two weeks after Patient 1 had left. How could the bacteria have survived that long without another case emerging in between?
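The single-source reasoning comes down to counting differences between every pair of genomes. The following sketch shows that count on invented toy sequences; the function names and the `max_diff` cutoff are hypothetical, and choosing a real cutoff would depend on mutation rates that the article does not give.

```python
# Count pairwise differences between aligned sequences and apply a cutoff.
# Assumptions: equal-length aligned inputs, invented data, and an arbitrary
# max_diff threshold chosen only for illustration.
from itertools import combinations


def pairwise_differences(genomes):
    """Map each pair of sample names to the number of differing positions."""
    return {(a, b): sum(x != y for x, y in zip(genomes[a], genomes[b]))
            for a, b in combinations(genomes, 2)}


def looks_like_single_introduction(genomes, max_diff):
    """True when every pair of samples differs at no more than max_diff sites."""
    return all(d <= max_diff for d in pairwise_differences(genomes).values())


if __name__ == "__main__":
    toy = {"s1": "AAAAAAAAAA", "s2": "AAAAAAAAAT", "s3": "AAAAAAAACT"}
    print(pairwise_differences(toy))
    print(looks_like_single_introduction(toy, max_diff=2))
```

Closely related samples pass the check; adding one sample that differs at many sites makes it fail, which is the pattern several separate introductions would be expected to produce.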
Snitkin now began to chart the course of the outbreak from Patient 1 to the other three patients. Patient 2 was a 34-year-old man who had been in the hospital for cancer and died from his KPC infection. Patient 3 was a 27-year-old woman with an immune-system disorder. Patient 4 was a 29-year-old man with lymphoma. If the bacteria from Patient 1 infected Patient 2, then the latter genome shouldn’t differ from the former at more points than the other genomes do.
Again, though, the data gave the lie to this seemingly logical expectation: Patient 2’s bacteria carried an extra mutation that didn’t appear in Patient 3. The only way it could have developed was during transmission. The order in Snitkin’s tree was clear: Patient 1 -> Patient 3 -> Patient 2.
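The logic behind that ordering is that mutations nest: a strain carrying every mutation of another strain plus one more most plausibly sits downstream of it. A minimal sketch of that rule follows. The mutation labels `m1` and `m2` are placeholders rather than real genomic positions, the function name is hypothetical, and the result says nothing about whether an unsampled intermediate patient sat between two links.

```python
# Order samples by nested mutation sets.
# Assumptions: each sample is described by the set of mutations it carries
# relative to a common baseline; labels "m1" and "m2" are placeholders.


def nearest_upstream(mutations):
    """For each sample, pick the sample whose mutations are the largest proper subset."""
    parents = {}
    for name, muts in mutations.items():
        candidates = [other for other, other_muts in mutations.items()
                      if other != name and other_muts < muts]  # proper subset
        parents[name] = max(candidates,
                            key=lambda other: len(mutations[other]),
                            default=None)
    return parents


if __name__ == "__main__":
    toy = {
        "Patient 1": frozenset(),
        "Patient 3": frozenset({"m1"}),
        "Patient 2": frozenset({"m1", "m2"}),
    }
    print(nearest_upstream(toy))
```

With these placeholder sets the sketch reproduces the order in the text: Patient 3 hangs off Patient 1, and Patient 2 hangs off Patient 3.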
The mutations showed Snitkin how the bacteria had started to spread. Patients 1 and 3 had both been in the intensive care unit in late June. In July, after Patient 3 picked up the bacteria, Patient 2 arrived in the ICU and acquired it from her, or from some intermediate patient. This was the revelation that left Palmore speechless. Patient 3 had tested positive only in mid-August, which means her colonization must have gone undetected for weeks. Even though she was already gravely ill and her immune system was compromised, the bacteria had not become aggressive during that entire period.
The story of Patient 4 was even more mysterious. Palmore looked through her files to find a direct connection with Patient 1, but there was none. This meant that some unknown patient had served as a bridge. Palmore pored through records of 1,115 patients and found five people who’d had the opportunity to spread KPC from Patient 1 to Patient 4, but none tested positive. The tree revealed that KPC was far more cunning than anyone had appreciated.
The Clinical Center had already gone to great lengths in its attempt to contain the superbug. To keep Patient 1 from infecting other people in the Clinical Center, Palmore had put the woman at the far end of the ICU, flanked by empty rooms. Everyone going into Patient 1’s room had to put on a gown and gloves and wash their hands upon entering and leaving. Her nurses could not treat anyone else in the hospital. When the three new cases had emerged, Palmore had “cohorted” the colonized patients, moving them all into a single large room where she could control the traffic in and out. Back in the main ICU, KPC-negative patients had been given a separate team of nurses and therapists. Palmore assigned staffers to roam the ICU and the cohort area, urging patients and staff alike to follow the rules. “We had a cadre of monitors who really understood what they needed to do,” Palmore says. “They worked up the courage to walk up to the chief of surgery and say, ‘May I offer you a second shot of hand gel?'”
But after her meeting with Snitkin and Segre, Palmore saw that her precautions were maddeningly insufficient; the bacteria were still slipping through the walls of defense. Worse, Segre and Snitkin’s research suggested that the bacteria had gotten out of the ICU. Some of the potential bridge patients between Patient 1 and Patient 4 could have passed on the bacteria outside of the ICU. Palmore’s fears were confirmed the same day: A new case of KPC turned up on a regular ward, in a patient who had never visited the ICU. He hadn’t come to the bacteria; the bacteria had come to him. Already eight people had been colonized by the bug, and it showed no sign of ending its rampage.
With DNA in hand, Palmore now ramped up her counterassault. She ordered tests on every patient in the entire hospital and required doctors, nurses, and other staff to have their hands swabbed. They continued to test walls, shelves, and countertops. Though no employees came up positive for KPC, the facility itself did in five different places. The bacteria was found even in the ventilation tube of an infected patient—after it had been cleaned with both ammonium and bleach. When Palmore discovered KPC in six sink drains, she had a team of plumbers pull out the pipes and bleach them.
And for Snitkin and Segre, she had one simple request: Bring me more genomes.
Snitkin settled into a system. As soon as the hospital microbiologists saw the bacteria pop up in a test, he would come over and transfer a sample to a flask. He would break open their cells, extract their DNA, and send it to NISC, where their genomes would be sequenced, assembled, and then sent back. Snitkin would tell Palmore the results, and he would add a new branch to the KPC tree. By this point, there was too much data for Snitkin to parse simply by staring at 0s and 1s. So he wrote a new program that could automatically determine the most likely path of infection based on the mutations in the KPC genomes. Sometimes the mutations in a microbe’s genome would clearly show how it had spread from another patient. But other times identical bacteria would turn up in different people, leading Snitkin’s software to produce a tangle of branches. “This is the algorithm just throwing up its hands and saying, ‘These are all the same. I don’t know what to do,'” Snitkin says. A number of different transmissions might give rise to the patterns in the data. To break these impasses, Snitkin fed Palmore’s epidemiology into his computer so it could use the additional information to judge which contact was the most likely to have spread the bacteria.
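One simple way to picture that tie-breaking step is to rank genetically identical candidates by how long each shared a ward with the newly colonized patient. The sketch below is an illustration of the idea only, not Snitkin's program: the scoring rule (shared ward-days), the function names, the patient labels X, Y, and Z, and all dates are invented.

```python
# Break genetic ties using ward overlap.
# Assumptions: the score is the number of days two patients spent on the same
# ward; all labels, wards, and dates are invented for illustration.
from datetime import date


def overlap_days(stay_a, stay_b):
    """Days shared by two (start, end) stays, counting both end dates."""
    start = max(stay_a[0], stay_b[0])
    end = min(stay_a[1], stay_b[1])
    return max((end - start).days + 1, 0)


def rank_sources(recipient, candidates, stays):
    """Sort candidate sources by shared ward-days with the recipient."""
    scores = {}
    for cand in candidates:
        total = 0
        for ward_a, start_a, end_a in stays[cand]:
            for ward_b, start_b, end_b in stays[recipient]:
                if ward_a == ward_b:
                    total += overlap_days((start_a, end_a), (start_b, end_b))
        scores[cand] = total
    ranked = sorted(candidates, key=lambda c: scores[c], reverse=True)
    return ranked, scores


if __name__ == "__main__":
    stays = {
        "X": [("ICU", date(2011, 8, 1), date(2011, 8, 10))],
        "Y": [("ICU", date(2011, 8, 8), date(2011, 8, 20))],
        "Z": [("Ward 3", date(2011, 8, 1), date(2011, 8, 20))],
    }
    print(rank_sources("Y", ["X", "Z"], stays))
```

In this toy case X outranks Z because only X was ever on the same ward as Y. A fuller model would also have to weigh shared equipment and staff, which the text suggests mattered in the real outbreak.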
The picture of the outbreak was now looking like a series of circles and arrows, branching like a tree as the bacteria was detected in more patients. The pattern of the branches took on a startling shape. When Segre and Snitkin had first met with Palmore, they told her that Patient 1 had transmitted KPC to at least two separate patients in the ICU. By early October, they had drawn three branches from Patient 1. The most remarkable of these led to Patient 8, a 71-year-old man with lymphoma. It was baffling how Patient 1 had infected him, since he had never been to the ICU and Patient 1 had never stayed on his ward.
Palmore could also track the spread of the bacteria from patients to the hospital environment. When Snitkin and Segre sequenced the bug from an ICU ventilator, they discovered it had come from a KPC patient who had previously used the equipment—and it had survived a supposedly sterilizing clean. As the bacteria spread, it was also adapting to the antibiotics the doctors were using. When KPC colonized Patient 2, for example, it acquired a new, two-base-pair mutation that likely blocked colistin, a last-line-of-defense antibiotic, from passing through its cell membrane. “Once the bacteria became resistant to colistin, there was really nothing left to treat him with,” Snitkin says. Not long afterward, Patient 2 was dead.
Soon Palmore had more tools at her disposal. Adrian Zelazny, acting chief of the hospital’s Microbiology Service, suspected that the standard test for KPC—swabbing the throat and groin—was missing many infections. A rectal swab was much more accurate, because the gut was such a welcoming place for the bacteria, so Zelazny established new protocols to add those swabs to the standard regimen. And rather than waiting for the bug to grow overnight in a tube, Zelazny figured out by November how to test the swabs directly for genetic material unique to KPC. When Palmore came across a newly infected patient, Snitkin would put the bacteria through his pipeline and send her the results. All he and Segre could do now was watch as Palmore carried on the fight.
Palmore was still catching new patients, but the evidence suggested that she was catching them sooner. Between November 17 and November 27, two men and a woman tested positive for KPC. None of the three had yet developed an active infection. This was promising, because it meant that Palmore was now spotting the slow-growing bacteria while it was still in its early period of silent colonization—a time when it could more easily infect other patients. By identifying the cases early, Palmore was shrinking the window during which the bacteria could spread.
After November 27, when Palmore discovered Patient 17, a week passed without another positive test. Then another week. On December 14, another patient tested positive for KPC—Patient 18, a 37-year-old woman with sickle-cell disease—but the bacteria was colonizing her gut and nowhere else. Palmore quarantined her with the other KPC-positive cases, but Patient 18 never developed an active infection. December rolled into January without another case of KPC. And then January ended. The outbreak was effectively over.
Since then, there have been stragglers. Three cases of colonization emerged in 2012, and while two of them proved to have originated outside the hospital, the third came from inside. Worse still, this last case, first caught in July, turned into an active infection in August; on September 7 the patient died, raising the KPC death toll to seven. “All of us were crestfallen,” Palmore says. Snitkin and Segre analyzed the genome of the patient’s KPC to trace its origin. It turned out to have originated in one of the last patients in the outbreak (the center declines to specify which). Somehow, after seven months—likely during a follow-up visit at the hospital—the patient’s bacteria escaped Palmore’s isolation, traveled to one of the medical wards, and found a new victim. But no more colonizations or infections have surfaced in the intervening months, and Palmore is cautiously optimistic that the outbreak has now finally come to an end.
It’s unlikely that most US hospitals will be able to fight their superbug outbreaks the way that Tara Palmore, Julie Segre, and Evan Snitkin fought theirs—at least not anytime soon. The NIH Clinical Center had access to a scientific brain trust and a massive genome sequencing center to go with it. For now, smaller hospitals don’t have a labful of sequencing equipment, let alone the necessary expertise.
But every year, the technology becomes radically less expensive. To handle future outbreaks, Segre and Snitkin are now building a pipeline based on the new MiSeq Personal Sequencer made by Illumina. They expect this to bring the cost of sequencing a bacterial genome down from $2,000 to $500, while reducing the time from sample to analysis to just a couple of days. Newer sequencers, made by companies such as Oxford Nanopore Technologies and Ion Torrent, might reduce the expense even more.
The expertise, meanwhile, can be supplied by machines: Software should soon be able to put the basics of phylogenetic tree-building into anyone’s hands. Bioinformatics professionals like Snitkin will continue to write their own code, but there’s no reason the basic principles couldn’t be turned into an off-the-shelf package. John Gallin, director of the NIH Clinical Center, is eager to help: “We want to make this cheap and easy and hand it over to other hospitals so you don’t have to be an NIH scientist to do it,” he says. Genetic epidemiology would become even more powerful if hospitals shared their stories—and, more important, their data. Superbugs gain much of their strength by moving from hospital to hospital, finding new hosts, and evolving greater resistance to antibiotics. If doctors like Palmore can compare the genomes of their pathogens to ones from other hospitals—at the start of an outbreak, rather than years afterward—they will be able to respond intelligently and save more lives. Because when you’re faced with bacteria that can now withstand just about every drug known to man, the best you can do is not get sick.
But there’s one thing that is clear: We can no longer assume drugs will let us win against superbugs. We can’t expect some bigger or better weapon to arrive at the front and obliterate our unseen enemies. Our best hope now may be to fight them by knowing what they are and divining how they spread—as fast as we can read their DNA.
Copyright 2013 Wired. Reprinted with permission.