Food safety

Argonne software helps decode German E. coli strain

Published 14 October 2011

In the early days of genome annotation in the mid-1990s, it took four or five scientists more than a year to analyze a single genome. Now, with the help of Rapid Annotation using Subsystems Technology (RAST), developed by Argonne scientists, researchers needed only eight hours to annotate the genome of the rogue E. coli strain that struck Europe this summer. The next-generation RAST will cut this time to just fifteen minutes.

Genome sequenced and mapped, E. coli becomes more vulnerable // Source: ecoliblog.com

When a nasty strain of E. coli flooded hospitals in Germany this summer, it struck its victims with life-threatening complications far more often than most strains do, and the search for an explanation began.

Over a busy weekend after the rogue bacterium’s genome was sequenced, scientists from all over the world submitted the E. coli genome to rounds of rigorous study. Thanks to a unique Argonne-developed computer program and cloud computing testbed, researchers mapped the strain’s genes—and came a little closer to understanding the bacterium’s secrets.

An Argonne National Lab release reports that a team of Argonne scientists developed the Rapid Annotation using Subsystems Technology (RAST) program in 2007. The program, which is free and open to any scientist, is designed to make sense of the jumble of letters that makes up an organism’s DNA.

A genome is a long, incomprehensible string of letters in a four-letter alphabet: G, A, T, C. The string is divided into sections called genes. Each gene describes how to build a protein, and proteins build all of the parts of the cell.

“If we can figure out what DNA codes for which protein, and what that protein does, then we can look at any bug and have an idea of what it can do,” explained Ross Overbeek, an Argonne computer scientist who helped design RAST.

“For example, bugs with multi-drug resistance often turn out to have little pumps that drain the drug out of the cell as fast as it comes in,” Overbeek said. “Once you know what those pumps look like, you can think about how to get around them.”

RAST matches sections of the new string with its enormous catalogue of previously sequenced genes and proteins. At the end it spits out an annotated genome with a sort of “Cliffs Notes” to the organism’s probable genes and proteins.
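The catalogue-matching idea can be illustrated with a toy sketch. This is not how RAST itself is implemented: a real annotation system relies on similarity search rather than the exact string matching used here, and the sequences, labels, and the `annotate` function below are all invented for illustration.

```python
# Toy catalogue mapping known gene sequences to protein functions.
# Every sequence and label here is made up for illustration.
CATALOGUE = {
    "ATGGCTAAA": "hypothetical drug efflux pump",
    "ATGCCCGGG": "hypothetical toxin subunit",
}


def annotate(genome, catalogue):
    """Return sorted (start, end, label) tuples for each exact catalogue match."""
    hits = []
    for gene, label in catalogue.items():
        start = genome.find(gene)
        while start != -1:
            hits.append((start, start + len(gene), label))
            # Keep searching past this hit for further occurrences.
            start = genome.find(gene, start + 1)
    return sorted(hits)


genome = "TTTATGGCTAAACCCATGCCCGGGTT"
for start, end, label in annotate(genome, CATALOGUE):
    print(f"{start}-{end}: {label}")
```

The printed list of positions and labels plays the role of the "Cliffs Notes": a summary of which stretches of the raw string probably correspond to which known proteins.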

When scientists announced on 3 June that they had sequenced the genome of the E. coli strain that plagued Europe, researchers from around the world began sending versions of the genome to RAST for annotation. They wanted to compare the new strain with past strains to tease out its origins and vulnerabilities.

“Genomes can vary even within a strain,” Overbeek said. “You can get slightly different genomes in the same outbreak, even from the same patient. You compare genomes to see how the organism is mutating even as it’s wreaking havoc.”
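A minimal sketch of what such a comparison looks like at the smallest scale, assuming two already-aligned sequences of equal length. Real genome comparison must also handle insertions, deletions, and rearrangements, which this sketch ignores; the isolate sequences and the `point_differences` function are hypothetical.

```python
def point_differences(reference, sample):
    """List (position, reference_base, sample_base) where two sequences differ."""
    if len(reference) != len(sample):
        raise ValueError("this sketch only compares sequences of equal length")
    return [
        (i, r, s)
        for i, (r, s) in enumerate(zip(reference, sample))
        if r != s
    ]


# Two invented isolates from the same hypothetical outbreak.
isolate_a = "ATGGCTAAACCC"
isolate_b = "ATGGCTAGACCT"

diffs = point_differences(isolate_a, isolate_b)
for position, ref_base, sample_base in diffs:
    print(f"position {position}: {ref_base} -> {sample_base}")
```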

The release notes that RAST servers were already overwhelmed by a flush of