SAVE: Pandemic's Urgency Drove New Collaborative Approaches Worldwide

The Nature paper notes, “The process is collaborative and iterative, with seven teams using independent models and methodologies to prioritize mutations and lineages as well as rank importance for downstream testing. While the focus is on human infections, the Early Detection group also monitors variants circulating in animal populations, such as mink and deer, since they represent a potential reservoir source.”

Each week, the SAVE Early Detection and Analysis team reviews downloads of SARS-CoV-2 genomes from GISAID, the international sequence-sharing initiative. They search for variant and co-variant signatures in the genomes, then divide the work into two approaches:

·  one based on convergent evolution as the main signal for selection and functional impact of mutations (done by Cambridge and Walter Reed Army Institute of Research teams)

·  the other anchored on prevalence and growth patterns of mutations and defined lineages (the role of Los Alamos, Icahn School of Medicine at Mount Sinai, J. Craig Venter Institute/Bacterial and Viral Bioinformatics Resource Center, UC-Riverside and Broad Institute teams)

Highlights of Los Alamos Impact
At Los Alamos, the Korber team identifies emergent mutational patterns within the SARS-CoV-2 spike protein to track newly emerging and expanding variants and to determine transitions in global and regional sampling frequencies over time, a specialty in which Los Alamos has made a major impact.

They pay particular attention to mutations in parts of the spike protein known to be highly targeted by antibodies, or that might impact infectivity. They also systematically define the most commonly circulating form of each emerging variant of interest or concern against the backdrop of the continuously evolving virus.
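
To make the idea of tracking sampling frequencies over time concrete, here is a minimal Python sketch. It is an illustration only, not the Los Alamos pipeline: the function name, the variant labels and the dates are all made up, and real GISAID data would first need the cleaning and quality filtering described in this article. The sketch groups dated sequence records by ISO week and reports each variant's share of that week's samples.

```python
from collections import Counter, defaultdict
from datetime import date


def weekly_frequencies(records):
    """Return each label's share of samples per ISO week.

    records: iterable of (collection_date, variant_label) pairs.
    Result:  {(iso_year, iso_week): {variant_label: fraction}}
    """
    counts = defaultdict(Counter)
    for collected, label in records:
        iso = collected.isocalendar()  # (ISO year, ISO week, weekday)
        counts[(iso[0], iso[1])][label] += 1

    freqs = {}
    for week in sorted(counts):
        total = sum(counts[week].values())
        freqs[week] = {label: n / total for label, n in counts[week].items()}
    return freqs


# Hypothetical toy data: labels and dates are invented for illustration.
records = [
    (date(2021, 11, 29), "variant_x"),
    (date(2021, 11, 30), "variant_x"),
    (date(2021, 12, 1), "variant_x"),
    (date(2021, 12, 2), "variant_y"),
    (date(2021, 12, 6), "variant_x"),
    (date(2021, 12, 7), "variant_y"),
    (date(2021, 12, 8), "variant_y"),
    (date(2021, 12, 9), "variant_y"),
]

for week, shares in weekly_frequencies(records).items():
    print(week, shares)
```

In this toy data, variant_y rises from one quarter to three quarters of samples between consecutive weeks, the kind of transition that frequency tracking is meant to surface. Because real sampling is non-uniform across regions and time, raw shares like these would be a starting point rather than a conclusion.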

“Identifying the emerging variants, and obtaining accurate sequences for those variants, required continued wrangling of burgeoning data,” said Theiler. “There are now close to 10 million SARS-CoV-2 sequences in GISAID. These sequences, however, are non-uniformly sampled, are often partial and some contain errors, and of course it is the newest variants that give the sequencers the most trouble.”

“The tools we developed, along with our colleagues on the LANL COVID-19 Viral Genome Analysis Pipeline (cov.lanl.gov), provided the infrastructure that enabled us to follow this pandemic through its various waves,” he added.

Korber noted that “by working with the SAVE Early Detection team, we were able to be part of a synergistic collaborative effort, where our results in terms of early detection could be cross-checked with those of others.”

She added, “The real beauty of being part of the larger SAVE project was the knowledge that our analysis pipeline could provide foundational support for the many experimental teams in SAVE, and that we could help the scientific community get the best version of newly emergent variants into their laboratories as quickly and accurately as possible. In this way the science needed to understand the immunological and virological characteristics of new variants was rapidly obtained, in time to help inform public health decisions.”