Mimicking the human immune system to detect outbreaks faster

Synthetic T-cells monitor multiple variables for nuanced alerts
Sandia notes that T-cells are white blood cells that recognize and kill virus-infected cells and foreign pathogens. They learn to recognize foreign invaders through a negative-selection “training” process in which every T-cell that attacks normal body cells is destroyed. Other than this initial “training,” there’s no central “brain” controlling the T-cells.

Finley thought that mimicking how T-cells work might speed up outbreak detection. In 2015, he began collaborating with immune system modeling experts at UNM as part of Sandia’s Academic Alliance program. The Academic Alliance is a partnership Sandia has built with five universities to promote collaborative research on tough problems and attract top talent to work on these challenges.

“The adaptive immune system in vertebrates is one of the most complex systems in biology with trillions of cells, dozens of cell types and signaling molecules,” said Melanie Moses, professor of computer science and biology at UNM involved in the project. “Through computer modeling and simulation, we understand how the immune system works which, in the long term, can lead to improved immuno-therapies, allergy treatments and vaccines. It also provides inspiration for the design of other decentralized systems for surveillance and protection.”

Working together, the team created synthetic, mathematical “T-cells” that look at several variables at once, such as the number of clinic visits, the day of the year and intake temperature. Then, mimicking the T-cell negative-selection process, Levin ran the synthetic T-cell algorithms against past data collected by the CDC and the New Mexico Department of Health. He compared the algorithms and selected the most accurate ones.
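
To make the idea concrete, here is a minimal sketch of a textbook negative-selection detector in Python. It is not the team's algorithm. The variable ranges, the box-shaped detectors, the function names and the "normal" history are all assumptions invented for illustration. Random candidate detectors are generated, any detector that fires on normal data is destroyed, and the survivors are used to flag unusual days.

```python
import random

# Hypothetical ranges: (clinic visits, day of year, intake temperature in C).
BOUNDS = [(0.0, 200.0), (1.0, 365.0), (35.0, 42.0)]

def make_detector(rng, bounds, width):
    """Build a random box detector: one (low, high) interval per variable."""
    box = []
    for low, high in bounds:
        span = width * (high - low)
        start = rng.uniform(low, high - span)
        box.append((start, start + span))
    return box

def matches(detector, point):
    """A detector fires only when every variable falls inside its interval."""
    return all(low <= x <= high for (low, high), x in zip(detector, point))

def negative_selection(normal_points, bounds, n_candidates=2000, width=0.2, seed=0):
    """Keep only the candidate detectors that never fire on normal data."""
    rng = random.Random(seed)
    survivors = []
    for _ in range(n_candidates):
        detector = make_detector(rng, bounds, width)
        if not any(matches(detector, p) for p in normal_points):
            survivors.append(detector)
    return survivors

def is_anomalous(detectors, point):
    """A point is flagged when at least one surviving detector fires on it."""
    return any(matches(d, point) for d in detectors)

# Invented "normal" history: modest visit counts and ordinary temperatures.
data_rng = random.Random(1)
normal_days = [
    (data_rng.uniform(20, 60), data_rng.uniform(1, 365), data_rng.uniform(36.0, 38.0))
    for _ in range(300)
]

detectors = negative_selection(normal_days, BOUNDS)
unusual_day = (180.0, 200.0, 40.5)  # many visits and high temperatures
print(is_anomalous(detectors, unusual_day))
```

Because every detector that fires on the training history is discarded, none of the historical normal days can raise an alert. The alert comes from several variables together, not from a threshold on any single one.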

In 2016, initial tests on a pilot-scale biosurveillance system showed that Levin’s synthetic T-cells performed better than the traditional statistical methods, said Finley. Also, because the synthetic T-cells track multiple variables intrinsically, they could provide more nuanced alerts, such as separating an outbreak of a new disease from seasonal influenza, he said.

Brain-inspired machine learning improves chief complaint deciphering
The first piece of data the CDC receives from each emergency room visit is called the chief complaint. This is a concise statement describing why a patient has gone to the emergency room or clinic, recorded before the patient has seen a doctor or been diagnosed. Chief complaints range from “chest pain” and “fever three days” to specialized abbreviations.

These terse statements are full of medical jargon and even misspelled words, which makes them difficult to decipher with simple keyword searches and hard for inexperienced readers to interpret. Also, many different words can describe the same symptom, such as fever, hot, temperature and chills.

Technology companies have been using deep learning for similar natural-language processing problems. Deep learning is brain-inspired machine learning that excels at finding patterns without being explicitly programmed on what to look for. One related neural network algorithm, called Word2vec, represents each word as a mathematical vector based on the contexts in which the word appears.
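
The intuition behind context-based word vectors can be sketched without a neural network. The Python example below is a simplified stand-in, not Word2vec itself: it counts which words appear near each other and compares the resulting count vectors. The complaint strings are invented toy data, and the function names are assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

# Invented toy chief complaints; real data would be far messier.
complaints = [
    "pt has fever and cough",
    "pt has temp and cough",
    "pt has fever and chills",
    "pt has temp and chills",
    "pt reports chest pain",
    "pt reports leg pain",
]

def context_vectors(texts, window=2):
    """Map each word to counts of the words seen within `window` positions."""
    vectors = defaultdict(Counter)
    for text in texts:
        tokens = text.split()
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

vectors = context_vectors(complaints)
fever_vs_temp = cosine(vectors["fever"], vectors["temp"])
fever_vs_chest = cosine(vectors["fever"], vectors["chest"])
print(fever_vs_temp > fever_vs_chest)
```

In this toy data, "fever" and "temp" never appear together, yet they end up with nearly identical vectors because they occur in the same contexts. A keyword search for "fever" alone would miss that relationship.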

Sandia says that when Levin ran the Word2vec algorithm on anonymized chief complaint data collected by the New Mexico Department of Health, it outperformed a standard keyword search, as well as other state-of-the-art machine learning algorithms. However, it still had trouble with misspelled words and abbreviations.

To work around this, Levin tried two related neural network algorithms: one that converts letters into vectors and another that converts words into random vectors. The algorithm that converted words into random, or untrained, vectors was the most accurate, possibly because the trained Word2vec algorithm places antonyms too close together in vector space, said Levin.
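
The difference between the two input encodings can be illustrated with a short Python sketch. It shows only how words might be turned into vectors, not the neural networks trained on top of them, and it is not Levin's code. The character-bigram encoding, the vector size and the function names are assumptions chosen for illustration.

```python
import math
import random
from collections import Counter

def char_bigram_vector(word):
    """Letter-based encoding: counts of adjacent character pairs in the word."""
    padded = f"^{word}$"  # mark the start and end of the word
    return Counter(padded[i:i + 2] for i in range(len(padded) - 1))

def random_word_vector(word, dim=16):
    """Untrained encoding: a fixed random vector per word, seeded by the word."""
    rng = random.Random(word)  # a string seed gives the same vector every run
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

# A misspelling shares most of its letter pairs with the correct word.
misspelled = cosine(char_bigram_vector("fever"), char_bigram_vector("feever"))
unrelated = cosine(char_bigram_vector("fever"), char_bigram_vector("cough"))
print(misspelled > unrelated)

# Random word vectors carry no built-in notion of similarity between words.
same_word = random_word_vector("fever") == random_word_vector("fever")
different_word = random_word_vector("fever") == random_word_vector("feever")
print(same_word, different_word)
```

Letter-based vectors keep a misspelled word close to its intended spelling. Random word vectors start with no assumptions about which words are related, so any relationships must be learned by the network from the task itself.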

Though more optimization is needed, the team’s deep-learning algorithm for deciphering chief complaints could be particularly useful for the opioid epidemic, said Finley. He added, “New terms for street drugs tend to appear much more quickly than the public health community realizes. If we find that a weird word is popping up a lot in an area, it could be a new variety of fentanyl.”
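
Spotting a "weird word" of this kind could be sketched, in its simplest form, as a comparison between recent complaints from one area and the vocabulary seen there in the past. The Python example below is only an illustration under that assumption. The complaint strings are invented, "xterm" is a made-up placeholder for an unfamiliar word, and the function name and threshold are hypothetical.

```python
from collections import Counter

def emerging_terms(baseline_texts, recent_texts, min_count=3):
    """Return words that are absent from the baseline but frequent recently."""
    known = {word for text in baseline_texts for word in text.lower().split()}
    recent = Counter(word for text in recent_texts for word in text.lower().split())
    return sorted(
        word for word, count in recent.items()
        if word not in known and count >= min_count
    )

# Invented complaints from a single area; "xterm" stands in for a new street name.
baseline = ["overdose heroin", "chest pain", "od fentanyl"]
recent = ["od xterm", "xterm overdose", "pt took xterm", "fever"]

new_terms = emerging_terms(baseline, recent)
print(new_terms)
```

A count threshold like `min_count` keeps one-off typos from raising alerts. A practical system would also need to handle misspellings and abbreviations of the same new word.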

Future of distributed biosurveillance centers
Lymph nodes are distributed throughout the body and act as immune system hubs, chock full of T-cells and the B-cells that produce antibodies to fight off infections.

Finley and his team are just beginning to explore how mimicking lymph nodes might improve the biosurveillance system. Finley believes it would be particularly helpful in detecting outbreaks of regional diseases like Lyme disease, plague and hantavirus. Also, distributed detection algorithms could be more efficient by bypassing the physical and power-consumption limits that Moore’s Law computers are now running up against, added Levin.
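
One way to picture why local hubs might help with regional diseases is a sketch in which each region keeps its own baseline and raises its own alerts. The Python example below is an assumption-laden illustration, not the team's design. It uses a simple z-score as a stand-in for whatever detector each node would actually run, and the region names, counts and threshold are invented.

```python
from statistics import mean, stdev

class RegionalNode:
    """A local hub that compares today's count with its own history."""

    def __init__(self, name, history):
        self.name = name
        self.baseline = mean(history)
        self.spread = stdev(history)

    def check(self, count, threshold=3.0):
        """Flag counts more than `threshold` standard deviations above baseline."""
        if self.spread == 0:
            return count > self.baseline
        return (count - self.baseline) / self.spread > threshold

def regional_alerts(nodes, todays_counts):
    """Collect the names of the nodes that raised an alert today."""
    return [node.name for node in nodes if node.check(todays_counts[node.name])]

# Invented daily case counts for two hypothetical regions.
nodes = [
    RegionalNode("region_a", [0, 1, 0, 2, 1, 0, 1]),          # sparse, rural
    RegionalNode("region_b", [100, 110, 95, 105, 98, 102, 107]),  # dense, urban
]
today = {"region_a": 8, "region_b": 108}

alerts = regional_alerts(nodes, today)
print(alerts)
```

In this toy data, eight cases in the sparse region is a large local jump, yet it adds little to the combined total of the two regions. A node that judges each region against its own history can flag it without any central comparison.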

“We are working closely with the CDC to test a number of our deep learning approaches on a subset of the national data flow,” said Finley.

His goal is to have his biologically inspired system set up by October to allow side-by-side comparisons with the traditional statistical methods at the national scale. He believes the different approaches will have different strengths, and combining them will improve the speed and accuracy of outbreak detection.

This research was funded by Sandia’s Laboratory Directed Research and Development program. Sandia computer scientists Walt Beyeler and Michael Mitchell and UNM postdoctoral fellow Tatiana Flanagan also worked on the project, focusing on the lymph system-mimicking distributed detection algorithm.

“This project with Sandia has provided us with an opportunity to test the practical application of the concepts we’ve learned from our models,” said Moses. “Ultimately, this project will lead to a more complete understanding of the immune system, as well as a practical way to quickly identify and respond to disease outbreaks and other biological threats.”