Web Search Formulas Offer a First Step for Protecting Critical Infrastructure

A lot can happen quickly. For example, if an errant squirrel takes out a power substation, the pumps at a water treatment plant might stop. That could threaten the water supply to a nearby hospital or to a nuclear plant that needs water for cooling. Of course, officials have robust backup systems, and part of their planning is knowing how to halt such a cascade of failures as quickly as possible.

Kay’s team began by adapting the existing PageRank algorithm: Instead of looking at interactions among web pages, the scientists analyzed interactions among structures. Which facilities would be most likely to be targeted or to fail? And which facilities would have a serious impact on other facilities if they did fail? Structures that met both criteria were deemed critical by the team.

“As failure propagates through a network, there are two things I’d want to know: which things are likely to fail, and which things, if they fail, are likely to propagate the failure forward,” said Kay, who specializes in graph networks.
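Those two criteria map naturally onto running PageRank in both directions over a dependency graph: once as-is, to score structures that many failure paths lead into, and once on the reversed graph, to score structures that many failure paths fan out from. The sketch below is a minimal illustration of that idea, assuming Python with the networkx library; the toy graph, the facility names, and the simple product used to combine the two scores are invented for illustration and are not the team’s actual formulation.

```python
import networkx as nx

# Toy dependency graph: an edge u -> v means "a failure at u can
# propagate to v." The facilities are invented for illustration.
G = nx.DiGraph([
    ("substation", "water_plant"),
    ("substation", "hospital"),
    ("water_plant", "hospital"),
    ("water_plant", "nuclear_plant"),
    ("fuel_depot", "substation"),
])

# PageRank on G rewards nodes that many failure paths lead *into*:
# a rough proxy for "likely to fail."
likely_to_fail = nx.pagerank(G)

# PageRank on the reversed graph rewards nodes that many failure paths
# fan *out* from: a proxy for "likely to propagate failure forward."
propagates_failure = nx.pagerank(G.reverse())

# Combine the two criteria; a simple product, purely illustrative.
criticality = {n: likely_to_fail[n] * propagates_failure[n] for n in G}
for node, score in sorted(criticality.items(), key=lambda kv: -kv[1]):
    print(f"{node:14s} {score:.4f}")
```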

Layers of Knowledge
The team didn’t simply apply the PageRank formula; it modified the algorithm to weigh many streams of information simultaneously, akin to performing dozens of related web searches at the same time and having all the searches communicate among themselves. The team refers to this as a multilayer approach.

“To think of a multilayer algorithm, think of a multilayer sandwich—a club sandwich,” said coauthor Patrick Mackey. “One layer might be the electric system. Another is transportation. Others might be oil pipelines or hospitals. Many people look at these aspects of infrastructure one at a time, in isolation; we’re looking at them all together and how they affect each other, which helps identify which are most critical.”
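One common way to realize such a multilayer ranking is to stitch the layers into a single “supra-graph” whose nodes are (layer, facility) pairs, with cross-layer edges encoding interdependence, and then run PageRank over the combined graph. The sketch below, again assuming networkx and invented facilities, illustrates that construction; the study’s actual multilayer algorithm may weigh and couple the layers differently.

```python
import networkx as nx

# Per-layer dependency graphs; facility names are hypothetical.
power = nx.DiGraph([("substation_A", "substation_B")])
water = nx.DiGraph([("pump_1", "treatment_plant")])

# Merge the layers into one "supra-graph," tagging each node with its
# layer so facilities in different layers stay distinct.
supra = nx.DiGraph()
for layer_name, layer in [("power", power), ("water", water)]:
    for u, v in layer.edges():
        supra.add_edge((layer_name, u), (layer_name, v))

# Cross-layer edge: the water pump runs on the substation's power, so
# a failure in the power layer can propagate into the water layer.
supra.add_edge(("power", "substation_B"), ("water", "pump_1"))

# A single PageRank over the supra-graph lets the layers "communicate,"
# ranking every facility by its influence on the system as a whole.
scores = nx.pagerank(supra)
for node, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(node, round(score, 4))
```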

In several simulations, the team showed that its multilayer approach consistently halted a spreading failure faster, and with fewer structures damaged, than alternative approaches, including a straight PageRank algorithm and a simpler method known as “outdegree.” The team did not quantify exactly how much better its approach would limit an attack; rather, it treated the study as evidence that the approach is worth exploring.
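For contrast, the “outdegree” baseline mentioned above simply counts how many structures depend directly on each facility, ignoring the indirect, multi-hop effects that the PageRank-style scores capture. A minimal version, using the same assumed networkx setup:

```python
import networkx as nx

# Illustrative "outdegree" baseline on an invented dependency graph:
# rank each facility by how many others depend on it directly.
G = nx.DiGraph([
    ("substation", "water_plant"),
    ("substation", "hospital"),
    ("water_plant", "nuclear_plant"),
])

for node, degree in sorted(G.out_degree(), key=lambda kv: -kv[1]):
    print(node, degree)  # substation 2, water_plant 1, ...
```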

“A good algorithm for this type of work doesn’t always need to incorporate detailed dynamics of the various entities of interest. Oftentimes it’s sufficient, as a start, to adequately understand the relationships between those entities,” said Kay. “It provides a starting point and can become very useful once a human expert adds in knowledge about the domain in question.”

The work is part of a project portfolio led by PNNL researcher Sam Chatterjee, principal investigator and chief data scientist. It was funded by the Cybersecurity and Infrastructure Security Agency to enable consistent, repeatable and defensible analysis across a broad spectrum of potential failures.

“This work represents an excellent example of how network science methods can be adapted to address critical infrastructure risk and resilience challenges,” said Chatterjee.

In addition to Chatterjee, Kay and Mackey, former intern Jacob Miller also contributed to the project.

Tom Rickey is Senior Science Writer, News and Media Team, Pacific Northwest National Laboratory (PNNL). This article was originally posted on the PNNL website.