Scientists Model “True Prevalence” of COVID-19 Throughout Pandemic

Other statistical methods often try to correct the bias in one data source to model the true prevalence of disease in a region. For their approach, Raftery and lead author Nicholas Irons, a UW doctoral student in statistics, incorporated three factors: the number of confirmed COVID-19 cases, the number of deaths due to COVID-19 and the number of COVID-19 tests administered each day as reported by the COVID Tracking Project. In addition, they incorporated results from random COVID-19 testing of Indiana and Ohio residents as an “anchor” for their method.

The researchers used their framework to model COVID-19 prevalence in the U.S. and each of the states up through March 7, 2021. On that date, according to their framework, an estimated 19.7 percent of U.S. residents, or about 65 million people, had been infected. This indicates that the U.S. is unlikely to reach herd immunity without its ongoing vaccination campaign, Raftery and Irons said. In addition, the U.S. had an undercount factor of 2.3, the researchers found, which means that only about 1 in 2.3 COVID-19 infections were being confirmed through testing. Put another way, roughly 57 percent of infections, or nearly 6 in 10, were never counted.
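The arithmetic behind the undercount factor can be sketched in a few lines of Python. This is only an illustration built from the figures quoted above, not the researchers' statistical model; the function names and the "implied" values derived here are assumptions made for the example.

```python
def detected_fraction(undercount_factor: float) -> float:
    """Share of infections confirmed by testing, i.e. 1 / undercount factor."""
    if undercount_factor < 1:
        raise ValueError("undercount factor must be at least 1")
    return 1.0 / undercount_factor


def missed_fraction(undercount_factor: float) -> float:
    """Share of infections never confirmed by testing."""
    return 1.0 - detected_fraction(undercount_factor)


# Figures quoted in the article for the U.S. as of March 7, 2021.
undercount = 2.3
estimated_infections = 65_000_000
cumulative_incidence = 0.197  # 19.7 percent of residents

# Values implied by those figures (derived here for illustration only).
implied_confirmed = estimated_infections / undercount
implied_population = estimated_infections / cumulative_incidence

print(f"Detected share: {detected_fraction(undercount):.1%}")
print(f"Missed share:   {missed_fraction(undercount):.1%}")
print(f"Implied confirmed cases: {implied_confirmed / 1e6:.1f} million")
print(f"Implied population:      {implied_population / 1e6:.0f} million")
```

An undercount factor of 1 would mean every infection was confirmed; the larger the factor, the larger the share of infections that testing missed.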

This COVID-19 undercount rate also varied widely by state, and could have multiple causes, according to Irons.

“It can depend on the severity of the pandemic and the amount of testing in that state,” said Irons. “If you have a state with severe pandemic but limited testing, the undercount can be very high, and you’re missing the vast majority of infections that are occurring. Or, you could have a situation where testing is widespread and the pandemic is not as severe. There, the undercount rate would be lower.”

In addition, the undercount factor fluctuated by state or region as the pandemic progressed, owing to differences in access to medical care among regions, changes in the availability of tests and other factors, Raftery said.

Using their estimates of the true prevalence of COVID-19, Raftery and Irons calculated other useful figures for each state, such as the infection fatality rate, which is the percentage of infected people who have died of COVID-19, and the cumulative incidence, which is the percentage of a state’s population that has had COVID-19.
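Both figures are simple ratios once the number of infections has been estimated. The sketch below follows the definitions given above; the state and its numbers are entirely hypothetical and are not taken from the study.

```python
def infection_fatality_rate(deaths: int, infections: int) -> float:
    """Percentage of infected people who died of the disease."""
    if infections <= 0:
        raise ValueError("infections must be positive")
    return 100.0 * deaths / infections


def cumulative_incidence(infections: int, population: int) -> float:
    """Percentage of the population that has been infected so far."""
    if population <= 0:
        raise ValueError("population must be positive")
    return 100.0 * infections / population


# Hypothetical state, with made-up numbers for illustration only.
population = 1_000_000
estimated_infections = 150_000  # estimated true infections, not confirmed cases
covid_deaths = 900

ifr = infection_fatality_rate(covid_deaths, estimated_infections)
incidence = cumulative_incidence(estimated_infections, population)

print(f"Infection fatality rate: {ifr:.2f}%")
print(f"Cumulative incidence:    {incidence:.1f}%")
```

Note that the infection fatality rate divides deaths by estimated infections rather than by confirmed cases, which is why it depends on a reliable estimate of true prevalence.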

Ideally, regular random testing of individuals would show the level of infection in a state, in a region or even nationally, said Raftery. But during the COVID-19 pandemic, only Indiana and Ohio conducted random viral testing of residents, and those datasets were critical in helping the researchers develop their framework. In the absence of widespread random testing, this new method could help officials assess the true burden of disease in this pandemic and the next one.

“We think this tool can make a difference by giving the people in charge a more accurate picture of how many people are infected, and what fraction of them are being missed by current testing and treatment efforts,” said Raftery.