Infection projections: how the spread of Ebola is calculated

By Jonathan Keith

Published 29 October 2014

Ebola is an example of an emerging infectious disease (EID): one that has newly appeared in a population or has undergone a rapid increase in incidence. Bioinformatics plays a key role in detecting, monitoring and responding to EIDs. In the case of Ebola, the bioinformatics community has responded rapidly. For example, the current outbreak of Ebola in Sierra Leone was first detected in May, but by September a study reported sequencing 99 Ebola virus genomes from 78 patients diagnosed with the disease between late May and mid-June. Bioinformaticians have been developing and refining algorithms for sequence assembly since the late 1980s, and are constantly adapting them so they can handle new sequencing technologies and ever-larger scales of assembly. Bioinformatics is, and will continue to be, a core component of the international response to Ebola and other EIDs. Patients, medical staff and those close to them need all the help they can get.

The number of reported Ebola cases is doubling roughly every five weeks in Sierra Leone, and in as little as two to three weeks in Liberia.
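A doubling time translates directly into a projection if growth is assumed to stay exponential. The sketch below is a minimal illustration of that arithmetic, not the method used to produce the official projections. The function names and the starting figure of 1,000 cases are hypothetical, chosen only to show how much the doubling time matters.

```python
import math


def weekly_growth_rate(doubling_time_weeks: float) -> float:
    """Continuous growth rate per week implied by a doubling time."""
    return math.log(2) / doubling_time_weeks


def project_cases(initial_cases: float, doubling_time_weeks: float,
                  weeks_ahead: float) -> float:
    """Project a case count forward, assuming constant exponential growth."""
    return initial_cases * 2 ** (weeks_ahead / doubling_time_weeks)


if __name__ == "__main__":
    start = 1000  # hypothetical starting count, not a reported figure
    # Ten weeks ahead with a five-week doubling time: two doublings.
    print(project_cases(start, 5, 10))    # 4000.0
    # Ten weeks ahead with a 2.5-week doubling time: four doublings.
    print(project_cases(start, 2.5, 10))  # 16000.0
```

Over the same ten weeks, halving the doubling time turns a fourfold increase into a sixteenfold one, which is why small differences between regions matter so much.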

The number of reported cases globally is projected to reach 10,000 by the end of October. The actual number of cases may be twice the official figure. So how are such figures estimated — and what can bioinformatics do to help control the disease?

The 2014 Ebola outbreak in West Africa appeared suddenly and spread rapidly, and is thought to have started with a single animal-to-human transfer in December 2013. It’s an example of an emerging infectious disease (EID): one that has newly appeared in a population or has undergone a rapid increase in incidence. SARS and various strains of avian influenza are examples of EIDs.

EIDs are often zoonoses: animal diseases that have infected humans as hosts and become transmissible between people. Such “host-switching” events can happen anywhere at any time, and preparedness to respond rapidly and effectively when this occurs is an important aspect of public health policy.

One parameter that epidemiologists use to quantify the rate of a disease’s spread is the basic reproduction number: R0 (R-nought).

This is the average number of new cases generated by each infected individual in idealized conditions, that is, in a population where everyone else is susceptible. Diseases with R0 less than 1 are not likely to become epidemics, but those with R0 more than 1 have the potential to spread exponentially.

Current estimates for Ebola indicate an R0 of around 2 — higher than the R0 of some strains of influenza — although it varies between regions.
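The threshold at R0 = 1 can be seen with a very simple generation-by-generation count. The sketch below assumes every case infects exactly R0 others on average and that susceptible people never run out, which is a deliberate simplification. The function name is hypothetical.

```python
def expected_new_cases(r0: float, generations: int,
                       index_cases: float = 1.0) -> list[float]:
    """Expected new cases in each generation of transmission.

    Generation 0 holds the index cases; each later generation is the
    previous one multiplied by r0 (no depletion of susceptibles).
    """
    return [index_cases * r0 ** g for g in range(generations + 1)]


if __name__ == "__main__":
    # R0 of 2: each generation is twice the size of the last.
    print(expected_new_cases(2.0, 3))  # [1.0, 2.0, 4.0, 8.0]
    # R0 of 0.5: each generation is half the size of the last.
    print(expected_new_cases(0.5, 3))  # [1.0, 0.5, 0.25, 0.125]
```

With R0 above 1 the generations keep growing. Below 1 they shrink, and the chain of transmission is expected to die out.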

Other parameters that determine the spread dynamics of a disease include the length of time the disease takes to incubate, and the period of time during which diseased individuals are infectious.
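One common way to combine these parameters is a compartmental model in which people move from susceptible to exposed (incubating) to infectious to removed, often called an SEIR model. The sketch below is a minimal version using simple Euler time steps. It is not the model behind any published Ebola projection, and the population size, incubation period and infectious period in the usage example are illustrative assumptions rather than fitted estimates.

```python
def simulate_seir(population: float, initial_infectious: float, r0: float,
                  incubation_days: float, infectious_days: float,
                  days: float, dt: float = 0.1):
    """Simulate a basic SEIR model; returns (t, S, E, I, R) tuples."""
    sigma = 1.0 / incubation_days   # rate of leaving the exposed state
    gamma = 1.0 / infectious_days   # rate of leaving the infectious state
    beta = r0 * gamma               # transmission rate implied by R0

    s = population - initial_infectious
    e, i, r = 0.0, float(initial_infectious), 0.0
    history = [(0.0, s, e, i, r)]

    for step in range(1, int(round(days / dt)) + 1):
        new_exposed = beta * s * i / population * dt
        new_infectious = sigma * e * dt
        new_removed = gamma * i * dt
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - new_removed
        r += new_removed
        history.append((step * dt, s, e, i, r))
    return history


if __name__ == "__main__":
    # All values below are assumptions for illustration only.
    run = simulate_seir(population=1_000_000, initial_infectious=10,
                        r0=2.0, incubation_days=10, infectious_days=7,
                        days=100)
    t, s, e, i, r = run[-1]
    print(f"day {t:.0f}: infectious={i:.0f}, removed={r:.0f}")
```

A longer incubation period slows the outbreak without changing R0, because each generation of cases takes longer to appear.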

A key parameter is the proportion of cases that are identified. Many cases, including some that result in death, are not reported, either because victims do not seek medical care, or because overwhelmed medical personnel might fail to accurately record all interventions.

This is important not only because under-reporting reduces the effectiveness of management strategies, but also because it can influence estimates of the other parameters mentioned above, particularly if there is variation in reporting levels across regions.

The US Centers for Disease Control and Prevention (CDC) has attempted to estimate the degree of under-reporting for Ebola, but these estimates are currently not very accurate.
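The simplest way to allow for under-reporting is to divide the reported count by the assumed fraction of cases that are reported. The sketch below shows that scaling and how uneven reporting can distort comparisons between regions. It is not the CDC's method. The function name, the regional counts and the reporting fractions are hypothetical.

```python
def estimate_true_cases(reported_cases: float,
                        reporting_fraction: float) -> float:
    """Scale a reported count by the assumed fraction of cases reported."""
    if not 0 < reporting_fraction <= 1:
        raise ValueError("reporting_fraction must be in (0, 1]")
    return reported_cases / reporting_fraction


if __name__ == "__main__":
    # If only half of all cases are reported, 10,000 reported cases
    # correspond to about 20,000 actual cases.
    print(estimate_true_cases(10_000, 0.5))  # 20000.0

    # Hypothetical regions: identical reported counts, different reporting.
    for name, fraction in [("Region A", 0.8), ("Region B", 0.4)]:
        print(name, round(estimate_true_cases(2_000, fraction)))
```

Two regions with the same reported count can have very different true burdens, so a model fitted to raw reports may misjudge where the disease is spreading fastest.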