Could better tests have predicted the rare circumstances of the Germanwings crash? Probably not

By Norman A. Paradis

Published 2 June 2015


When people do terrible things, it seems reasonable to believe we should have taken steps to identify them beforehand. If we can do that, then surely we can prevent them from doing harm.

The crash of Germanwings Flight 9525 in March, which appears to have been an intentional act, is an example. It shocks us (and understandably so) when a trusted professional harms those who have entrusted their lives to him or her.

So why not identify pilots at risk and take steps to prevent similar events from ever occurring again?

Because it is likely impossible, and maybe even counterproductive.

And that’s not just my opinion. The limits of what can be achieved in predicting an event represent a dilemma we face all the time in biomedical testing.

Let me take you through such an analysis, and show you how futile such programs would likely be in preventing events like the air crash in Europe.

Medical tests can be sensitive or specific, but rarely both
Any interview or written survey instrument intended to identify individuals at risk of perpetrating rare and horrific acts is essentially a medical test. The performance of such a test is described by its sensitivity and specificity. Simply put, sensitivity is the test's ability to detect the disease in people who have it, and specificity is its ability to correctly clear people who do not.
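To make the two definitions concrete, here is a minimal sketch that computes both measures from the four possible outcomes of a test. The counts are invented for illustration and do not come from any real screening program.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of people WITH the condition whom the test correctly flags."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Share of people WITHOUT the condition whom the test correctly clears."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts: 100 people have the condition, 900 do not.
sens = sensitivity(true_pos=90, false_neg=10)    # 90 of 100 sick people flagged
spec = specificity(true_neg=855, false_pos=45)   # 855 of 900 healthy people cleared

print(f"sensitivity: {sens:.0%}, specificity: {spec:.0%}")
```

Note that each measure looks at only one group: sensitivity ignores the healthy, and specificity ignores the sick.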

For most tests, you trade one off against the other. For instance, highly sensitive tests generally produce many false positives: they call patients sick when they do not have the disease. And highly specific tests often produce many false negatives: they miss many patients who do have the disease.

Generally, you can have a sensitive test or a specific test, but you can’t have a sensitive and specific test. Using a simple metaphor, this can be called the “no free lunch law” of medical testing.
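The trade-off can be sketched with a hypothetical screening score, where anyone scoring at or above a cutoff is called positive. The scores below are invented for illustration; the only assumption is that the two groups overlap, as they do for most real tests.

```python
# Hypothetical screening scores (higher = more concerning), invented for illustration.
scores_with_condition = [4, 6, 7, 8, 9]
scores_without_condition = [1, 2, 2, 3, 3, 4, 5, 5, 6, 7]


def rates_at(cutoff: int) -> tuple[float, float]:
    """Return (sensitivity, specificity) when a score >= cutoff counts as positive."""
    flagged_sick = sum(s >= cutoff for s in scores_with_condition)
    cleared_healthy = sum(s < cutoff for s in scores_without_condition)
    return (
        flagged_sick / len(scores_with_condition),
        cleared_healthy / len(scores_without_condition),
    )


# Raising the cutoff clears more healthy people but misses more sick ones.
for cutoff in (3, 5, 7, 9):
    sens, spec = rates_at(cutoff)
    print(f"cutoff {cutoff}: sensitivity {sens:.0%}, specificity {spec:.0%}")
```

In this made-up data, no cutoff catches everyone with the condition while also clearing everyone without it. Moving the cutoff only shifts the errors from one kind to the other.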

This limitation becomes overwhelming when biomedical tests are used in populations with a very low prevalence of the disease being tested for.

An absurd example helps to illustrate this. Modern pregnancy tests are very accurate, over 99 percent. However, let’s say you apply a pregnancy test in a population of 10,000 men. You will get a handful of positive tests, 100 percent of which will be false positives.

For this reason, standard blood tests cannot generally be used to screen for very rare diseases without being paired with a second specific confirmatory test.
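The arithmetic behind the pregnancy-test example can be sketched as follows. The figure "over 99 percent" does not pin down an exact value, so the 99.9 percent used here is an assumption, as are the numbers in the rare-disease case. The function names are illustrative as well.

```python
def expected_false_positives(n_without_condition: int, specificity: float) -> float:
    """Expected number of unaffected people the test wrongly calls positive."""
    return n_without_condition * (1 - specificity)


def positive_predictive_value(prevalence: float, sensitivity: float,
                              specificity: float) -> float:
    """Probability that a person who tests positive actually has the condition."""
    true_pos = prevalence * sensitivity
    false_pos = (1 - prevalence) * (1 - specificity)
    if true_pos + false_pos == 0:
        return float("nan")  # the test never returns a positive result
    return true_pos / (true_pos + false_pos)


# Pregnancy test applied to 10,000 men (assumed 99.9% sensitivity and specificity).
men_false_pos = expected_false_positives(10_000, specificity=0.999)
men_ppv = positive_predictive_value(prevalence=0.0, sensitivity=0.999, specificity=0.999)
print(f"expected false positives: {men_false_pos:.0f}, chance a positive is real: {men_ppv:.0%}")

# A hypothetical condition affecting 1 person in 100,000, with a 99%/99% test.
rare_ppv = positive_predictive_value(prevalence=1e-5, sensitivity=0.99, specificity=0.99)
print(f"rare condition: chance a positive is real is {rare_ppv:.2%}")
```

Under these assumptions, the rare-condition case yields roughly one true positive for every thousand positive results, which is why a confirmatory second test is needed.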