ACOUSTIC DETECTION Gunfire or Plastic Bag Popping? Trained Computer Can Tell the Difference

Published 15 December 2021

There have been 296 mass shootings in the United States this year, and 2021 is on pace to be America’s deadliest year of gun violence in the last two decades. Discerning between a dangerous audio event like a gun firing and a non-life-threatening event, such as a plastic bag bursting, can mean the difference between life and death. Engineering researchers have developed a gunshot detection algorithm and classification model to discern similar sounds.

According to the Gun Violence Archive, there have been 296 mass shootings in the United States this year. Sadly, 2021 is on pace to be America’s deadliest year of gun violence in the last two decades.

Discerning between a dangerous audio event like a gun firing and a non-life-threatening event, such as a plastic bag bursting, can mean the difference between life and death. It can also determine whether to deploy public safety workers. Humans, as well as computers, often confuse the sound of a plastic bag popping with a real gunshot.

Over the past few years, there has been some hesitation over implementing the well-known acoustic gunshot detection systems available, since they can be costly and are often unreliable.

In an experimental study, researchers from Florida Atlantic University’s College of Engineering and Computer Science focused on the reliability of these detection systems, specifically their false positive rate. The ability to correctly discern sounds, even in the subtlest of scenarios, is what separates a well-trained model from one that is not very efficient.

Faced with the daunting task of accounting for all sounds that resemble a gunshot, the researchers created a new dataset composed of audio recordings of plastic bag explosions collected across a variety of environments and conditions, varying factors such as plastic bag size and distance from the recording microphones. The audio clips ranged from 400 to 600 milliseconds in duration.

Researchers also developed a classification algorithm based on a convolutional neural network (CNN), as a baseline, to illustrate the relevance of this data collection effort. The data was then used, together with a gunshot sound dataset, to train a classification model based on a CNN to differentiate life-threatening gunshot events from non-life-threatening plastic bag explosion events. 

Results of the study, published in the journal Sensors, demonstrate how fake gunshot sounds can easily confuse a gunshot sound detection system. Seventy-five percent of the plastic bag pop sounds were misclassified as gunshot sounds. The deep learning-based classification model trained with a popular urban sound dataset containing gunshot sounds could not distinguish plastic bag pop sounds from gunshot sounds. However, once the plastic bag pop sounds were included in model training, the researchers found that the CNN classification model performed well in distinguishing actual gunshot sounds from plastic bag sounds.
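
To make the 75 percent figure concrete, the short sketch below shows how a false positive rate of this kind can be computed from a model's predictions. It is an illustration only, not the researchers' evaluation code: the function name, the label strings, and the eight-clip sample are all hypothetical, chosen so the arithmetic matches the reported proportion.

```python
def false_positive_rate(true_labels, predicted_labels, positive="gunshot"):
    """Return the fraction of non-gunshot clips that were labeled as gunshots."""
    # Keep only the predictions made for clips that are NOT real gunshots.
    negatives = [p for t, p in zip(true_labels, predicted_labels) if t != positive]
    if not negatives:
        raise ValueError("no non-gunshot clips to evaluate")
    # Count how many of those were wrongly flagged as gunshots.
    return sum(p == positive for p in negatives) / len(negatives)


# Hypothetical evaluation set: 8 plastic bag pops, 6 of them flagged as gunshots.
true_labels = ["plastic_bag"] * 8
predicted_labels = ["gunshot"] * 6 + ["plastic_bag"] * 2

rate = false_positive_rate(true_labels, predicted_labels)
print(f"{rate:.0%} of plastic bag pops were misclassified as gunshots")
```

A lower value after retraining with plastic bag recordings would correspond to the improvement the researchers describe.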

“As humans, we use additional sensory inputs and past experiences to identify sounds. Computers,