Sounding the Alarm: Exposing Audio Deepfakes
This compounded challenge has become known as the “liar’s dividend”: bad actors exploit the ambiguity by labeling authentic evidence as a deepfake to deny wrongdoing or evade accountability. Distinguishing real audio from fake has proven time-consuming, labor-intensive, and costly for law enforcement agencies.
As deepfakes become more rampant, a common misconception holds that a more advanced AI detector is all it will take to solve the problem. Traynor offers a cautionary note: while machine learning excels at identifying patterns it has encountered before, it struggles when confronted with something new. Because deepfake technology advances so rapidly, purely data-driven detectors consistently lag behind and remain largely ineffective against the evolving threat.
This is where experts like Traynor and his team at UF play a crucial role. Their expertise in designing state-of-the-art defenses against fake audio and various cybersecurity threats positions the university at the forefront of this rapidly evolving field.
________________________________________
“Our perception of reality, our creation and consumption of information, and our interpersonal connections may undergo profound transformations, but our unwavering pursuit of truth must endure.”
— Patrick Traynor
________________________________________
During a recent visit to the White House, Traynor discussed the growing threat of robocalls and deepfake voices as the election nears, shedding light on the strategies and technologies being developed to counter fake audio. He spoke with Anne Neuberger, U.S. deputy national security advisor; Jessica Rosenworcel, chairwoman of the Federal Communications Commission; Lina M. Khan, chair of the Federal Trade Commission; and representatives from major telecommunications companies, including AT&T, T-Mobile, and Verizon.
The research Traynor and UF’s Florida Institute for Cybersecurity team are conducting to develop robust defenses against deepfake technology is funded by the National Science Foundation and the Office of Naval Research. This interdisciplinary work analyzes deepfake voice technology alongside intricate aspects of human voice and speech, such as prosody (the varying emphasis on certain words that changes the meaning of a sentence) and the breathing patterns and turbulent airflow generated by speech. By recreating the vocal tract, the team aims to distinguish genuine human voices from deepfake audio more accurately.
To create more powerful tools for detecting deepfake audio, Traynor and his research team also borrow techniques from articulatory phonetics, applying fluid dynamics to model the human vocal tract during speech. In a test of more than 5,000 speech samples, the team demonstrated that deepfakes fail to reproduce the subtle but uniquely biological aspects of human-generated speech.
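To give a flavor of the acoustic-tube modeling that underlies vocal tract reconstruction, the sketch below estimates linear-prediction reflection coefficients from a speech frame and maps them to the relative cross-sectional areas of a concatenated lossless-tube vocal tract; a shape whose areas are wildly outside human proportions can then be flagged. This is a minimal textbook-style illustration, not the UF team’s actual pipeline: the function names, the sign convention for the area recursion, and the plausibility threshold are all assumptions made for the example.

```python
import numpy as np

def reflection_coefficients(frame, order=12):
    """Levinson-Durbin recursion: LPC reflection coefficients of one speech frame."""
    # Biased autocorrelation estimate of the (windowed) frame.
    r = np.correlate(frame, frame, mode="full")[len(frame) - 1:][:order + 1]
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    ks = np.zeros(order)
    for i in range(1, order + 1):
        acc = r[i] + a[1:i] @ r[i - 1:0:-1]
        k = -acc / err
        ks[i - 1] = k
        a_prev = a.copy()
        for j in range(1, i):          # update predictor coefficients
            a[j] = a_prev[j] + k * a_prev[i - j]
        a[i] = k
        err *= 1.0 - k * k             # prediction-error energy shrinks each step
    return ks

def tube_areas(ks, glottal_area=1.0):
    """Relative cross-sectional areas of a lossless-tube vocal tract model.

    Sign conventions differ across texts; this uses A[m+1] = A[m]*(1+k)/(1-k).
    """
    areas = [glottal_area]
    for k in ks:
        areas.append(areas[-1] * (1.0 + k) / (1.0 - k))
    return np.array(areas)

def looks_anatomically_implausible(areas, max_ratio=50.0):
    """Hypothetical screen: flag tract shapes whose area spread exceeds what a
    human vocal tract could plausibly produce (threshold is illustrative only)."""
    return bool(areas.max() / areas.min() > max_ratio)
```

In practice, `frame` would be a short windowed segment of a vowel sound; stable analysis keeps every reflection coefficient inside (-1, 1), which guarantees positive tube areas.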
By leveraging elements of speech that are inherently difficult for machine-learning models to replicate, Traynor and his team are building improved forensic detectors that identify deepfake audio with 99.5% accuracy. Using UF’s HiPerGator supercomputer, the team also successfully recreated the micro-scale effects of the turbulent airflow generated by genuine speech, going beyond machine-learning models that can simulate only macro-scale effects; this significantly enhances detector efficacy. The UF team’s cybersecurity research also looks at bolstering identity verification on smartphones, aiming to fortify what Traynor envisions as an entirely new frontier in voice technology defense.
“Consider this: if I receive a call showing that it is coming from the governor or the president on my device, my first instinct is to hang up, as I’d have no other way but to assume that somebody is attempting to trick me,” Traynor said. “So clearly, the concern lies not just in the ease of making a deepfake, but also in our inability to discern its origin.”
He added, “When technology pretty much lets us do whatever we want, it raises the question, ‘What do we trust?’ We must do a better job of safeguarding communication and preserving authenticity and truth.”
As the pace of innovation continues to accelerate, UF and its experts are determined to lead the charge.
“Our perception of reality, our creation and consumption of information, and our interpersonal connections may undergo profound transformations,” Traynor said. “But our unwavering pursuit of truth must endure.”
Helen Goh is Associate Director of Marketing and Communications at the University of Florida. This article was originally published on the University of Florida’s website.