HSNW conversation with Hirotaka Nakasone of the FBI
Voice recognition capabilities at the FBI -- from the 1960s to the present

Published 11 July 2012

Chris Archer, the online content editor at IDGA (the Institute for Defense & Government Advancement), talked with Hirotaka Nakasone, a senior scientist in the FBI’s Voice Recognition Program. Nakasone examines the use and effectiveness of current speaker authentication technologies at the FBI, highlights the challenges that are unique to voice recognition, and discusses the plans in place for capturing voice recordings in line with the FBI’s Next Generation Identification (NGI) project.

Chris Archer: Examine the use and effectiveness of current speaker authentication technologies at the FBI
Hirotaka Nakasone: The FBI’s use of speaker recognition technology dates back to the early 1960s.  A team of FBI special agents and technical support personnel began to develop a protocol for performing voice comparison examinations using the sound spectrograph.  This spectrographic technique was always used only as investigative guidance; it was never introduced in a court of law, owing to the inconclusive nature of the technology.  Concerned about the controversial nature of the technique and its inconsistent admissibility status in criminal proceedings, in 1976 the FBI commissioned the National Research Council of the National Academy of Sciences (NAS) to review and assess the status of spectrographic speaker recognition.  The results of the NAS study were published in 1979 under the title “On the Theory and Practice of Voice Identification.”  Following this NAS report, the FBI decided to continue its original policy on spectrographic voice identification, that is, to use it only as investigative guidance.  This practice prevailed for the next three decades.

In the late 1990s the FBI began developing automated speaker recognition technology through leading research groups sponsored by the FBI and other U.S. government agencies.  This effort was accelerated by the sponsorship of the Biometric Center of Excellence (BCOE) in 2007.  Currently the FBI offers forensic speaker recognition analysis services, using automated speaker recognition technology, to its field offices within the U.S. and abroad.  Here are some highlights of the FBI’s current speaker recognition technology.

  • Conducted within an FBI forensic unit in the Digital Evidence Laboratory that is accredited by the American Society of Crime Laboratory Directors/Laboratory Accreditation Board (ASCLD/LAB)
  • Conducted by fully trained examiners with technical and engineering backgrounds
  • Conducted by using multiple sets of advanced state-of-the-art speaker recognition algorithms
  • Conducted under standard operating procedures
  • Conducted only for investigative and intelligence purposes, not for courtroom purposes
  • The primary speaker recognition system is capable of conducting channel-independent and language-independent recognition under a certain set of forensic conditions with known, reasonably acceptable levels of accuracy

Archer: What biometric challenges are unique to voice recognition?
Nakasone: I want to address three challenges unique to voice recognition. Please note that these challenges were also recognized by the 2011 NSTC Biometric Challenge-Update.  Challenge #1 is the dynamic nature of human speech production, which changes constantly as a function of time; therefore it requires