February 24, 2003: Voice Analyst of Alleged Bin Laden Tapes Discusses Methodology in Interview

Popular Science magazine carries a rare interview with Tom Owen, a voice analyst who has worked on identifying Osama bin Laden in recordings allegedly released by the al-Qaeda leader. Owen worked for US media on the identification of bin Laden’s voice in a November 2002 recording (see November 12, 2002), assisted by a captain from the Saudi Interior Ministry’s forensics department whom he had apparently been teaching at the time. Owen, one of only eight forensic voice analysts certified by the American Board of Recorded Evidence, and other US experts identified the voice as bin Laden’s, although a Swiss facility disagreed (see November 29, 2002). The interview describes Owen’s lab and working methods, using the November recording as its starting point. Owen criticizes the Swiss analysis, saying that the advanced biometrics software the Swiss used cannot cope with the noise on the tape, as it is “designed to work with perfect samples.” Cleaning up the tape would not help, as this would remove the high and low frequencies a biometric system needs to make its identification.
Voice Identification Methodology – To identify voices, Owen uses a spectrograph to produce spectrograms, “a kind of graphic speech rendering that has changed little since the 1940s,” which are then compared. His favorite tool for analyses is a “piece of vintage equipment—a reel-to-reel Voice Identification 700 spectrograph built in 1973,” which “differs little from the analog machines US Army intelligence officers built to identify and track German radio operators during World War II.” When analyzing a new recording thought to be from bin Laden, Owen compares spectrograms made from it with spectrograms from a known bin Laden interview, such as the one bin Laden granted to ABC in 1998 (see May 28, 1998). According to the magazine, there are “only a half-dozen words in common between the November tape and the ABC interview,” well short of the 20 identical words, preferably spoken in the same order, that the standards of the American Board of Recorded Evidence demand.
Listening for ‘Quirky Mannerisms’ – However, Owen also listens for “the multitude of quirky mannerisms and pronunciation foibles peculiar to each voice,” because a trained ear can detect “the subtle whistle caused by a missing tooth, a person’s tendency to swallow in the middle of a sentence, even the way someone sets his or her jaw when speaking.” Owen plays the reporter what he calls a short-term memory tape, apparently a crucial tool in aural voice identifications. The spliced tape toggles between 2.5-second segments of bin Laden’s ABC interview and the November tape; Owen uses it to listen for peculiarities in a voice, especially when vowels are spoken. Owen, who considers bin Laden’s voice to be, in the magazine’s words, “plenty peculiar,” says the tape proves it is the “same guy” on the November tape and in the 1998 interview. However, the reporter comments: “To my untrained ear, it could be Darth Vader behind the static.… This is the sort of gray area that tends to make legal observers worry about the state of forensic science.”
Comments on NSA – According to the magazine, Owen’s technology is similar to that which the NSA probably uses to analyze voices, although Owen thinks the NSA has samples of bin Laden’s voice he does not. However, he does not think it has made biometric breakthroughs in analysis despite its advanced technology, which is “mostly devoted to listening.” [Popular Science, 2/24/2003]
