For fans of Apple's “supersmart virtual assistant Siri” and its “quick, witty replies... slightly snarky sense of humor,” there is a good retrospective of speech recognition technology from industry observer Melanie Pinola, who reaches as far back as Audrey.
Audrey who, you ask? Read on.
The first speech recognition systems could understand only digits, she says, noting that Bell Laboratories designed the Audrey system in 1952, which recognized digits spoken by a single voice. Not much, granted, but you have to start somewhere. And bear in mind that this is the era when a “computer” was a roomful of vacuum tubes that could do payroll.
Ten years later, IBM demonstrated its Shoebox machine at the 1962 World’s Fair; it could understand 16 words spoken in English. Again, not earthshaking stuff, but much research was being done around the world, setting the stage for the 1970s.
According to Pinola, that’s when speech recognition technology really took off. The U.S. Department of Defense got the ball rolling by funding the DARPA Speech Understanding Research program, which produced Carnegie Mellon's Harpy speech-understanding system. Harpy could understand 1,011 words and had “a more efficient search approach, called beam search, to prove the finite-state network of possible sentences.”
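For readers curious about what beam search means in practice, here is a minimal Python sketch. It is not Harpy's implementation: the toy network, its words, and its probabilities are all made up for illustration, and a real recognizer would score each step against the incoming audio rather than use fixed numbers. The idea it shows is simply that, at every step, only the few most probable partial sentences are kept and the rest are thrown away.

```python
import math

# Hypothetical toy finite-state network (an assumption for illustration):
# state -> list of (word, next_state, probability).
NETWORK = {
    "start": [("call", "verb", 0.6), ("tell", "verb", 0.4)],
    "verb": [("home", "end", 0.7), ("me", "end", 0.3)],
}


def beam_search(network, start, end, beam_width):
    """Return finished sentences as (words, probability), most probable first."""
    # Each hypothesis is (log_probability, words_so_far, current_state).
    beam = [(0.0, [], start)]
    finished = []
    while beam:
        candidates = []
        for logp, words, state in beam:
            for word, nxt, prob in network.get(state, []):
                hyp = (logp + math.log(prob), words + [word], nxt)
                if nxt == end:
                    finished.append(hyp)
                else:
                    candidates.append(hyp)
        # Prune: keep only the beam_width most probable partial sentences.
        candidates.sort(key=lambda h: h[0], reverse=True)
        beam = candidates[:beam_width]
    finished.sort(key=lambda h: h[0], reverse=True)
    return [(words, math.exp(logp)) for logp, words, _ in finished]
```

With a beam width of 1, the search never explores sentences starting with "tell", which is the trade-off: less work, at the risk of discarding a path that would have scored well later.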
In the 1980s things kept rolling along, and speech recognition machines could now understand thousands of words. Pinola does a good job describing the hidden Markov model, which, “rather than simply using templates for words and looking for sound patterns... considered the probability of unknown sounds' being words.”
This kicked off everything from useful medical applications to Worlds of Wonder's Julie doll in 1987 -- "Finally, the doll that understands you."
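The hidden Markov model idea mentioned above, scoring how probable it is that a run of sounds is a given word, can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the two word models ("yes" and "no"), their states, the sound labels, and every probability are invented here, and real systems work on acoustic features rather than tidy labels. The sketch uses the standard forward algorithm to compute each word's likelihood and picks the most probable word.

```python
def forward_likelihood(observations, start_p, trans_p, emit_p):
    """Probability that this word model produced the observed sounds."""
    states = list(start_p)
    # Initialise with the first observed sound.
    alpha = {s: start_p[s] * emit_p[s].get(observations[0], 0.0) for s in states}
    for obs in observations[1:]:
        # Sum over every way of arriving in state s, then emit the sound.
        alpha = {
            s: sum(alpha[prev] * trans_p[prev].get(s, 0.0) for prev in states)
            * emit_p[s].get(obs, 0.0)
            for s in states
        }
    return sum(alpha.values())


# Hypothetical word models: (start, transition, emission) probabilities.
MODELS = {
    "yes": (
        {"y": 1.0, "es": 0.0},
        {"y": {"y": 0.5, "es": 0.5}, "es": {"es": 1.0}},
        {"y": {"yuh": 0.8, "nnn": 0.2}, "es": {"ess": 0.9, "oh": 0.1}},
    ),
    "no": (
        {"n": 1.0, "o": 0.0},
        {"n": {"n": 0.5, "o": 0.5}, "o": {"o": 1.0}},
        {"n": {"nnn": 0.8, "yuh": 0.2}, "o": {"oh": 0.9, "ess": 0.1}},
    ),
}


def recognize(observations, models):
    """Pick the word whose model gives the sounds the highest likelihood."""
    return max(models, key=lambda w: forward_likelihood(observations, *models[w]))
```

Calling `recognize(["yuh", "ess"], MODELS)` picks "yes" because that model assigns the sounds a far higher probability than the "no" model does, even though neither model matches a fixed template.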
In the 1990s, speech recognition improved along with computer processors, becoming viable for the mass market for the first time with the first voice portal, VAL from BellSouth, in 1996. That was a bit of a mixed blessing; as Pinola says, it “paved the way for all the inaccurate voice-activated menus that would plague callers for the next 15 years and beyond.”
Of course, speech recognition technology has exploded since 2000, with the arrival of the Google Voice Search app for the iPhone one of the signal developments. And like Google's Voice Search, Pinola says, “Siri relies on cloud-based processing. It draws what it knows about you to generate a contextual reply, and it responds to your voice input with personality.”
The sky’s the limit with speech recognition. All we know for sure now is that we can’t tell what it will be like in five years, which is the fun of it all, of course.
David Sims is a contributing editor for TMCnet.
Edited by Juliana Kenny