July 27, 2012
International Computer Science Institute Partnering with Microsoft to Improve Voice Recognition Technology
By Jacqueline Lee
Voice recognition technology for services like hosted IVR has come a long way in the past few years. However, according to blogger B.K. Winstead, voice recognition falls short in understanding an aspect of speech called prosody.
For this reason, Microsoft is pairing up with the International Computer Science Institute (ICSI), an independent institute affiliated with the University of California at Berkeley, to begin research on improving the understanding of prosody by voice recognition interfaces.
Winstead compares prosody to the music of language. The dictionary defines prosody as “the stress and intonation patterns of an utterance.” In other words, prosody communicates the intent behind what we say, whether we choose to emphasize certain words or to adopt a certain inflection in our speech.
For example, almost anyone with an iPhone has made a sarcastic comment to Apple’s voice recognition client Siri at one time or another. However, because Siri understands so little of prosody, she often interprets the comments literally or does not know how to respond. Improving a voice recognition interface’s understanding of the subtleties of language will help it better recognize both the emotional state and the physical space of the user.
Microsoft Research personnel have worked on a project called Natural User Interface, which led to the gesture-based Kinect interface for Windows. “One of the big challenges that we're actually focusing on is to develop a common framework to a number of these types of capabilities,” notes Elizabeth Shriberg, a principal Microsoft scientist, “where prosodic cues are used to do something -- some task. We've started doing this already; it's been implemented in a prototype in a lab at Microsoft.”
Microsoft would give no information about its research timeline, but analysts like Winstead believe that Siri and Google Voice Search should watch out for the competition. “Step it up!” writes Winstead. “True voice management of computer systems could be all that much closer due to this ICSI/Microsoft partnership.”
Edited by Juliana Kenny