April 03, 2012
Siri IVR is Ready for Mandarin
Technology seems to be making the world smaller, and interactive voice recognition (IVR) technology is playing a significant role. By converting the spoken word in one language into written text in another, it is breaking down communication barriers.
According to a Business Week report, Apple is building excitement over its scheduled launch later this year of a new version of Siri, a smartphone app that will provide IVR capability in Mandarin Chinese. The technology, though imperfect, is impressive when one considers that Mandarin is a tonal language with 400 single-syllable sounds whose meanings are differentiated by intonation alone.
This means that the same sounds spoken with slightly different intonations are entirely different, unrelated words. Nuance Communications, based in Burlington, Massachusetts, has paved the way for Apple’s Mandarin IVR technology. Nuance made two Mandarin IVR apps available to consumers at no cost not long after their English counterparts were launched.
The first, a dictation app called Dragon, converts spoken words into text for emails, text messages and the social media services Facebook and Twitter. Dragon Search does the same for Internet searches. Within just a few months, Nuance made Taiwanese and Cantonese versions available as well. To use Dragon, speakers tap a virtual start and stop button to record.
After starting a recording session, speakers dictate, and their words are sent to a server, where the speech is transcribed into Mandarin text and sent back to the phone. Dragon appears to handle simple conversation well, but when given more challenging input (e.g., sentences containing several words with the same sound but different intonations), it is less dependable.
The IVR technology captured the correct sound but lacked the ability to discern which word would produce a sensible sentence. Where a sound had several possible readings, Dragon recognized the sound correctly but transcribed it as a word that made the sentence nonsensical. The good news is that Dragon’s IVR technology is designed to ‘learn’. In theory, because all transcriptions occur on the server, a knowledge base will accumulate over time. Essentially, Dragon will ‘learn’ the nuances of language and become increasingly proficient at constructing ideas out of words.
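The failure and the hoped-for improvement described above can be illustrated with a toy sketch. It is not how Dragon or Siri works internally; it only assumes that a recognizer has a list of candidate words for each sound and that it counts which words follow which in transcripts it has already seen. The class name `HomophonePicker`, the tiny candidate table, and the sample sentence are all hypothetical and exist only for this illustration.

```python
from collections import Counter


class HomophonePicker:
    """Toy model: pick a word for each sound using counts from past transcripts."""

    def __init__(self, candidates):
        self.candidates = candidates   # sound -> list of possible words
        self.unigrams = Counter()      # word -> times seen
        self.bigrams = Counter()       # (previous word, word) -> times seen

    def learn(self, words):
        """Accumulate counts from one confirmed transcript (a list of words)."""
        self.unigrams.update(words)
        self.bigrams.update(zip(words, words[1:]))

    def transcribe(self, sounds):
        """Choose, for each sound, the candidate seen most often after the previous word."""
        result = []
        prev = None
        for sound in sounds:
            options = self.candidates.get(sound, [sound])
            # Ties (including the "nothing learned yet" case) fall back to the first candidate.
            best = max(options, key=lambda w: (self.bigrams[(prev, w)], self.unigrams[w]))
            result.append(best)
            prev = best
        return result


# Hypothetical candidate table, keyed by syllable with tone ignored.
CANDIDATES = {
    "wo": ["我"],                 # I
    "xiang": ["想", "像", "向"],  # want / resemble / toward
    "mai": ["买", "卖"],          # buy (third tone) / sell (fourth tone)
    "ma": ["妈", "马", "吗"],     # mother / horse / question particle
}

picker = HomophonePicker(CANDIDATES)
sounds = ["wo", "xiang", "mai", "ma"]

before = picker.transcribe(sounds)       # no knowledge yet: first candidate for each sound
picker.learn(["我", "想", "买", "马"])    # one archived transcript: "I want to buy a horse"
after = picker.transcribe(sounds)

print("".join(before))
print("".join(after))
```

With no accumulated data the sketch picks the first candidate for every sound and produces a sentence about buying a mother; after a single archived transcript it prefers the word for "horse". A real system would rely on acoustic tone information and far larger statistical models, so this shows only the general idea of context resolving an ambiguous sound.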
Dragon’s makers also say that the technology can detect and allow for differences in accent. Because each streamed voice recording is archived and available for analysis, the technology promises to be continually refined and improved. For the present, the ultimate test for IVR-based apps seems to be how well they can speak Mandarin.
Edited by Juliana Kenny