There is a mystifying and weird 2001: A Space Odyssey-like feel about the concept, use and even the phrase “natural language” speech recognition. Because the technology is aimed at mimicking the way humans talk, it lulls you into a human-like interaction with the virtual androids on the other end of the line, like Amtrak’s Julie, with the intent that you will use “her” and her ilk instead of zeroing out to their flesh-and-blood comrades.
Jeff Foley is a senior marketing manager at Nuance. An MIT engineer-turned-marketer, he demystified natural language for TMCnet:
“If you’re considering an upgrade to your phone self-service, chances are you’re looking for the facts about ‘natural language,’” said Foley. “It’s become one of those industry buzzwords -- much like ‘cloud computing’ -- which vaguely refers to many related technologies. There’s a misconception that natural language technology -- or its cousins, natural language processing and natural language understanding -- enables a caller to say almost anything to a speech recognition system. With all the confusion surrounding natural language, many don’t realize how it helps their self-service system. The truth is that natural language technology can involve one of several approaches, each effective for different tasks.”
To begin unveiling natural language, Foley said it’s best to first appreciate how automatic speech recognition (ASR) systems work.
An ASR system, he explained, understands humans the same way you might understand a foreign language if you were traveling. If you hear words that you recognize from a list in your trusty phrase book, then you can understand what people are saying -- as long as they stick to phrases on your list. In speech recognition, this list of understood words and phrases is called a grammar.
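The phrase-book idea can be sketched in a few lines of Python. This is only an illustration of list-based matching, not how any commercial recognizer is built; the `GRAMMAR` contents and the `recognize` function are hypothetical, and the sketch works on text, skipping the audio-to-text step entirely.

```python
from typing import Optional

# Hypothetical grammar: the fixed list of phrases the system will accept.
GRAMMAR = {"pay a bill", "check my balance", "speak to an agent"}

def recognize(utterance: str) -> Optional[str]:
    """Return the matched phrase, or None when the utterance is out of grammar."""
    # Lowercase and collapse whitespace so trivial differences do not matter.
    normalized = " ".join(utterance.lower().split())
    return normalized if normalized in GRAMMAR else None

print(recognize("Pay a bill"))
print(recognize("Umm, I'd like to pay a bill please"))
```

The first call matches because the phrase is on the list; the second returns `None` even though a human would understand it immediately.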
And it works quite well. A 98 percent accuracy rate is considered (by speech professionals) very good for a deployed speech recognition system. However, to say “the speech recognition system is 98 percent accurate” is to measure in-grammar accuracy -- without counting when a speaker says something that’s not on the list.
These “out-of-grammar” errors, Foley points out, are three to five times more likely than misrecognition to be the cause of a system rejecting a caller’s response. In fact, the average rate at which callers say things that aren’t on the list is about 16 percent. In some cases, it’s as high as 30 percent.
“In other words, the perceived accuracy of those systems -- the accuracy that your callers experience -- is only 70 percent,” reported Foley.
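The arithmetic behind that figure can be approximated as follows. The sketch assumes, as a simplification not stated by Foley, that every out-of-grammar response is rejected and that in-grammar responses are recognized at the quoted in-grammar rate; the function name is hypothetical.

```python
def perceived_accuracy(in_grammar_accuracy: float, out_of_grammar_rate: float) -> float:
    """Share of all caller responses that are both in grammar and recognized."""
    return in_grammar_accuracy * (1.0 - out_of_grammar_rate)

# 98 percent in-grammar accuracy at the average and worst-case out-of-grammar rates.
for oog_rate in (0.16, 0.30):
    print(f"{oog_rate:.0%} out of grammar -> {perceived_accuracy(0.98, oog_rate):.1%} perceived")
```

Under these assumptions the 30 percent case comes out a little under 70 percent, and the 16 percent average comes out a little over 82 percent.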
One way to address out-of-grammar issues, he suggests, is a bigger phrase book: make the list more comprehensive by putting in more guesses as to what speakers might say. This works… but only to a degree. It may even provide the illusion of natural language. Faking natural language with larger grammars, however, can lower the accuracy and speed of an ASR system.
True natural language circumvents the need to anticipate everything a caller might say, Foley pointed out. Without natural language, a confused machine may reject a caller’s perfectly reasonable answer (“Umm, I’d like to pay a bill please, ma’am”) because it wasn’t on the list of hard-coded responses.
“Natural language technology helps a machine better understand a human’s words because it can recognize a wider variety of responses, even if it’s never heard them before,” explained the Nuance manager. “The machine studies examples of what people might say and creates statistical models that help it understand the caller’s intent (‘take me to the payments menu’) without having to manually predict each variation. It’s the difference between traveling with a phrase book and being fluent in the local language.”
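One simple way to picture “studying examples and creating statistical models” is a toy word-count classifier. The sketch below uses naive Bayes with add-one smoothing purely as a stand-in; the article does not say which models Nuance uses, and the training sentences, intent names and function names are all hypothetical.

```python
import math
from collections import Counter, defaultdict

# Hypothetical tagged examples: (what a caller said, intent label).
EXAMPLES = [
    ("i want to pay my bill", "payments"),
    ("i have a question about my bill", "payments"),
    ("make a payment please", "payments"),
    ("i am moving and need to change my address", "account_update"),
    ("update my mailing address", "account_update"),
    ("change the phone number on my account", "account_update"),
]

def train(examples):
    """Count how often each word appears under each intent."""
    word_counts = defaultdict(Counter)
    intent_counts = Counter()
    vocabulary = set()
    for text, intent in examples:
        words = text.split()
        word_counts[intent].update(words)
        intent_counts[intent] += 1
        vocabulary.update(words)
    return word_counts, intent_counts, vocabulary

def classify(text, model):
    """Pick the intent with the highest naive Bayes log-probability."""
    word_counts, intent_counts, vocabulary = model
    total_examples = sum(intent_counts.values())
    best_intent, best_score = None, float("-inf")
    for intent, count in intent_counts.items():
        score = math.log(count / total_examples)
        denominator = sum(word_counts[intent].values()) + len(vocabulary)
        for word in text.lower().split():
            if word in vocabulary:  # words never seen in training are ignored
                score += math.log((word_counts[intent][word] + 1) / denominator)
        if score > best_score:
            best_intent, best_score = intent, score
    return best_intent

model = train(EXAMPLES)
print(classify("umm yeah i need to pay a bill", model))
print(classify("i just moved so my address is different", model))
```

Neither test sentence appears in the training list, yet each is assigned an intent from the words it shares with the examples, which is the property the phrase-book approach lacks.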
Yet is all “natural language” the same?
Not so. There are actually several techniques and technologies that are associated with the term “natural language.”
One of their principal uses is routing calls effectively to the right self-service option or agent, Foley pointed out. Instead of forcing a caller through a self-service menu maze, solutions such as Nuance’s Call Steering greet the caller with an open-ended question, like: “In a few words, tell me how I can help you today.” Your caller can respond with “Yeah, ummm, I have a question about my bill” or perhaps “I’m moving and I need to change my address.” Call Steering sends the caller directly to the appropriate self-service option or live agent based on the request. When prompted this way, callers can voice their actual need rather than guessing at which menu tree to explore.
The real value of natural language isn’t limited to routing calls, noted Foley. The technology can also be combined with ASR to increase recognition of multiple-choice questions by automatically ignoring filler words and phrases (“umm,” “please,” “I’d like”) without the expensive, time-consuming process of explicitly specifying them in the grammar.
“Nuance’s SmartListener technology incorporates years of speech and language data to recognize many out-of-grammar responses,” reported Foley. “That’s extremely useful because multiple choice menus and ‘yes/no’-type dialogues make up about 70 percent of the questions in a typical automated system.”
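The effect of ignoring filler words can be imitated with a hand-written list, as in the sketch below. This is not how SmartListener works, since Foley describes it as driven by speech and language data rather than a fixed list; the `FILLERS` and `GRAMMAR` contents and both function names are assumptions made for illustration.

```python
import re
from typing import Optional

# Hypothetical multiple-choice grammar and filler list.
GRAMMAR = {"pay a bill", "check my balance", "yes", "no"}
FILLERS = ["i would like to", "i'd like to", "please", "ma'am", "umm", "um", "uh"]

def strip_fillers(utterance: str) -> str:
    """Remove filler words and phrases, leaving the content words."""
    text = utterance.lower().replace("’", "'")
    text = re.sub(r"[^a-z' ]", " ", text)  # drop punctuation and digits
    # Remove longer fillers first so "i'd like to" goes before shorter ones.
    for filler in sorted(FILLERS, key=len, reverse=True):
        text = re.sub(r"\b" + re.escape(filler) + r"\b", " ", text)
    return " ".join(text.split())

def recognize(utterance: str) -> Optional[str]:
    """Match against the grammar after fillers have been stripped."""
    cleaned = strip_fillers(utterance)
    return cleaned if cleaned in GRAMMAR else None

print(recognize("Umm, I'd like to pay a bill please, ma'am"))
print(recognize("Uh, no"))
```

The grammar itself stays small, which is the point Foley makes: the fillers do not have to be spelled out in every menu's list of accepted phrases.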
Self-service system developers sometimes use specialized building blocks of speech that often incorporate natural language techniques. These create more intuitive speech-enabled dialogs that improve automation. Techniques such as one-step correction (“no, that’s four three four five”) and multi-slot recognition (“7 o’clock on Tuesday”) correctly anticipate and interpret the extra information callers often provide. These techniques are almost invisible to callers.
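Both techniques can be illustrated on text with a couple of regular expressions. This is a rough sketch under narrow assumptions (times phrased as “N o’clock”, corrections that start with “no”), and the pattern, dictionary and function names are hypothetical rather than taken from any Nuance product.

```python
import re

DAYS = "monday|tuesday|wednesday|thursday|friday|saturday|sunday"
# One pattern captures two slots at once: the hour and, optionally, the day.
SLOT_PATTERN = re.compile(
    r"(?P<hour>\d{1,2})\s*o'clock(?:\s+on\s+(?P<day>" + DAYS + r"))?"
)
DIGITS = {"zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3", "four": "4",
          "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9"}

def fill_slots(utterance: str) -> dict:
    """Multi-slot recognition: return every slot found in a single response."""
    match = SLOT_PATTERN.search(utterance.lower().replace("’", "'"))
    if match is None:
        return {}
    return {name: value for name, value in match.groupdict().items() if value is not None}

def apply_correction(utterance: str, previous: str) -> str:
    """One-step correction: 'no, that's ...' replaces the previous value in one turn."""
    words = re.findall(r"[a-z']+", utterance.lower())
    if words[:1] == ["no"]:
        digits = "".join(DIGITS[word] for word in words if word in DIGITS)
        if digits:
            return digits
    return previous

print(fill_slots("7 o'clock on Tuesday"))
print(apply_correction("no, that's four three four five", previous="4355"))
```

In the first call the caller volunteers the day without being asked, and the system keeps it instead of prompting again; in the second, the denial and the new value arrive in the same sentence.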
“All they know is they said something and the system recognized it,” explained Foley. “Since callers stick with the automated system, contact center managers see containment rates rise.”
Who uses natural language, and why? From a business standpoint, a successful natural language deployment means fewer upfront agent requests and caller hang-ups, fewer misrouted calls, and better overall automation, says Foley. These systems have higher caller satisfaction ratings thanks to fewer retries and less wading through layers of nested menus.
The operational savings are quite significant. For instance, one contact center saw 10 percent more people use the self-service system’s main menu without needing live support. That alone saved $2 million/year. Another company saw its system’s out-of-grammar rate drop from 12.1 percent to 10.6 percent. If it is taking a million calls a month, and spending $5 to take each call with a live agent, then it is saving $900,000 a year, reports Foley.
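The second figure can be reproduced with simple arithmetic. The sketch assumes, as the example implies, that every response no longer rejected as out of grammar is one call that avoids a live agent; the function and parameter names are hypothetical.

```python
def annual_savings(calls_per_month: int, cost_per_agent_call: float,
                   oog_rate_before: float, oog_rate_after: float) -> float:
    """Yearly savings from calls that no longer fall out of grammar."""
    calls_avoided_per_month = calls_per_month * (oog_rate_before - oog_rate_after)
    return calls_avoided_per_month * cost_per_agent_call * 12

savings = annual_savings(1_000_000, 5.0, 0.121, 0.106)
print(f"${savings:,.0f} per year")
```

A drop of 1.5 percentage points on a million calls a month is 15,000 calls, or $75,000 a month at $5 per call, which gives the $900,000 annual figure.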
Natural language systems are improving self-service automation in many industries: retail banks, brokerages and credit card issuers; telecom companies and “triple play” cable companies; and utilities. They are not limited to English; natural language systems exist in Spanish, French, German, Swedish and even Turkish and Finnish.
So why isn’t everyone using natural language everywhere? Even with all of the benefits, natural language systems are not always appropriate, Foley points out. Capturing, transcribing, and tagging sample data to train the first natural language systems used to take up a lot of resources. Sometimes, contact centers aren’t sure how to measure success or justify the added investment. The problems solved by natural language vary for each system. Therefore, identifying key performance indicators (KPIs) and collecting baseline values for them before deployment is crucial to validating performance.
Fortunately, natural language today is easier and cheaper to deploy, reported Foley. Nuance’s researchers have found ways to decrease the amount of data required to initially deploy natural language, dramatically reducing upfront costs. The firm’s professional services team has developed tools to automatically employ design best practices and decrease time to market. Its business consulting team can identify applications that would benefit from natural language, and will measure the KPIs that matter to your organization so you can demonstrate the benefits of natural language.
“The goal of every contact center’s self-service system is to deliver the best possible experience while also delivering the highest possible automation,” said Foley. “Natural language technology realizes these goals by improving the way computers understand humans.
“As natural language systems become even easier and cheaper to deploy, expect to see more and more contact centers take advantage of the technology to make it easier for customers to get the customer service they expect.”
Brendan B. Read is TMCnet’s Senior Contributing Editor. Edited by Tammy Wolf