November 07, 2012
Google: More Data Is Better When It Comes to Voice Applications
While the management of big data is a current trend, so too are natural language and speech recognition, and you really can’t have the latter without the former. Speech technologies simply have a great deal of data to manage, and in light of this, a new research paper from Google highlights the importance of big data in creating consumer-friendly services such as voice search on smartphones. The research reasons that more data helps train smarter models, which in turn can better predict what someone will say next; in other words, more reliable voice search that lets you keep your eyes on the road.
The new Google research paper details the company’s speech recognition applications, such as voice search and adding captions or tags to YouTube videos, and how these technologies generally work. While the paper is very technical and may be beyond most readers’ expertise, it is still worth reading to understand how “big data” can combine with voice technology to make our lives easier in the future.
The gist of the research is that more data makes for better algorithms, and speech applications are no exception. Here, more data takes the form of more humans speaking, or “chattering.”
“For many years researchers knew the theoretical process for building speech-recognition systems, but they had no idea how to get enough human chatter, or enough computing power, to actually do it,” writes Slate’s Farhad Manjoo.
“Then came Google. It turns out that the very same infrastructure that Google needed to build a fantastic search engine—acres and acres of data centers to store and analyze websites, and a range of internal processes that are specifically tuned to managing large amounts of information—would also be effective for solving speech recognition and other artificial intelligence problems. The trickiest part, of course, is knowing which is the best data to use to move the model forward instead of holding it back with unnecessary baggage,” adds Manjoo, who interviewed Mike Cohen, the head of Google’s speech system, to better understand Google’s research.
Edited by Allison Boccamazzo