Q&A with i6net and Vestec on Asterisk-platform ASR
November 02, 2010
Asterisk-based open platforms are entering the mainstream as a lower-cost, more flexible and obsolescence-resistant alternative to traditional proprietary software for routing. This last feature is especially salient given the mergers and acquisitions among business communications suppliers.
Asterisk is now also being adopted and deployed for automated speech recognition (ASR) with new, more intuitive and robust solutions.
Earlier this year i6net, which specializes in sophisticated VoiceXML browsers for speech applications, and Vestec, whose forte is devising robust, affordable speech recognition engines, partnered to integrate their products for the Asterisk telephony platform. The combination of the i6net VXI* VoiceXML browser and the Vestec speech engine, both firms said, “significantly reduces the cost, time, and difficulty of implementing sophisticated speech recognition solutions with Asterisk.”
TMCnet recently interviewed Ivan Sixto, co-founder and president of i6net, and Fakhri Karray, co-founder and president of Vestec, about Asterisk and their Asterisk-based ASR solution.
TMCnet: What are the benefits of the Asterisk open platform and what types of organizations benefit the most from it?
IS: Asterisk is a powerful platform that is both an open source toolkit for telephony applications and a full-featured call-processing server. It can be a standalone system or used as an adjunct to a previously existing PBX or Voice-Over-IP (VoIP) implementation. It can be software only, moving calls around via IP, or it can have a variety of hardware interfaces to tie in with existing Time Division Multiplexing (TDM) equipment.
There are millions of Asterisk servers running telephony services around the world, [sustained by] a community of thousands of developers. The platform itself is very flexible, customizable and scalable. There are many options for building cloud servers, from hosted deployments to clusters of many servers that support large call capacity. Most organizations can get substantially more benefit from Asterisk for their telephony needs, as they already use other major open source software such as Firefox (web browser), Apache (web server), Linux (OS) and Android (mobile phone OS).
FK: The most important benefits of Asterisk are its affordability, robustness, scalability and feature richness. With Asterisk, a wide variety of companies now have the ability to deploy a high-quality, standards-based telephony platform at a fraction of the cost of proprietary systems. Maintenance is considerably easier and cheaper as well, on account of Asterisk’s open source nature. And Asterisk’s inherent flexibility and feature richness allow it to be configured quickly according to a company’s evolving needs.
TMCnet: What speech recognition systems were available on the Asterisk platform prior to the i6net VXI*/Vestec solution and what were their benefits and downsides?
IS: Speech recognition is an important element in providing advanced voice services over telephony networks, yet the cost of speech recognition software and the difficulty of developing and integrating speech applications have been a major bottleneck for customers. The VXI*/Vestec solution addresses this bottleneck by providing a low-cost, standards-based mechanism for developing and deploying speech recognition solutions in VoiceXML.
The VXI*/Vestec solution has two integrated components for use with Asterisk:
(a) Vestec speech recognition engine
(b) i6net’s VoiceXML browser (i.e. VXI*)
Vestec’s speech recognition engine dramatically lowers the cost of speech recognition software for a wide variety of sophisticated speech applications. It is priced at under $100 per speech port, which is less than 40 percent of the price of the previously lowest-priced engine for use with Asterisk. VXI* leverages the Vestec speech recognition engine by providing a low-cost, robust VoiceXML base for developing voice applications without the conventional need for coding inside the telephony platform.
VoiceXML is like HTML. It is not hosted inside the IVR but on a web server, and it can work with any complementary web programming language, such as Java, JSP, PHP, ASP and Perl.
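As an illustration of the point that a VoiceXML document is generated and served much like a web page, here is a minimal sketch in Python. It is not i6net's or Vestec's code: the function name `build_menu`, the prompt text and the target URLs are hypothetical, and the only VoiceXML assumed is the standard `vxml`, `menu`, `prompt` and `choice` elements from the W3C VoiceXML 2.0 specification.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the W3C VoiceXML 2.0 specification.
VXML_NS = "http://www.w3.org/2001/vxml"


def build_menu(prompt_text, choices):
    """Return a VoiceXML 2.0 menu document as a string.

    `choices` maps a spoken phrase to the URL of the next document.
    Both the function and its arguments are hypothetical examples.
    """
    # Serialize the VoiceXML namespace as the default (unprefixed) one.
    ET.register_namespace("", VXML_NS)
    root = ET.Element(f"{{{VXML_NS}}}vxml", {"version": "2.0"})
    menu = ET.SubElement(root, f"{{{VXML_NS}}}menu", {"id": "main"})

    # The prompt is what the caller hears (via TTS or a recording).
    prompt = ET.SubElement(menu, f"{{{VXML_NS}}}prompt")
    prompt.text = prompt_text

    # Each choice pairs a phrase to recognize with the next page to fetch.
    for phrase, url in choices.items():
        choice = ET.SubElement(menu, f"{{{VXML_NS}}}choice", {"next": url})
        choice.text = phrase

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # A web application would return this string as the HTTP response body.
    print(build_menu(
        "Say sales or support.",
        {"sales": "sales.vxml", "support": "support.vxml"},
    ))
```

In a real deployment a web framework would return this string in response to the browser's request, in the same way it would return an HTML page; how the document is fetched and which grammars the recognizer accepts depend on the VoiceXML browser and speech engine in use.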
The entire VoiceXML base is now accessible along with the Vestec speech recognition engine as a package instead of being an internal PBX function to launch from the Asterisk dialplan. And because it is based on an industry-standards approach, it also protects the customer's investment in speech application development and gives them the additional flexibility to run their voice applications from other platforms as and when necessary.
FK: Vestec’s goal is to help popularize and commoditize speech recognition by providing robust, affordable, standards-based speech recognition software. We strongly believe that speech recognition has historically been mispriced and mismarketed on account of its premium pricing, air of exclusivity, and need for third-party professional services. As a result, only a small portion of the potential addressable market for speech (typically large firms belonging to the enterprise sector) has been able to deploy sophisticated speech applications.
The Vestec speech recognition engine makes speech software truly affordable for the first time to the vast majority of the addressable speech market. This historically underserved portion consists of the SME (small and medium-sized enterprise) and SMB (small and medium-sized business) segments. Not only is the one-time licensing fee for the speech engine a fraction of that of conventional products, but the optional annual maintenance fee is also priced at a fraction of that of conventional products. Furthermore, the speech engine is standards based, including in its support of grammar writing formats, thereby enabling more cost-efficient deployments of existing speech applications as well as migration from more expensive speech recognition engines. No wonder the Vestec speech engine has been recognized as a Top-25 VoIP Advance for 2009.
As speech applications are increasingly becoming standardized on VoiceXML, we were looking for a partner who shared our vision of popularizing and commoditizing speech recognition and could contribute a robust VoiceXML browser for use with our speech engine. i6net fit the bill, as its VXI* product is attractively priced and significantly simplifies the development process for sophisticated speech apps with Asterisk.
TMCnet: Outline the key features of the VXI*/Vestec solution. What types of speech applications is the solution ideally suited for?
IS: Thanks to the W3C VoiceXML standards, the VXI*/Vestec package can literally run any type of IVR service involving dynamic speech recognition. While there’s no specific speech application that it is best suited for, some examples of potential speech applications are directory assistance, menu routing, hands-free dialing, and phone surveys.
It is important to note that most IVR services are implemented with VoiceXML, and that this trend is expected to intensify with time. According to industry analysts, 95 percent of IVR ports shipped in 2013 will support VoiceXML, compared with less than 75 percent today. Clearly, the need for a robust, affordable VoiceXML-based speech recognition package has never been greater.
FK: The key attributes of the VXI*/Vestec solution are its ease of use, robustness, and affordability. The Vestec speech recognition engine is standards-based software, while VXI* is a W3C standards-compliant VoiceXML browser. As such, there is no new learning curve involved for developers. Both products are also fully scalable and can handle a large number of speech ports. Finally, the VXI*/Vestec package has been designed to significantly lower speech solution costs and without a doubt offers the best deal around for enabling IVR services.
TMCnet: Compare the VXI*/Vestec solution with peer VoiceXML browsers and ASR products. What does it offer that the others do not?
IS: i6net’s VXI* is a VoiceXML browser and complements third-party speech recognition software. Compared to a number of VoiceXML browsers, it is mature, comprehensive, and affordable. Our first release was published in 2006 and today we are shipping our fifth release. It supports TTS, ASR and SIV and can be used with both voice and video phone applications. VXI* is also considerably cheaper than its peers. Generally speaking, companies can expect to save 50 to 75 percent of VoiceXML software base costs with VXI*.
As a VoiceXML browser, i6net’s VXI* supports a variety of third-party speech recognition engines such as Nuance, Loquendo, Telisma, and Lumenvox. The advantage of Vestec ASR is its value: high recognition accuracy at an affordable price.
FK: We are impressed with the VXI* VoiceXML browser’s ease of use, scalability, feature richness and low price. It reduces the complexity and lowers the cost of developing VoiceXML-based speech solutions with Asterisk and other platforms.
Regarding the Vestec speech recognition engine, it significantly lowers the cost of deploying speech recognition. We offer recognition accuracy that is among the highest in the industry, support industry-standard grammar writing formats, and are priced at a fraction of the cost of our competitors. To be clear, customers can enjoy significant cost savings not only in initial software licensing costs but also in (optional) annual software maintenance costs.
TMCnet: You mentioned that cost has long been a key issue with speech recognition. Do you have any predictions as to the impact of your new solution on speech recognition adoption?
IS: Without a doubt, cost is the number one issue for most companies when it comes to speech applications. And the integrated VXI*/Vestec package is squarely designed to cater to customers’ cost concerns about VoiceXML-based speech recognition solutions.
We have been seeing an uptick in orders from customers since the launch of the VXI*/Vestec package. Existing customers are finding it more affordable to expand their installed speech base, while new customers view the package as a means to meet their budgets.
FK: Deployment and maintenance costs have historically been the biggest obstacle to widespread acceptance of speech solutions. That is why speech recognition has generally been limited to the top tier of large enterprise firms. For the most part, they alone could afford high software costs, high platform costs, and high third-party development service costs.
We believe cost concerns among firms are even more pronounced nowadays in the wake of the unprecedented financial crisis. IT budgets have been significantly cut across the board, and on account of the slowdown in the economy, there is a special focus among management on prioritizing projects according to ROI (Return on Investment) and time to market.
Given this sentiment about deployment and maintenance costs, we expect the value of the VXI*/Vestec solution to play well with customers in the market for speech recognition solutions.
Brendan B. Read is TMCnet’s Senior Contributing Editor.
Edited by Jaclyn Allard