Voicebox Accelerates Voice Recognition Innovations with Databricks’ Unified Analytics Platform

Databricks Simplifies Data Engineering Processes, Shortens Iterative Cycles for Core Technologies

March 26, 2018

San Francisco, Calif., March 26, 2018 – Databricks, provider of the leading Unified Analytics Platform powered by Apache Spark™, today announced that Voicebox, the voice technology supplier for the automotive, mobile, home and IoT markets, has selected the Databricks Unified Analytics Platform to reduce the time to market of its voice recognition technology. Since Voicebox incorporated Databricks’ Unified Analytics Platform in 2016, its natural language technology has been used to process over a billion spoken statements per month.

Register to hear more about Voicebox’s experience with the Databricks Unified Analytics Platform in a webinar taking place Wednesday, March 28 at 10:00 am Pacific Time: Data Contributions for a Conversational AI Platform.

Voicebox is the leader in conversational Artificial Intelligence (AI) development, building award-winning voice applications for over fifteen years. The company leverages patented context management algorithms to model human conversation to go beyond the current one-question, one-answer paradigm. The technology includes components for automated speech recognition (ASR), natural language understanding (NLU), and text-to-speech (TTS). Voicebox also supplies tools for authoring conversational domains and a set of domain showcases that use code and UX examples to demonstrate how to leverage advanced ASR and NLU capabilities to build exceptional conversational assistants.

For example, when a Voicebox customer says, “Play the latest song from Coldplay,” the ASR component must be able to map the spoken phonemes to the word “Coldplay”, while the NLU capabilities map “Coldplay” to an appropriate database query or ID value. Maintaining this capability in real time becomes very complex when the data is always changing, such as when new artists, albums, and songs are released. Customers don’t like waiting for lengthy updates to the company’s machine-learned ASR and NLU models. Prior to Databricks, Voicebox’s entity extraction pipeline used a custom in-house solution for cleaning the data, training models, and delivering them to customers. This in-house solution was difficult to maintain and often error-prone.
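The NLU lookup step described above can be illustrated with a minimal Python sketch. It is not Voicebox's implementation: the catalog contents, the ID values, and the function names `normalize`, `resolve_artist`, and `add_artist` are all hypothetical, and a production system would use learned models rather than an exact-match dictionary.

```python
from typing import Dict, Optional

# Hypothetical catalog mapping a normalized artist name to a catalog ID.
# The IDs are invented for illustration only.
ARTIST_CATALOG: Dict[str, str] = {
    "coldplay": "artist-0001",
    "radiohead": "artist-0002",
}


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so lookups ignore casing and spacing."""
    return " ".join(name.lower().split())


def resolve_artist(name: str, catalog: Dict[str, str]) -> Optional[str]:
    """Return the catalog ID for a recognized artist name, or None if unknown."""
    return catalog.get(normalize(name))


def add_artist(name: str, artist_id: str, catalog: Dict[str, str]) -> None:
    """Register a newly released artist so later lookups can resolve it."""
    catalog[normalize(name)] = artist_id
```

An artist missing from the catalog resolves to `None` until `add_artist` is called, which is the small-scale version of the update problem the paragraph describes: the lookup data has to be refreshed whenever new artists, albums, or songs appear.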

With the Unified Analytics Platform, Voicebox is able to build, schedule, and run automated production data pipelines that keep its ASR and NLU deep learning systems up to date. The company’s use of Databricks to build a performant data processing pipeline also enables Voicebox to measure more precisely the latency of each intermediate step before the end user sees results. As a result, Voicebox is able to operate at higher peak traffic volumes without sacrificing latency. Voicebox’s cloud platform captures anonymized audio recordings, then uses Databricks’ unified analytics in its crowdsourcing pipeline, which uses external and internal crowds to evaluate accuracy. Thanks to CIP (Continuous Improvement Processes) and Databricks, these efforts have increased end-customer-facing ROI metrics such as ASR accuracy and end-to-end accuracy by between 14 percent and 24 percent since the program began; NLU accuracy has increased by over 3 percent.
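As a rough illustration of measuring the latency of each intermediate step, the sketch below times every stage of a small pipeline separately. It is an assumption-laden example, not the Voicebox or Databricks implementation: `run_with_timings`, the stage names, and the placeholder stage functions are hypothetical.

```python
import time
from typing import Any, Callable, Dict, List, Tuple

# A stage is a (name, function) pair; each function takes the previous output.
Stage = Tuple[str, Callable[[Any], Any]]


def run_with_timings(stages: List[Stage], payload: Any) -> Tuple[Any, Dict[str, float]]:
    """Run stages in order and record the wall-clock seconds spent in each one."""
    timings: Dict[str, float] = {}
    for name, fn in stages:
        start = time.perf_counter()
        payload = fn(payload)
        timings[name] = time.perf_counter() - start
    return payload, timings


# Placeholder stages standing in for real ASR and NLU components (hypothetical).
example_stages: List[Stage] = [
    ("asr", lambda audio: "play the latest song from coldplay"),
    ("nlu", lambda text: {"intent": "play_music", "artist": "coldplay"}),
]

result, timings = run_with_timings(example_stages, b"raw-audio-bytes")
```

Recording one duration per stage, rather than a single end-to-end number, shows which step dominates the total time before results reach the end user.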

“Our engineering team functions like night and day with the Databricks platform in terms of reliability and speed,” said Peyvand Khademi, director of Data Platform and Services at Voicebox. “The Unified Analytics Platform has reduced engineering complexity, facilitated a streamlined workflow, and enabled a shorter development cycle for our core AI voice recognition technologies, products, and solutions. It would be entirely fair to say that we have overhauled our engineering processes in the Voicebox Data Team since switching to Databricks.”

About Databricks

Databricks’ mission is to accelerate innovation for its customers by unifying Data Science, Engineering and Business. Databricks' founders started the Spark research project at UC Berkeley that later became Apache Spark. Databricks provides a Unified Analytics Platform powered by Apache Spark for data science teams to collaborate with data engineering and lines of business to build data products. Users achieve faster time-to-value with Databricks by creating analytic workflows that go from ETL and interactive exploration to production. The company also makes it easier for its users to focus on their data by providing a fully managed, scalable, and secure cloud infrastructure that reduces operational complexity and total cost of ownership. Databricks, venture-backed by Andreessen Horowitz, NEA and Battery Ventures, among others, has a global customer base that includes Viacom, Shell and HP. For more information, visit

Apache, Apache Spark and Spark are trademarks of the Apache Software Foundation.

About Voicebox

Voicebox is an acknowledged leader in Conversational AI, including Voice Recognition (VR), Natural Language Understanding (NLU), and AI services. In addition to its patented Conversational Voice Technology, Voicebox pioneered the Connected Device trend with its strengths in embedded, server, and hybrid operation, earning a spot on IEEE’s list of World’s Most Impactful technology. The Voicebox team comes from over 25 countries and possesses unequaled breadth and expertise. From its 2001 founding in Bellevue, WA, the company has grown to include offices in Europe and Asia. For more information visit

Media Contact:

Kristalle Cooks

P: 650-346-7810

E: [email protected]
