Edmunds Leverages Databricks to Improve Vehicle Data Quality and Customer Experience

Just-in-Time Data Platform Powered by Apache Spark Accelerates Time-to-Insight Through Agile Analytics

July 5, 2016

SAN FRANCISCO, CA--(Marketwired - Jul 5, 2016) - Databricks, the company founded by the team that created Apache® Spark™, today announced that Edmunds, the leading car information and shopping network, has implemented Databricks to improve the overall customer experience of its website. Edmunds serves nearly 20 million visitors each month and makes it easy for shoppers to browse not only dealer inventory, but also vehicle reviews, shopping tips, photos, videos, and feature stories.

Accurate vehicle data is of the utmost importance for Edmunds' website visitors. The data team at Edmunds integrates a wide spectrum of data, ranging from proprietary data sets to paid data sources, to automatically populate the details of each vehicle from its VIN. The rapid growth in the volume and complexity of vehicle data created enormous challenges in maintaining data integrity. For example, determining the percentage of Subarus with missing option details or Hondas with an incorrect exterior color were problems that the Edmunds engineering team spent hours trying to solve.

While Edmunds evaluated Apache Spark as a solution to its data challenges, the company also determined that its analysts and data professionals needed a comprehensive data platform that provided managed services to simplify its Spark deployment and increase productivity.

"We chose Databricks when we knew we wanted to move to Apache Spark because we needed advanced functions beyond the open source software to solve our analytics challenges. Databricks simplifies our data access and ingest, helps with jobs and cluster management, and enables data exploration and reporting," said Greg Rokita, executive director of technology at Edmunds.

With the implementation of Databricks, Edmunds was able to democratize data access across its organization, allowing its data engineering, data science, and business analyst teams to work collaboratively on their data at scale. Edmunds also achieved the following quantitative results:

  • Accelerated ad hoc data exploration and analysis six-fold, allowing the company to answer data integrity questions faster;
  • Improved reporting speed by reducing processing time by 60 percent, or an average of 3-5 hours per week for the engineering team;
  • Improved vehicle data quality metrics across its website by 35 percent.

"Apache Spark enabled Edmunds to understand the quality, quantity, and cost of data sources at scale. The power of Spark was fully realized with the implementation of the Databricks platform, enabling Edmunds data engineers and business analysts to collaborate through one tool. Now Edmunds drives better inventory and offers a more personalized experience for customers who use the platform to guide their car buying decisions," said Kavitha Mariappan, vice president of marketing at Databricks.

For more information, download the case study:
Visit Databricks at

About Databricks:

Databricks' vision is to empower anyone to easily build and deploy advanced analytics solutions. The company was founded by the team that created Apache® Spark™, a powerful open source data processing engine built for sophisticated analytics, ease of use, and speed. Databricks is the largest contributor to the open source Apache Spark project, providing 10x more code than any other company. The company has also trained over 20,000 users on Apache Spark and has the largest number of customers deploying Spark to date. Databricks provides a just-in-time data platform to simplify data integration, real-time experimentation, and robust deployment of production applications. Databricks is venture-backed by Andreessen Horowitz and NEA. For more information, contact [email protected].
