

We are happy to announce that a leading car information and shopping network has deployed Databricks to simplify the management of its Apache Spark clusters and to perform ad hoc analysis that improves vehicle data integrity and the overall customer experience of its website. The network serves nearly 20 million visitors each month and lets shoppers browse dealer inventory, vehicle reviews, shopping tips, photos, videos, and feature stories.

To ensure shopper satisfaction, accurate vehicle data is of utmost importance. The company addresses data quality issues on vehicle listing pages by matching a car’s VIN (vehicle identification number) against OEM (original equipment manufacturer) and Edmunds codes to identify critical information about the vehicle, such as the country where it was built, the vehicle year, and more. When done accurately, providing this kind of detailed vehicle information makes the site extremely valuable in a shopper’s vehicle buying process.
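As a rough illustration of what a VIN carries, here is a minimal Python sketch that reads two of the attributes mentioned above: the build country (from the first character) and the model year (from the tenth character). It is not the company's matching logic against OEM and Edmunds codes, which this post does not describe. The function name `decode_vin`, the returned field names, and the deliberately tiny country table are assumptions made for the example. Because the year code repeats on a 30-year cycle, the sketch returns both candidate years rather than guessing.

```python
import string

# Model-year codes for VIN position 10; I, O, Q, U, Z and 0 are not used.
# The sequence repeats every 30 years (A = 1980 and again A = 2010).
YEAR_CODES = "ABCDEFGHJKLMNPRSTVWXY123456789"

# Illustrative subset only; a real decoder needs the full WMI assignments.
COUNTRY_BY_FIRST_CHAR = {
    "1": "United States",
    "4": "United States",
    "5": "United States",
    "2": "Canada",
    "J": "Japan",
    "W": "Germany",
}

# VINs use digits and capital letters except I, O and Q.
ALLOWED = set(string.ascii_uppercase + string.digits) - set("IOQ")


def decode_vin(vin):
    """Return a build country (or None) and candidate model years for a VIN."""
    vin = vin.strip().upper()
    if len(vin) != 17 or any(ch not in ALLOWED for ch in vin):
        raise ValueError("a VIN has 17 characters and never contains I, O or Q")

    year_code = vin[9]  # tenth character, zero-based index 9
    if year_code in YEAR_CODES:
        offset = YEAR_CODES.index(year_code)
        years = [1980 + offset, 2010 + offset]
    else:
        years = []

    return {
        "country": COUNTRY_BY_FIRST_CHAR.get(vin[0]),
        "model_year_candidates": years,
    }


if __name__ == "__main__":
    print(decode_vin("1M8GDM9AXKP042788"))
```

A listing whose decoded values disagree with its stored details, or whose VIN fails validation, would be a natural candidate for the kind of data integrity review described here.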

Over the past couple of years, the company’s data volumes have grown tenfold, from tens to hundreds of terabytes, making it increasingly difficult to accurately decode each VIN and match it to the right vehicle feature codes. The result was missing or inaccurate details, which hurt the customer experience. For example, the engineering team was trying to answer questions such as what percentage of Subarus are missing options details, or how many Hondas do not have their exterior color described.
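Questions like these reduce to a per-make count of listings with a missing field. The sketch below shows the arithmetic in plain Python on a handful of made-up records. The team's actual analysis ran on Spark, and the record layout, the field names (`make`, `options`, `exterior_color`), and the helper `missing_rate_by_make` are all hypothetical.

```python
from collections import defaultdict


def missing_rate_by_make(listings, field):
    """Percentage of listings per make whose `field` is absent or empty."""
    totals = defaultdict(int)
    missing = defaultdict(int)
    for listing in listings:
        make = listing["make"]
        totals[make] += 1
        # Treat None, an empty string and an empty list as "missing".
        if not listing.get(field):
            missing[make] += 1
    return {make: 100.0 * missing[make] / totals[make] for make in totals}


if __name__ == "__main__":
    # Made-up sample records, for illustration only.
    sample = [
        {"make": "Subaru", "options": ["roof rails"], "exterior_color": "blue"},
        {"make": "Subaru", "options": [], "exterior_color": "white"},
        {"make": "Honda", "options": ["sunroof"], "exterior_color": None},
        {"make": "Honda", "options": ["sunroof"], "exterior_color": ""},
    ]
    print(missing_rate_by_make(sample, "options"))
    print(missing_rate_by_make(sample, "exterior_color"))
```

At hundreds of terabytes the same grouping and counting would be expressed as a distributed aggregation rather than a Python loop, but the metric itself is unchanged.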

To solve this data integrity problem, the company looked to Apache Spark for processing speed at scale. However, the team realized that for their analysts and data professionals to focus on the data and the business at the same time, they needed a comprehensive data platform with managed services that would simplify their Spark deployment and increase their productivity.

With the implementation of Databricks, the company was able to democratize data access across the organization, allowing its data engineering, data science, and business analyst teams to work collaboratively on the data at scale. It also achieved the following quantitative results:

  • Accelerated ad hoc data exploration and analysis sixfold, allowing them to answer data integrity questions faster;
  • Improved reporting speed by reducing processing time by 60 percent, or an average of 3-5 hours per week for the engineering team;
  • Improved vehicle data quality metrics across their website by 35 percent.

Download this case study to learn more about how the company is using Databricks.

To try out Databricks for yourself, sign up today!
