Hotels.com hosts millions of photos for the 325,000+ hotels on its website. Every day, thousands of new photos are uploaded by properties and customers alike. These photos need to be analyzed rapidly to weed out duplicate and low-quality images, and then classified (e.g., kitchen, pool, gym) so they can be sorted logically. Finally, as customers search the site, hotel recommendations need to be personalized to help each customer find the right hotel for their needs. Achieving this requires massive compute power and advanced analytics.
Leverage machine learning to drive consumer experience: The massive volume of image files for each property listing included duplicates and lacked the organization needed for ranking and classification. Hotels.com needed to build in real-time scoring and become more efficient at deploying machine learning and deep learning models into production.
Build a more robust and faster data pipeline: The on-premises Hadoop cluster, which relied on SQL and SAS for data science at scale, was slow and limiting. Running the data pipeline took 2 hours on only 10% of the data.
Increase customer conversions: Understand customer trends in real time to develop strategies that drive conversion and lifetime value.
Databricks has helped Hotels.com realize its goal of becoming “data science focused” so that it can anticipate customer behavior and provide a more optimized user experience.
Cluster Management: Scale the volume of data significantly without adding infrastructure complexity.
Interactive Workspace: Foster a culture of collaboration among data science teams within Hotels.com as well as other business units within Expedia.
Databricks Runtime: Increase processing performance of streaming data, even at scale.
— Matt Fryer VP, Chief Data Science Officer, Hotels.com