Mars Petcare, which spans more than 50 brands, is focused on creating a better world for pets with the help of data and AI. The sheer volume and variety of their data made it difficult to form a complete picture of overall animal health. To harness this complex data, they built the Petcare Data Platform on Azure Databricks and Delta Lake. With actionable data insights at their fingertips, they are transforming pet healthcare, using machine learning to diagnose and predict health issues, including terminal diseases, in pets.
Mars Petcare’s mission is to provide pet owners with the information they need to get a holistic view of the health of their pets. With multiple brands focused on different areas of the pet industry – from medical devices to nutrition – they have a wealth of diverse data, including veterinarian notes and diagnoses, dietary information, genomics data, and more. Seizing on this opportunity, they set out to combine all of their data to identify new ways to improve pet health.
However, with multiple brands capturing, storing, and analyzing their own data, the engineering team faced an uphill battle in using the data for analytics, because each business had its own source systems, data sets, processes, and models. This wide variation in data quality and format created a lot of complexity from a data engineering perspective.
Building ETL pipelines was complex and time-consuming due to the siloed nature of their business. These disjointed business units also hurt the productivity of their analysts and data scientists, who were unable to collaborate efficiently across teams and often struggled to get the data they needed to analyze and build models.
In order to fully realize the value of their various sources of data as a whole, their data teams needed a unified approach to data analytics. With Azure Databricks, they are able to easily access their various data sources with JDBC connections, provision infrastructure to any scale without burdening their engineering team, and collaborate on the data across various data teams and business units with ease.
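As a rough illustration of what reading one source system over JDBC from a notebook can look like, the sketch below uses Spark's built-in `jdbc` data source. The host, database, table, and credential values are hypothetical placeholders, and the SQL Server URL format is an assumption; none of these details come from the Petcare Data Platform itself.

```python
def jdbc_options(host, database, table, user, password, port=1433):
    """Build options for Spark's JDBC data source.

    Assumes a SQL Server style URL; other databases use a different prefix.
    """
    return {
        "url": f"jdbc:sqlserver://{host}:{port};databaseName={database}",
        "dbtable": table,
        "user": user,
        "password": password,
    }


def read_source_table(spark, options):
    """Load one source table as a DataFrame using an existing SparkSession."""
    return spark.read.format("jdbc").options(**options).load()


# Hypothetical values, for illustration only.
opts = jdbc_options(
    "example-host.database.windows.net",
    "clinic_db",
    "dbo.vet_visits",
    "reader",
    "<secret>",
)
print(opts["url"])
```

In a notebook, `read_source_table(spark, opts)` would return a DataFrame that can then be written to a Delta table. In practice the password would come from a secret store rather than a literal.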
Databricks was the clear choice because of its seamless integration with Azure services, its ability to scale on demand, and its collaborative notebooks. To improve their data management and pipelines, they rely on Delta Lake for meta-configuration and to give their analyst and data science teams direct access to ACID-compliant data for analytics and modeling. Versioning lets the data engineers debug issues using the transaction log and update history, and time travel lets them restore an older version of a table. With Delta Lake and Databricks working together, they can visualize the metadata layer, inspect transaction logs directly, and view file types and sizes, all in a single platform.
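The versioning and time travel features mentioned above can be sketched in a few lines. This is a minimal example that assumes a Delta table already exists; the table name `petcare.vet_visits` and the version number are hypothetical, and the helpers only wrap standard Delta Lake commands (`DESCRIBE HISTORY`, `RESTORE TABLE`, and the `versionAsOf` read option).

```python
def history_sql(table):
    """SQL that lists the commit history (version, timestamp, operation) of a Delta table."""
    return f"DESCRIBE HISTORY {table}"


def restore_sql(table, version):
    """SQL that rolls a Delta table back to an earlier version."""
    return f"RESTORE TABLE {table} TO VERSION AS OF {int(version)}"


def read_version(spark, path, version):
    """Read a Delta table at a given path as it existed at a specific version."""
    return spark.read.format("delta").option("versionAsOf", version).load(path)


# Hypothetical table name and version, for illustration only.
print(history_sql("petcare.vet_visits"))
print(restore_sql("petcare.vet_visits", 12))
```

With a live SparkSession, `spark.sql(history_sql(...))` shows which versions exist, `read_version` lets an analyst rerun a model against the exact data used earlier, and `spark.sql(restore_sql(...))` reverts the table if a bad write needs to be undone.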
With their ETL pipelines in place, the analytics and data science teams can work with the data more easily. Having access to versioned data allows their analysts to recreate projects, validate models on updated datasets, and even analyze previous versions for insights they may have missed. The collaborative notebooks have democratized the data across business units, giving every team the ability to access and use the data for its own needs.
Databricks and Delta Lake have enabled the Mars Petcare team to accelerate pet healthcare innovation. From an operational standpoint, infrastructure management is more efficient because the team no longer has to deal with the complexity of spinning up clusters. Additionally, features like autoscaling have helped reduce compute usage, which has lowered overall cloud costs.
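To make the autoscaling point concrete, here is a hedged sketch of a cluster definition in the shape accepted by the Databricks Clusters API, where `autoscale` sets the worker range and `autotermination_minutes` shuts down idle clusters. The cluster name, runtime version, node type, and worker counts are placeholder assumptions, not the settings Mars Petcare uses.

```python
import json


def autoscaling_cluster_spec(name, min_workers, max_workers, idle_minutes=30):
    """Build a cluster definition that scales between min and max workers."""
    if not 1 <= min_workers <= max_workers:
        raise ValueError("expected 1 <= min_workers <= max_workers")
    return {
        "cluster_name": name,
        "spark_version": "13.3.x-scala2.12",  # placeholder runtime version
        "node_type_id": "Standard_DS3_v2",    # placeholder Azure VM type
        "autoscale": {"min_workers": min_workers, "max_workers": max_workers},
        "autotermination_minutes": idle_minutes,
    }


# Hypothetical cluster, for illustration only.
spec = autoscaling_cluster_spec("petcare-etl", 2, 8)
print(json.dumps(spec, indent=2))
```

The cluster adds workers only when load requires them and releases them afterwards, which is how autoscaling reduces compute usage compared with a fixed-size cluster.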
With data insights at their fingertips, they are revolutionizing the diagnosis of health issues in pets, including predicting terminal diseases in older cats, identifying the types of behaviors that might indicate health issues, and using DNA analysis to identify genetic health conditions in dogs.
With a single source of truth and a unified system that promotes collaboration, their data teams across all of their brands can now use data to discover new ways to improve pet health and well-being.