Key Retail & Consumer Goods Takeaways From Data + AI Summit 2022

The most important announcements for Retailers & Consumer Goods companies from Data + AI Summit 2022

Retail and Consumer Goods companies showed up big at Data + AI Summit this year! From incredible breakout sessions to a keynote and panel of top retail speakers like the VP of Ads Engineering at Instacart and the Global CTO of Walgreens, we heard about innovation with data and AI like we never have before.

In case you missed the live event, I’m excited to share product announcements, highlights of the industry program, and on-demand sessions on our virtual platform. These sessions, which featured industry experts and technologists from Databricks, our customers, and partners, showcase why the Lakehouse for Retail & Consumer Goods is a key component for organizations looking to modernize their data strategy. As you’ll discover from the sessions, a lakehouse is especially critical for ensuring greater stability in this time of uncertainty and for delivering more personalized, omnichannel experiences to customers.

Retail & Consumer Goods Forum

For our Retail & Consumer Goods attendees, the most exciting part of Data + AI Summit 2022 was the Retail & Consumer Goods Forum – a two-hour event that brought together leaders from across all segments of retail and consumer goods to hear from peers about their data journey. Bryan Smith, Databricks Retail & Consumer Goods Global Technical Director, shared an overview of the lakehouse and how it enables RCG companies to leverage real-time data to transform the customer experience.

In a keynote from Instacart Ads’ Vice President of Engineering, Vik Gupta, attendees learned about the rise of retail media networks and the importance of performance measurement. Retail ad networks are a very hot topic with companies right now. Vik shared how ad platforms, like Instacart Ads, benefit retail and consumer goods brands by unlocking new digital monetization capabilities, data, and insights on shopping behavior.

In a panel moderated by Sam Steiny, attendees heard from 84.51’s VP of Engineering, Nick Hamilton; Shipt’s Director of Engineering, Barry Ralston; PetSmart’s VP of Analytics & Insights, Elpida Ormanidou; and Walgreens’ Global CTO, Mike Maresca, on best practices for achieving business outcomes with data + AI regarding people, process, and technology.

For example, in reference to recent market volatility, Nick Hamilton at 84.51 shared the importance of being “both proactive and reactive.” For 84.51, this means improving their data science ahead of time so the data team can proactively respond to changes and have the right product on the shelves while maintaining the ability to make changes in models when needed.

Barry from Shipt discussed the importance of real-time data in the delivery business. In Barry’s words, before moving to Databricks, “the lag from our operational systems into our cloud data warehouse platform was on the order of 5 to 6 minutes, which was a lifetime.” Shipt has gotten that lag down to around 16 seconds and, most importantly, now has the ability to tune performance based on the business needs.

Industry Sessions

The event had a number of incredible Retail & Consumer Goods breakout sessions featuring companies from all segments of the industry. All sessions are now available on our virtual platform. Here are a few you don’t want to miss:

DoorDash – Building a Lakehouse for Data Science at DoorDash
Wehkamp – Powering Up the Business with a Lakehouse
Anheuser-Busch InBev – Building and Scaling Machine Learning-Based Products in the World’s Largest Brewery
Walmart – Intermittent Demand Forecasting in Scale Using Meta Modeling
Levi Strauss & Co – A Vision for the Future with Edge ML-Powered Devices

Key Announcements That Will Transform Retail & Consumer Goods

While much has been written about the innovations shared by Databricks at this year’s Data + AI Summit, I thought I would provide a quick recap of our news and why it’s particularly exciting to our retail & consumer goods customers:

Delta Lake is now fully open source. Delta Lake is the fastest, most popular, and most advanced open table storage format. With the Delta Lake 2.0 release candidate, Databricks is open sourcing the features most requested by the community. This means that features previously available only to Databricks customers will be available to the entire Delta Lake community.

When we talk with Retail and Consumer Goods companies, the theme they constantly stress to us is: “When we make the decision to use proprietary technologies or clouds, it always works against us.” We agree. The announcements from Data + AI Summit go a long way in reaffirming our support for technology that helps them avoid vendor lock-in, allows them to benefit from enhancements from the open-source community, and gives them flexibility in the partners they integrate with their Lakehouses.

And when it comes to performance, Delta Lake continues to provide unrivaled, out-of-the-box price-performance for all lakehouse workloads from streaming to batch processing — up to 4.3x faster compared to other storage layers.

Many RCG companies use and contribute to Delta Lake, including Apple & Columbia Sportswear.

What does this mean for our RCG customers? To meet the evolving demands in the space, organizations can now ensure:

  • Data is available to support decisions in real-time.
  • Companies can store, manage and govern all types of data in their object store.
  • Development and management are streamlined with code managed via CI/CD, living in your GitHub repository, and using MLflow (VERSION) to streamline the MLOps process.
  • Companies have optionality in the partners they integrate with their Lakehouse, with applications leveraging open source APIs.
  • There is no vendor or cloud lock-in.
  • Databricks continues to make major investments in the platform to ensure that it maximizes productivity and provides the best TCO of any data + AI platform.

Delta Live Tables has new performance and efficiency features.
One of the biggest hurdles in enabling data, reporting or analysis is building pipelines to ingest and transform data. Delta Live Tables was designed to streamline this development process, while delivering high performance and easier manageability.

Since its inception, Delta Live Tables has grown to power production ETL use cases at leading companies all over the world. DLT is used by over 1,000 companies ranging from startups to enterprises, including retail and consumer goods companies like Jumbo, the Netherlands-based supermarket chain. You can read about all the latest DLT enhancements in this blog post, but here are some key highlights:

Project Enzyme is a new optimization layer for Delta Live Tables that speeds up ETL processing and enables enterprise capabilities and UX improvements.

Enhanced Autoscaling optimizes cluster utilization by automatically allocating cluster resources based on workload volume, with minimal impact to the data processing latency of your pipelines – reducing usage and cost for customers.

With volatility in the market and narrowing margins, speed and cost are of the utmost importance in retail and consumer goods organizations. Data teams need fresh and reliable data urgently and without breaking the bank in order to make real-time decisions. These two capabilities will help them do that.

MLflow Pipelines makes model development & deployment fast and scalable
MLflow Pipelines enables Data Scientists to create production-grade ML pipelines that combine modular ML code with software engineering best practices to make model development and deployment fast and scalable.

As RCG companies begin using ML to uncover new revenue streams or better understand their customers, MLflow Pipelines is incredibly valuable due to its ease of use and integration with a proven system. Most companies struggle to achieve the scale that is required for effective AI. Databricks makes AI at scale possible and MLflow Pipelines is a critical part of that.

Retail customers love machine learning on the Lakehouse – check out this breakout session to learn how Starbucks is using ML on Databricks for their recommendation engine across 30,000 stores.

Delta Sharing is GA soon with Clean Rooms capabilities to follow. One of the biggest challenges for Retail, Consumer Goods and other companies in the Retail value chain is how to efficiently share information in real-time. Existing solutions often require costly integration and support, or require both parties to license the same proprietary data sharing technologies. This limits data sharing to only the largest of companies. And these methods operate in batch, leading to days of delay in responding to business needs.

Delta Sharing is an open source protocol that enables the secure sharing of data with partners across technologies and clouds. Databricks customers can share data from Azure to AWS or from Databricks to a large number of Delta Sharing compliant technologies. Delta Sharing is built on top of the real-time performance of Delta and the robust management and governance of Unity Catalog. Retailers can provide real-time visibility to conditions in their stores, enabling distributors, suppliers and other partners to cut days out of responding to conditions such as out-of-stocks. It promises to unleash secure, real-time collaboration like never before.
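For readers curious what consuming a share can look like, here is a minimal Python sketch using the open source delta-sharing client. It assumes the data provider has sent you a profile file; the file name, share, schema, and table names below are hypothetical placeholders, not real Databricks or retailer assets.

```python
def table_url(profile_path: str, share: str, schema: str, table: str) -> str:
    """Build the '<profile>#<share>.<schema>.<table>' address used by the client."""
    return f"{profile_path}#{share}.{schema}.{table}"


def load_shared_table(profile_path: str, share: str, schema: str, table: str):
    """Read a shared table into a pandas DataFrame.

    The import is done lazily so the URL helper above works even when the
    delta-sharing package (pip install delta-sharing) is not installed.
    """
    import delta_sharing

    return delta_sharing.load_as_pandas(
        table_url(profile_path, share, schema, table)
    )


# Hypothetical usage by a supplier reading a retailer's out-of-stock feed:
#   df = load_shared_table("retailer.share", "store_ops", "inventory", "out_of_stocks")
#   print(df.head())
```

The recipient does not need to run the same platform as the provider; in this sketch the only requirements are the profile file and the client library.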

At Data + AI Summit, we announced Data Cleanrooms (coming soon). Retailers want to share consumer data with partners while respecting consumers’ privacy wishes and meeting regulatory requirements. This is what Data Cleanrooms are designed to enable.

Data Cleanrooms opens a broad array of use cases for retail and consumer goods companies, such as enriching retail loyalty data with consumer behaviors in advertising or other channels. Consumer packaged goods (CPG) companies can see sales uplift by consumer segments by joining their first-party advertisement data with point of sale (POS) transactional data of their retail partners.
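To make the uplift use case concrete, here is a small illustrative Python sketch of the arithmetic involved. It is not the Data Cleanrooms product or its API: the data, the hashed consumer IDs, and the function name are all made up, and it uses a naive exposed-versus-unexposed comparison rather than a rigorous causal measurement.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical CPG first-party ad data: hashed consumer id -> (segment, saw the ad?)
ad_exposure = {
    "c1": ("families", True),
    "c2": ("families", False),
    "c3": ("students", True),
    "c4": ("students", False),
    "c5": ("families", True),
}

# Hypothetical retailer POS transactions: (hashed consumer id, spend)
pos_sales = [("c1", 30.0), ("c2", 20.0), ("c3", 12.0),
             ("c4", 10.0), ("c5", 26.0), ("c9", 50.0)]


def uplift_by_segment(exposure, sales):
    """Relative difference in mean spend, exposed vs. unexposed, per segment."""
    # Total spend per consumer from the POS side.
    spend = defaultdict(float)
    for consumer_id, amount in sales:
        spend[consumer_id] += amount

    # Join on consumer id. Consumers in the ad data with no purchases count
    # as zero spend; consumers who appear only in POS data are ignored.
    groups = defaultdict(lambda: {True: [], False: []})
    for consumer_id, (segment, exposed) in exposure.items():
        groups[segment][exposed].append(spend.get(consumer_id, 0.0))

    result = {}
    for segment, by_exposure in groups.items():
        if by_exposure[True] and by_exposure[False]:
            baseline = mean(by_exposure[False])
            if baseline > 0:
                result[segment] = (mean(by_exposure[True]) - baseline) / baseline
    return result


print(uplift_by_segment(ad_exposure, pos_sales))
```

In a clean room, the intent is that a join like this runs without either party seeing the other's row-level records; only aggregated results such as the per-segment figures would be shared.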

Check out our Retail & Consumer Goods Forum where Nick, VP of Engineering from 84.51, talked about how he sees Clean Rooms as the biggest upcoming trend in the RCG industry – among other exciting industry topics.

Beyond these featured announcements, there were other exciting announcements like Databricks Marketplace, Unity Catalog and Serverless Model Endpoints. We encourage you to check out the Day 1 and Day 2 Keynotes to learn more about our product announcements!
