The Lakehouse for Retail

Reimagining the future of retail with data + AI

Every morning, as people are just beginning to rise, the business of retail is already in full motion. Delivery trucks are beginning their routes to bring goods to stores and millions of homes. Managers are preparing to open their stores and store associates are checking their departments to make sure they’re stocked to meet the demands of the day. Retail operates 24/7, but the past few years have changed the industry.

The global pandemic has accelerated trends in retail, in some instances by a decade. It compelled consumers, en masse, to shift their expectations more rapidly and completely than at any other time in history. Physical retail remains important, but retailers have had to learn how to adapt and enhance the shopper experience across every channel. They’ve responded with accelerated investments in technology, and are now looking at how to optimize their operations to improve profitability.

Databricks works with the world’s leading retailers across all channels and geographies on these challenges, including Walgreens, Columbia, Acosta, H&M Group, Reckitt, Restaurant Brands International, 84.51° (a subsidiary of Kroger Co.), Co-Op Food, Gousto, Wehkamp and more. Every day, Databricks retail customers drive billions of customer interactions with the Lakehouse for Retail.

What’s changed

The last two years saw rapid transformation in the industry, and as time has passed, we’re seeing how sticky these changes are. Led by the simplicity of click and collect, e-commerce penetration rose from 8% in March 2020 to ~14% a year later. Retailers once worked hard to drive consumers into their brick-and-mortar stores, but third-party delivery services now obscure consumer behavior from the retailer. Online delivery is unprofitable in many instances, but retailers have viewed this moment as a way to protect or gain market share in the near term.

As economies have reopened, we’ve seen a new challenge driven by the instability of retail supply chains. Waiting times for berths at global ports to unload have doubled. Labor shortages are leaving trucking and rail companies short of the capacity to pick up containers from ports, creating bottlenecks in the supply chain. Abnormally high inventory levels, combined with tight capacity and unseasonably high price growth, are the drivers behind the continued tightness in warehouse availability.

Top retail investment priorities in data + AI

In the face of these overwhelming disruptions, retailers responded rapidly with many emergency measures. We’re now seeing retailers look beyond the initial response toward more sustainable operations, with a strong increase in investment in data + AI focused on several areas:

Driving real-time decisions with data

The meteoric rise of e-commerce has put pressure on brick-and-mortar stores to improve their end-to-end operations. This begins with improving the speed at which decisions are made. With order fulfillment costs rising, the difference between five minutes and five seconds can be the difference between profit and loss.

Retailers are responding by making real-time point of sale, e-commerce, mobile application, distribution and loyalty data available to power a holistic picture of their operations. They are using this real-time data to improve perpetual inventory calculations, consolidate order picking, estimate delivery costs, and provide more timely and relevant recommendations to shoppers.
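As a rough illustration of the perpetual inventory idea mentioned above, the sketch below keeps a running on-hand balance per store and SKU by applying each event as it arrives. It is a minimal sketch in plain Python, not a Databricks API: the `InventoryEvent` type, the event kinds and the `apply_events` helper are all hypothetical names chosen for this example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class InventoryEvent:
    """Hypothetical event shape; real feeds will differ."""
    sku: str
    store: str
    kind: str      # "receipt", "sale", "return" or "adjustment"
    quantity: int  # units; adjustments may be negative


# Sign applied to the quantity of each event kind (an assumption for this sketch).
SIGNS = {"receipt": 1, "sale": -1, "return": 1, "adjustment": 1}


def apply_events(opening, events):
    """Return on-hand units per (store, sku) after applying events in order."""
    on_hand = defaultdict(int, opening)
    for event in events:
        if event.kind not in SIGNS:
            raise ValueError(f"unknown event kind: {event.kind}")
        on_hand[(event.store, event.sku)] += SIGNS[event.kind] * event.quantity
    return dict(on_hand)


opening = {("store-1", "sku-A"): 10}
events = [
    InventoryEvent("sku-A", "store-1", "sale", 3),
    InventoryEvent("sku-A", "store-1", "receipt", 12),
    InventoryEvent("sku-A", "store-1", "return", 1),
    InventoryEvent("sku-A", "store-1", "adjustment", -2),
    InventoryEvent("sku-B", "store-1", "receipt", 5),
]
balances = apply_events(opening, events)
print(balances)
```

In a streaming setting the same update would run incrementally as point of sale, e-commerce and distribution events arrive, rather than over a list held in memory.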

Reimagining the relationship with consumers

Retail has led the charge of shopper insights, loyalty programs and personalized recommendations and offers over the past several decades. Current efforts build on this foundation but are driving greater precision with real-time insights, and using a range of new types of data to understand why purchasing decisions are made. This level of customer understanding is being realized through smarter segmentations and personalized recommendations, but it’s also helping retailers abate the surge in returns by providing smarter suggestions on sizes and items based on previous purchases.

But retailers haven’t stopped with merely outbound promotions and personalization to customers. They’re moving away from push methods of distribution and operations, and beginning to use shopper behaviors as “pull” signals to optimize their business. Understanding how shoppers behave is helping retailers drive much higher incremental revenue and margin improvement through localized assortments of products and sizes, improved staffing levels, merchandising and more.

Improving collaboration with partners to improve profitability

The pandemic exposed the fragility of the global supply chain. It’s not sufficient to just improve operations within stores; retailers need to improve coordination of activities with the thousands of partners in the value chain.

Retailers are investing in improved demand sensing, on-shelf availability, and forecasting analytics, and sharing these analytics directly with suppliers, distributors, brokers and delivery partners. Real-time data sharing and collaboration are core to this shift as companies attempt to reduce the time it takes to respond to needs.

Introduction to the Lakehouse for Retail

At Databricks, we understand retail and are committed to helping companies overcome these challenges to realize the full potential of their data and AI investments. The Lakehouse for Retail brings together disparate data sources, paired with best-in-class data and AI processing capabilities, and surrounds this with an ecosystem of retail-specific solution accelerators and partners. Retailers can take advantage of the full power of all their data and deliver powerful real-time decisions.

The Lakehouse for Retail is designed to give retailers the flexibility to adopt the capabilities they need to address their most pressing business needs – from driving real-time decisions to powering better experiences with shoppers to improving collaboration across the value chain and more. Here are some of the unique Lakehouse-driven use cases and benefits that can help retail data teams transform how they leverage data across sources and types:

Power real-time decisions with data. The Lakehouse for Retail enables companies to both rapidly ingest data at scale and make insights available across the value chain in real-time. Speed is the antidote to business volatility, and companies are using the Lakehouse to power real-time operations with data.

The Lakehouse for Retail delivers on the promise of real-time data, with the maturity that businesses demand from modern data platforms. Delta Lake simplifies the change data capture process while providing ACID transactions, scalable metadata handling, and unified streaming and batch data processing. Delta Lake also supports versioning, which enables rollbacks, full historical audit trails, and reproducible machine learning experiments.
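To make the versioning idea concrete, here is a toy in-memory model of a table that keeps every committed version, so that older versions can be read back, restored, and audited. It is only a conceptual sketch under simplifying assumptions: it is not the Delta Lake API, and the `VersionedTable` class and its methods are hypothetical names invented for this example.

```python
from copy import deepcopy


class VersionedTable:
    """Toy table that retains every committed version (not Delta Lake itself)."""

    def __init__(self):
        self._versions = [[]]            # version 0 is the empty table
        self._history = [(0, "CREATE")]  # audit trail of (version, operation)

    @property
    def version(self):
        return len(self._versions) - 1

    def _commit(self, operation, rows):
        # Each commit stores a full snapshot and records what produced it.
        self._versions.append(deepcopy(rows))
        self._history.append((self.version, operation))
        return self.version

    def append(self, new_rows):
        return self._commit("APPEND", self._versions[-1] + list(new_rows))

    def read(self, version=None):
        """Read the latest version, or an earlier one by number."""
        chosen = self.version if version is None else version
        return deepcopy(self._versions[chosen])

    def restore(self, version):
        """Roll back by committing an old snapshot as a new version."""
        return self._commit(f"RESTORE to v{version}", self._versions[version])

    def history(self):
        return list(self._history)


table = VersionedTable()
table.append([{"sku": "A", "on_hand": 18}])  # becomes version 1
table.append([{"sku": "B", "on_hand": 5}])   # becomes version 2
table.restore(1)                             # becomes version 3
print(table.read())
print(table.history())
```

Because a rollback is recorded as a new version rather than by deleting history, the audit trail stays complete, and a model trained against a given version number can be retrained on exactly the same rows.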

Improve the accuracy of decisions. The Lakehouse scales analysis so that companies can run their largest analytics jobs within their service windows. Companies no longer need to sacrifice accuracy or breadth of analysis to meet service levels, and they can draw on all types of data while still meeting operational needs.

Use all types of data. Only 5-10% of a company’s data is structured. Tapping into the remaining 90% or more helps businesses better understand the environment around them and make better decisions. The Lakehouse for Retail has native support for all types of data, both structured and unstructured (such as images and video), which allows companies to make better-informed decisions.

Inexpensive and open collaboration. Retailers need to collaborate with their partners in real-time, but existing data sharing technologies are expensive and often require that all parties invest in the same proprietary technology. The Lakehouse for Retail leverages Delta Sharing to provide an open and secure method of data collaboration and sharing for companies. This inexpensive approach unlocks the power of collaboration with all partners in the value chain.

Partner ecosystem

Partners that deliver pre-built solutions and platforms provide retailers a faster and proven path from digital transformation ideation and innovation to AI ROI.

The leading consulting firms in retail have built practices around the Lakehouse for Retail. We’ve partnered with Deloitte and Tredence to educate thousands of their employees on the Lakehouse platform and are increasing our investment in partners to help bring native Lakehouse solutions to their customers. These partners have developed pre-built solutions that provide retailers with a faster and proven path to value.

Industry data sharing & collaboration

The Retail & Consumer Goods value chain has always been collaborative, but it has been limited to companies that can afford expensive and closed systems for integration. Out of the thousands of suppliers that call on a retailer, a small percentage can afford to invest in these proprietary systems. Those existing systems are also limited in what types of data and how often they can share data. Most are limited to structured data, and many limit data exchange to slow batch processes.

At the core of the Lakehouse for Retail is a new, inexpensive and open method of data sharing and collaboration that opens interaction and innovation to all partners in the value chain. Built on the open-source Delta Sharing technology, data sharing and collaboration with the Lakehouse for Retail:

  • Does not require that all companies invest in the same technology. Companies can use Databricks in addition to a vast ecosystem of technology partners that support Delta Sharing.
  • Provides fine-grained controls for sharing of data with the use of Unity Catalog.
  • Allows companies to share data in near real-time, enabling partners across the value chain to improve their responsiveness to changes in the business.

Tools to help companies accelerate

To help companies quickly realize value from their investment in data and AI, Databricks has invested in the creation of more than 20 Retail Solution Accelerators, made freely available to customers.

Solution Accelerators are fully functional, proven capabilities that help companies quickly prove the feasibility of solving a problem with data and AI. Companies can use these Solution Accelerators to quickly complete a pilot on a business problem, and then use that as a foundation to complete an MVP and full solution. Solution Accelerators have been used by hundreds of companies to build the core of critical use cases – ranging from Demand Forecasting to Personalized Recommendations to On-shelf Availability. These accelerators can help customers save anywhere from 25% to 50% of their development effort.

Lakehouse for Retail is addressing challenges that retail has long tried to crack but struggled with because of the limits of technology. Operating a real-time business opens up new possibilities for use cases in demand planning, delivery time estimation, personalization and consumer segmentation. Decisions that could take hours can now be made in seconds, which for many companies can mean the difference between profit and loss. Combined with a customer success program, one of the largest open source communities supporting the underlying technologies, and a value assessment program that helps identify where and how to start on your digital transformation journey, Databricks is poised to help you become a leader in retail through a data-driven business.

Want to learn more about Lakehouse for Retail? Click here for our solutions page, or here for an in-depth ebook. Retail will never be the same now that Lakehouse for Retail is here.
