Delivering Real-Time Data to Retailers with Delta Live Tables

Saurabh Shukla
Bryan Smith
Rob Saker
Sam Steiny
Register for the Deliver Retail Insights webinar to learn more about how retailers are enabling real-time decisions with Delta Live Tables.

The pandemic has driven a rapid acceleration in omnichannel adoption. Retailers who were able to re-envision customer experiences for digital channels and deliver a variety of fast, convenient and safe alternatives for order fulfillment separated themselves from their peers during the crisis. As the virus abates, consumers are increasingly returning to physical stores while a sizable demand for the blending of online and in-store experiences remains.

For some retailers, the movement to omnichannel began well before the emergence of COVID-19. For these organizations, the pandemic moved forward timelines already in place. For others, a wait-and-see approach to buy-online pickup-in-store, curbside, home delivery and other fulfillment innovations meant a hurried bootstrapping of capabilities in the midst of lockdowns and labor disruptions. But whether an organization found itself picking up its existing pace or exiting the stands to join a race it was previously content to observe, most struggled with operational gaps and a diminished profitability in individual transactions.

Inventory challenges are becoming more visible

Well before supply chain disruptions became a topic for the evening news, retailers reported significant on-shelf availability challenges, causing nearly a third of shoppers to not find an item they were looking for and retailers to miss out on nearly $1T in sales. Because these gaps were often detected only after the fact through customer surveys and analysis of historical data, day-to-day occurrences remained somewhat hidden behind customer-elected substitutions and inbound replenishment stock.

It wasn’t until the adoption of in-store fulfillment practices that sent employees to shelves on behalf of customers that the extent of the problem more fully came to light. Notifications sent by pickers informing customers of an item's unavailability provided a more timely record of gaps between reported and actual inventory. As retailers juggled substitutions and order cancellations, customers found themselves switching to retail businesses that could best meet the expectations set through online experiences.

Resolving these challenges requires transformation

In-store inventory management practices have long been a sore point for the retail industry. Large footprints supporting numerous customers and housing a sizable number of SKUs are simply difficult to keep consistently stocked. Products can be easily misplaced, left in the backroom, restocked without being properly recorded, or otherwise lost in the shuffle of in-store activity. Periodic inventory counts provide a point-in-time assessment of units on-hand, but because these counts happen so seldom, the current state of in-store inventory is rarely known with any certainty.

For some retailers, the solution is to move away from in-store fulfillment toward local order fulfillment centers optimized for this kind of activity. But this option may not be viable in every market and for every retailer. Instead, many retailers are focused on transforming existing store locations to better serve a variety of fulfillment options.

Real-time insights are key

Regardless of which path is taken, the more fragmented fulfillment landscape coupled with increasing expectations for faster delivery puts pressure on retailers to improve their knowledge of what products reside where. To address this, many are turning to real-time technologies which process large volumes of inventory-relevant event data from across multiple locations to track units on-hand. This information may be used to alert store associates of an issue requiring attention, trigger automated replenishment or adjust which items are presented as shoppers navigate online platforms.

Real-time data processing technologies are not new, but with advances in technology, like the introduction of the Lakehouse architecture, the Delta Lake data format and Delta Live Tables, organizations are finding that the development of enterprise-scale real-time inventory management systems is within reach.

Technology innovations enabling real-time insights

The Lakehouse architecture breaks down the lengthy and complex logic required to transform raw event data into actionable insight. Sometimes referred to as a medallion architecture, this architectural pattern breaks the end-to-end data flow into three stages or layers referred to as the bronze, silver and gold layers (Figure 1).

Figure 1. The end-to-end data flow of point-of-sale data through the Lakehouse architecture to calculate current inventory.

In the bronze layer, raw data received from in-store inventory management systems is persisted as-is to provide a record of the data in its original state. In the silver layer, bronze data has been deduplicated, restructured and otherwise transformed to improve its accessibility to more technical users responsible for building downstream workflows. In the gold layer, data is delivered as business-aligned information assets, such as current state inventories, accessible across the organization.
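To make the three layers concrete, here is a minimal sketch in plain Python rather than Spark or Delta Live Tables. The event fields (`event_id`, `store`, `item`, `qty_change`) and the sample values are assumptions chosen for illustration and are not taken from the notebooks.

```python
from collections import defaultdict

# Bronze: hypothetical raw point-of-sale events, kept exactly as received.
# Quantities arrive as strings and one event has been delivered twice.
bronze = [
    {"event_id": "e1", "store": "S1", "item": "A", "qty_change": "10"},  # replenishment
    {"event_id": "e2", "store": "S1", "item": "A", "qty_change": "-3"},  # sale
    {"event_id": "e2", "store": "S1", "item": "A", "qty_change": "-3"},  # duplicate delivery
    {"event_id": "e3", "store": "S2", "item": "A", "qty_change": "5"},   # replenishment
]


def to_silver(bronze_events):
    """Silver: drop duplicate events and cast quantities to integers."""
    seen = set()
    silver_events = []
    for event in bronze_events:
        if event["event_id"] in seen:
            continue
        seen.add(event["event_id"])
        silver_events.append({**event, "qty_change": int(event["qty_change"])})
    return silver_events


def to_gold(silver_events):
    """Gold: current units on hand per (store, item)."""
    on_hand = defaultdict(int)
    for event in silver_events:
        on_hand[(event["store"], event["item"])] += event["qty_change"]
    return dict(on_hand)


silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)
```

The bronze list is never modified, so the original record stays available, while the silver and gold outputs can each be reused by other consumers.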

This decomposition of work into digestible steps not only simplifies implementation; it also enables developers to more easily reuse information assets as they deliver various business-aligned outputs. The key challenge then becomes how to keep data in motion as it moves through the different stages so that the data objects the business consumes provide current state information.

This problem is addressed through the use of Delta Lake. Delta Lake builds on the highly popular Parquet data format, storing data as Parquet files. It preserves the performance characteristics of Parquet while enhancing the format through the inclusion of a transaction log.

The transaction log allows Delta Lake to support traditional data modification patterns that greatly simplify workflow development. It also enables workflows to recognize upstream data modifications so that even though data is persisted in each of the bronze, silver and gold layers of the lakehouse architecture, end-to-end, real-time data processing can continue uninterrupted.
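The idea that a log lets downstream workflows recognize upstream changes can be illustrated with a toy model. The sketch below is not how Delta Lake is implemented, and it handles appends only. The names `ToyVersionedTable` and `IncrementalReader` are hypothetical and exist purely for illustration: every write becomes a numbered commit, and a reader remembers the last version it processed.

```python
class ToyVersionedTable:
    """Append-only commit log standing in for a table with a transaction log."""

    def __init__(self):
        self._commits = []  # index = version, value = rows added in that commit

    def commit(self, rows):
        """Record a new version containing the given rows; return its version."""
        self._commits.append(list(rows))
        return len(self._commits) - 1

    @property
    def latest_version(self):
        # -1 means no commits have been made yet.
        return len(self._commits) - 1

    def changes_since(self, version):
        """Rows committed after `version` (pass -1 to read from the start)."""
        return [row for commit in self._commits[version + 1:] for row in commit]


class IncrementalReader:
    """Downstream consumer that only picks up rows it has not seen before."""

    def __init__(self, table):
        self._table = table
        self._last_version = -1

    def poll(self):
        rows = self._table.changes_since(self._last_version)
        self._last_version = self._table.latest_version
        return rows


silver_table = ToyVersionedTable()
reader = IncrementalReader(silver_table)

silver_table.commit([{"item": "A", "qty_change": 10}])
first = reader.poll()    # picks up the first commit

silver_table.commit([{"item": "A", "qty_change": -3}])
silver_table.commit([{"item": "B", "qty_change": 4}])
second = reader.poll()   # picks up only the two newer commits
third = reader.poll()    # nothing new has been committed
```

Because the reader tracks a version rather than rescanning the table, each layer can persist its data and still hand only the new changes to the next layer.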

If the lakehouse architecture, in combination with Delta Lake, provides the foundation for a robust real-time data processing and analytics infrastructure, how exactly are the workflows defining the movement of data through that infrastructure assembled? In the past, real-time data processing took place through specialized technologies that employed novel ways of thinking about data and their interactions. With Spark Structured Streaming, real-time data processing mechanics were brought in line with batch processing to make development much simpler. And now with Delta Live Tables, the definition of persistent streaming objects, the scheduling and orchestration of data movement between spans of objects, and the resource provisioning, monitoring and alerting that surrounds real-time workflows have been simplified even further.

To learn more about Delta Live Tables, please check out this blog announcing its general availability. To see how the Lakehouse architecture, Delta Lake and Delta Live Tables can be employed together to deliver real-time insights into in-store inventories, please check out these notebooks.

The need of retailers for real-time insights has never been greater, and the analytics solutions producing these insights have never been more accessible. We hope this information and the associated notebooks help your organization deliver the functionality it requires to achieve its omnichannel objectives.

Register for our webinar to learn more about how retailers are enabling real-time decisions with Delta Live Tables.
