
This is a guest-authored post by Jake Stone, Senior Manager, Business Analytics at ButcherBox

The impact of a legacy data warehouse on business speed and agility

From the outside, the ButcherBox concept is simple – subscribe to our service and each month, receive a shipment of fresh meat and seafood that checks all the boxes: organic, grass-fed, free-range, crate-free, wild-caught, etc. But pull back the curtain, and you’ll find that the day-to-day demands on the team in charge of a monthly subscription-based model are a lot trickier than they seem.

As a young e-commerce company, ButcherBox has to be nimble as our customers' needs change, which means we’re constantly considering behavioral patterns, distribution center efficiency, a growing list of marketing and communication channels, order processing systems, and more.

With such intricate processes at our foundation, and so much data feeding in from different sources — from email systems to our website — the data team here at ButcherBox quickly discovered that data silos were a significant problem and blocked complete visibility into critical insights needed to make strategic and marketing decisions. Our data team also struggled to deliver reports and accurate insights in a timely manner. We knew we needed to migrate from our legacy data warehouse environment to a data analytics platform that would unify our data and make it easily accessible for quick analysis to improve supply chain operations, forecast demand, and, most importantly, keep up with our growing customer base.

True data visibility in the lakehouse fuels better decision making

Once ButcherBox deployed the Databricks Data Intelligence Platform on Azure, analyzing data for business optimization became a breeze. Now, with direct visibility into our full range of data (e.g., customer, inventory, and marketing impact) and granular permissions, our analytics team can safely and securely view the data as it comes in and export it in the format they need to make smarter decisions.

Additionally, Delta Lake provides us with a single source of truth for all of our data. Now our data engineers are able to build reliable data pipelines that thread the needle on key topics, such as inventory management, allowing us to identify in near real-time what our trends are so we can figure out how to effectively move inventory.

Databricks SQL has empowered our team to modernize our data warehousing capabilities to rapidly analyze data at scale without worrying about infrastructure, performance or data quality issues. It provides a simple yet powerful enterprise-level environment for data warehousing on the lakehouse, with a strong focus on visualization tools. That makes it a lot less intimidating than most analytics solutions and great for everyone, even those who only know SQL.

With data at our fingertips, we are much more confident knowing that we are using the most recent and complete data to feed our Power BI dashboards and reports. In many cases, we’ll also use Databricks SQL visualizations, which have proven to be more flexible for the data team.

But the key to getting the most out of our data is the collaborative nature of the Databricks platform. Analysts are able to share their builds with anyone else who has access to the platform, and collaborators can get in and add to the project without having to get into the code itself, making teamwork between the analyst and the business user much more streamlined. We can generate insightful dashboards on the fly, share them with internal teams, and work together to help move the business forward.

Looking ahead, we want to continue to democratize our data strategy across the company. In fact, we have established an analytics Center of Excellence (COE), and Databricks is at the core of that initiative. Our goal is to ensure every one of our analysts and business partners can access the data they need and start being effective as soon as they log on. Our only limiting factor is now how quickly we can write SQL queries, rather than how much time it takes to build out a dashboard and detail it.

Knowing your data means knowing your customers

Presently, ButcherBox has hundreds of thousands of subscribers. While this would have been too much data to dig through in the past, thanks to Databricks, we are able to effectively comb through these massive data sets. Being able to query a table of 18 billion rows would have been problematic with a traditional platform. With Databricks, we can do it in 3 minutes.

Now, with a much better window into who our customers are and what they want, it’s like we know each and every one of them personally. For example, we previously ran into various logistical and delivery issues that cost the company tens of thousands of dollars. With Databricks, we were able to access the data and apply advanced analytics to determine how to address the issues 10x faster, which has enabled us to explore significantly more ways to leverage data to solve complex business challenges.

We wouldn't be where we are today without Databricks and the business value it provides. Simply put, the view the Databricks platform has given us has fundamentally changed the way we think about our member base and their behavior. It has fundamentally changed how we do business for the better.

