How Databricks Managed Disaster Recovery Helps Capital One Achieve Lakehouse Resilience

Powering Mission Critical Workloads on the Databricks Lakehouse

Published: October 30, 2025

Company · 3 min read

Summary

  • Databricks has been partnering with Capital One to co-develop a managed Disaster Recovery solution for the lakehouse
  • During a vendor outage, this solution proved to be a success
  • Capital One was able to quickly fail over to a secondary region and resume interactive analytics in their Databricks platform

For over a year, Databricks has been partnering with Capital One to build a managed Disaster Recovery offering for the lakehouse. During a vendor outage, this collaboration paid off when Capital One was able to quickly fail over interactive analytics in their Databricks platform to a secondary region.

Technology outages are a matter of “when”, not “if”. The Capital One and Databricks partnership shows that with a solid data platform, guided by a robust disaster recovery strategy, even large-scale outages can be weathered with minimal disruption to the business.

Partnering with Databricks on multi-region resilience helps keep critical analytical workloads working through events like a regional disruption. —Shehzad Mevawalla, Executive Vice President of Enterprise Data Technology, Capital One

Disaster Recovery for the Open Data Lakehouse

Traditional backup tools only protect data. Traditional data warehouses may only protect data in their own proprietary format.

In contrast, the modern open data lakehouse is much more. It includes:

  • Data stored in open formats, in storage that customers control
  • An open data catalog, which serves as the center of governance
  • Customer-defined assets, such as notebooks and pipelines, across potentially thousands of users

It is critical that all of these components be resilient to failure and able to resume operations seamlessly after a regional failure.

When it comes to disaster recovery, this presents a new set of challenges, including the ability to replicate all mission-critical elements of the lakehouse to a secondary cloud region with low latency and across a wide variety of asset types.

Databricks Managed Disaster Recovery: Lakehouse DR Solution

In collaboration with Capital One, Databricks has developed a managed Disaster Recovery solution to help tackle these challenges. It includes:

  • Managed replication - using performant background compute, critical workspace assets can be quickly replicated to your secondary region with Databricks' out-of-the-box capabilities.
  • Customer-specified failover - Databricks’ managed solution provides customers the flexibility to fail over to the secondary region at a time of their choosing. This allows customers full control over the failover and failback process - a must when failover often requires coordinating across teams, systems and tools.
  • Read-only secondary - Databricks can enforce that the secondary failover region stays read-only until it is promoted to primary. This ensures that all writes go to the primary region at all times, and prevents unintentional writes to the secondary.
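
The interplay of these three properties can be illustrated with a toy in-memory model. This is only a sketch under stated assumptions: `Region`, `RegionPair`, `replicate`, `fail_over`, and `ReadOnlyRegionError` are hypothetical names invented for illustration and are not part of any Databricks API; the actual managed solution is not described at this level of detail in this post.

```python
from dataclasses import dataclass, field


class ReadOnlyRegionError(Exception):
    """Raised when a user write targets a read-only region (hypothetical)."""


@dataclass
class Region:
    name: str
    read_only: bool
    assets: dict = field(default_factory=dict)

    def write(self, key, value):
        # User-facing write path: rejected while the region is read-only.
        if self.read_only:
            raise ReadOnlyRegionError(f"{self.name} is read-only")
        self.assets[key] = value


@dataclass
class RegionPair:
    primary: Region
    secondary: Region

    def replicate(self):
        # Stand-in for managed background replication: copies primary
        # assets to the secondary without using the user write path.
        self.secondary.assets = dict(self.primary.assets)

    def fail_over(self):
        # Customer-initiated: promote the secondary and demote the old
        # primary, so exactly one region accepts writes at any time.
        self.primary.read_only = True
        self.secondary.read_only = False
        self.primary, self.secondary = self.secondary, self.primary


pair = RegionPair(
    primary=Region("region-a", read_only=False),
    secondary=Region("region-b", read_only=True),
)
pair.primary.write("notebook", "v1")
pair.replicate()

try:
    pair.secondary.write("notebook", "oops")
except ReadOnlyRegionError as err:
    print("blocked:", err)

pair.fail_over()  # runs only when the customer chooses
pair.primary.write("notebook", "v2")
print(pair.primary.name, pair.primary.assets)
```

In this sketch, failback is simply a second call to `fail_over()` after replicating in the opposite direction; a real system would also need to handle in-flight writes and replication lag, which the toy model ignores.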

Learnings and Next Steps

This successful collaboration highlights a few key ingredients for mission-critical workloads:

  • Ongoing commitment to resiliency - Capital One’s commitment to regular failover and failback exercises ensured that failover was already part of the organization’s muscle memory when an outage occurred.
  • Leave the heavy lifting to a managed solution - Capital One uses Databricks’ managed DR solution to perform replication at scale, so their teams can focus on higher-leverage work.

Capital One continues to push the envelope on cloud resiliency - expanding their Disaster Recovery coverage and working to reduce their Recovery Time Objective further.

Databricks is planning further improvements to the Managed Disaster Recovery solution, building on the lessons learned from past outages. Stay tuned for more details.
