Capital One

CUSTOMER STORY

Transforming data pipelines to power faster credit decisions

60X

Faster compute performance enabling real-time iteration

80%

Decrease in time and cost per job

Power credit decisions

With unified data and accelerated performance

Product descriptions:

Capital One is a data-driven company, and high-quality, well-managed data is essential to how it solves industry problems and delivers value to millions of customers. Its credit line increase program relies on intelligent and trusted data and insights to enable financial empowerment for its customers. But growing data volumes and complex pipelines led Capital One to establish a new centralized feature hub to further streamline data management and enhance performance. With Databricks, Capital One was able to unify data pipelines, accelerate data processing and deliver actionable insights for more timely credit decisions.

Standardizing features on Databricks

To overcome these constraints, Capital One’s data and engineering teams built a centralized Feature Hub on Databricks. With Photon as the compute engine and Delta Lake as the foundation, the hub became the single environment for both historical and operational feature pipelines. It now curates records across millions of accounts, providing a 360-degree view of customer insights that supports both modeling and daily decisioning.

The ability to run large-scale backfills was transformational. As Animesh Mod, Senior Manager of Software Engineering at Capital One, described, “We consistently see Databricks performing 60x faster than other available systems.”

Automation was key to multiplying efficiency. Engineers used Databricks APIs to automate job triggers, monitoring, and reruns, while autoscaling clusters matched resources to workloads for optimal spend and performance. As a result, developing and deploying new model features takes only weeks, including full historical backfills.
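For readers curious what that kind of automation can look like, here is a minimal Python sketch of a trigger, monitor, and rerun loop. It is an illustration under stated assumptions, not Capital One's implementation: the `JobsClient` and `run_with_retries` names, the retry count, and the polling interval are all hypothetical. The two endpoints it calls (`/api/2.1/jobs/run-now` and `/api/2.1/jobs/runs/get`) and the `state.life_cycle_state` and `state.result_state` response fields are taken from the public Databricks Jobs API 2.1; check the current API reference before relying on them.

```python
import json
import time
import urllib.request

# Life-cycle states after which a run will not progress any further.
TERMINAL_STATES = {"TERMINATED", "SKIPPED", "INTERNAL_ERROR"}


class JobsClient:
    """Hypothetical thin wrapper over two Jobs API 2.1 endpoints."""

    def __init__(self, host, token):
        self.host = host.rstrip("/")
        self.token = token

    def _call(self, method, path, payload=None):
        data = json.dumps(payload).encode() if payload is not None else None
        req = urllib.request.Request(
            f"{self.host}{path}",
            data=data,
            method=method,
            headers={
                "Authorization": f"Bearer {self.token}",
                "Content-Type": "application/json",
            },
        )
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)

    def run_now(self, job_id):
        # Trigger an existing job and return the id of the new run.
        return self._call("POST", "/api/2.1/jobs/run-now", {"job_id": job_id})["run_id"]

    def get_run(self, run_id):
        # Fetch the current state of a run.
        return self._call("GET", f"/api/2.1/jobs/runs/get?run_id={run_id}")


def run_with_retries(client, job_id, max_attempts=3, poll_seconds=30, sleep=time.sleep):
    """Trigger a job, poll until it finishes, and rerun it if it did not succeed."""
    for attempt in range(1, max_attempts + 1):
        run_id = client.run_now(job_id)
        while True:
            state = client.get_run(run_id)["state"]
            if state["life_cycle_state"] in TERMINAL_STATES:
                break
            sleep(poll_seconds)
        if state.get("result_state") == "SUCCESS":
            return {"run_id": run_id, "attempts": attempt}
    raise RuntimeError(f"job {job_id} failed after {max_attempts} attempts")
```

A caller would build `JobsClient` with a workspace URL and access token and pass it to `run_with_retries` along with a job id. Because the client is passed in rather than created inside the function, the loop can be exercised against a stub without touching a real workspace.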

Raquel Goosey, Data Engineering Manager at Capital One, emphasized, “Once we have the logic for a feature that needs to be delivered into the production environment, we can streamline the process and get it shipped in a much shorter timeframe than previous iterations of technology.”

Faster performance, lower costs, better decisions

The Databricks Platform delivered transformative results. Capital One improved job completion speeds by 60X, which has the potential to unlock significant efficiency gains as workloads scale.

The benefits also carried over to daily operations. By migrating production pipelines to Databricks, the team realized the same performance gains and cost efficiencies in real-world workloads. “We moved our execution from the alternate compute into Databricks for our operational use case. As a result, we have seen an 80% decrease in time and cost per job,” noted Animesh. These savings provided budget flexibility to reinvest in innovation while maintaining consistency and reliability across all data products.

A unified platform further enables Capital One to use trustworthy and timely data to drive insights and key decisions for the business and its customers. The results reflect not just efficiency gains but also enhanced data management. With faster analytics, lower costs, and more reliable features, the bank can continue to deliver superior products and services that make a real difference in millions of people’s lives.
