Banks Don’t Have an AI Problem – They Have a Data Platform Problem

Why AI in banking stalls and how data platforms enable scalable, governed AI in production

Published: April 17, 2026

Financial Services · 8 min read

Summary

  • Banks have richer customer data than almost any other institution, but fragmented systems and weak governance are preventing AI from moving beyond pilot phases into production.
  • CBA Live 2026 surfaced a consistent pattern across risk, collections, and relationship banking: the limiting factor is not AI capability, but the data and governance foundation required to support it.
  • The Databricks Lakehouse, Unity Catalog, and Agent Bricks directly address the data quality, model monitoring, real-time personalization, and agentic AI challenges banks are struggling with today.

Under the theme “Make Headway”, CBA Live 2026 brought together several hundred retail banking leaders focused on cutting through complexity and advancing innovation.

But across every session - risk, compliance, collections, and deposit growth - the same underlying theme kept surfacing:

AI innovation doesn’t scale without a strong data and governance foundation.

Beneath the demos and roadmaps, a common pattern emerged. The banks making the real headway are not the ones with the flashiest AI. 

They are the ones with the cleanest, most governed, and most real-time data foundations. 

 

The Scenario That Set the Tone:

CBA President Lindsay Johnson’s keynote described a near-future consumer experience that sounded simple and inevitable.

A consumer wakes up on payday. By the time she reaches for her phone, everything is already done: bills paid, savings allocated, subscriptions renewed, even a transfer sent abroad.

No apps. No logins. No decisions to make.

An AI agent handled it all.

That’s the future banks are building toward.

But here’s the uncomfortable question that didn’t get asked on stage:

What would need to be in place within a bank for that experience to actually work?

Because this isn’t just a better digital experience. It’s a fundamentally different operating model. One where external agents interact with your systems in real time, across products, with full context, and zero tolerance for inconsistency or delay.

And for most institutions, that’s where the gap shows up.

Not in the ambition or the models they are building, but in the data foundation required to make it real.

 

What We Heard in the Sessions:

Across the various sessions, the specific data challenges varied by function, but the underlying theme was consistent.

AI Risk & Compliance: The Governance Gap is Real

Panelists from multiple institutions talked about how model drift - the silent degradation of an AI model as the real-world population it was trained on shifts - is one of the most under-appreciated risks in banking AI. A credit scoring model trained on a 750 average FICO applicant pool can fail quietly when the applicant mix shifts to 650. You need automated triggers watching for this continuously. Most banks do not have them.
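The automated trigger the panelists called for can be sketched with a Population Stability Index (PSI) check, a common way to quantify how far a live score distribution has drifted from the training-time distribution. This is a minimal, illustrative Python example, not a Databricks API; the 0.25 alert threshold is a widely used rule of thumb, not a regulatory value:

```python
import math
from collections import Counter

def psi(expected, actual, bins):
    """Population Stability Index between a training-time score
    distribution and a live one. Rule of thumb: > 0.25 = major drift."""
    def bucket_shares(scores):
        counts = Counter()
        for s in scores:
            # count how many bin edges the score clears
            idx = sum(1 for edge in bins if s >= edge)
            counts[idx] += 1
        total = len(scores)
        # floor shares at a tiny value so the log term is defined
        return [max(counts.get(i, 0) / total, 1e-6) for i in range(len(bins) + 1)]

    e, a = bucket_shares(expected), bucket_shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

# Training pool centered near 750 FICO vs. a live mix near 650
bins = [600, 650, 700, 750, 800]
trained = [740, 755, 760, 748, 770, 765, 752, 758, 745, 762]
live    = [640, 655, 660, 648, 670, 635, 652, 645, 658, 642]

score = psi(trained, live, bins)
if score > 0.25:
    print(f"ALERT: population drift detected (PSI={score:.2f})")
```

Running this check on every scoring batch, and wiring the alert into an incident queue, is the kind of continuous trigger most banks reportedly lack.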

The data quality discipline required for AI governance is also more demanding than many compliance teams anticipated. Internal audit now needs to independently test data lineage and not just accept business-unit attestations. The regulator will not accept "the fintech partner owns the model" as an answer.

Relationship Banking: The Richest Data in Any Industry, Sitting Idle

Multiple sessions made the same observation that banks have richer data on their customers than almost any other institution - more than a doctor, more than a financial advisor. They know about gym memberships, recurring medical payments, spending volatility, and employer deposit patterns. But most of that insight sits fragmented across systems that do not talk to each other in real time.

The friction this creates is real. One panelist described the goal of knowing a customer well enough to detect that they had not yet filed their taxes - and proactively surfacing that insight at exactly the right moment. That kind of personalization requires data that is clean, unified, and accessible in real time. It is not a product feature you can bolt on.

Default Management: Foundation Is Key

A session on AI in collections described what is possible when the data foundation is right: predicting, with 85% accuracy, how many days a newly delinquent account will take to cure - starting from Day 1 of delinquency. That kind of early signal completely changes how you allocate collections resources.

Getting there requires not just internal account data, but the ability to stitch together digital engagement signals (did the customer visit the website without paying?), credit bureau migration data, deposit behavior, and historical resolution patterns - all in a governed, auditable way. The institutions doing this well built the data infrastructure first. The AI capability followed.
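The "stitching" step above can be made concrete as a single feature record that joins signals from separate systems and carries its own lineage metadata. This is a plain-Python sketch; every field name and source label is hypothetical, standing in for whatever the bank's governed schema would define:

```python
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone

@dataclass
class DelinquencyFeatures:
    """One governed, auditable feature record for a newly delinquent
    account. Field names are illustrative, not a real schema."""
    account_id: str
    days_delinquent: int
    visited_site_no_payment: bool   # digital engagement signal
    bureau_score_delta_90d: int     # credit bureau migration
    avg_deposit_30d: float          # deposit behavior
    prior_cures: int                # historical resolution pattern
    sources: dict = field(default_factory=dict)  # lineage metadata
    built_at: str = ""

def build_features(account_id, web_events, bureau, deposits, history):
    """Stitch signals from separate systems into one record,
    noting which source system each field came from."""
    rec = DelinquencyFeatures(
        account_id=account_id,
        days_delinquent=1,
        visited_site_no_payment=any(
            e["page"] == "payments" and not e["paid"] for e in web_events),
        bureau_score_delta_90d=bureau["score_now"] - bureau["score_90d_ago"],
        avg_deposit_30d=sum(deposits) / len(deposits) if deposits else 0.0,
        prior_cures=history["cured_count"],
        sources={"web": "clickstream", "bureau": "credit_bureau_feed",
                 "deposits": "core_banking", "history": "collections_db"},
        built_at=datetime.now(timezone.utc).isoformat(),
    )
    return asdict(rec)

features = build_features(
    "acct-123",
    web_events=[{"page": "payments", "paid": False}],
    bureau={"score_now": 655, "score_90d_ago": 700},
    deposits=[1200.0, 1150.0],
    history={"cured_count": 3},
)
```

The point of the `sources` and `built_at` fields is the auditability the session emphasized: a cure-time prediction made from this record can be traced back to the systems and moment that produced each input.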

Front-Line AI: AI built on Generic Models Decays

The Bank of America session on Erica was a masterclass in what production AI actually looks like at scale. Erica has handled over 3.2 billion customer interactions since launching in 2018 and has made thousands of changes along the way. The lesson from eight years of production AI was clear: this is not a set-it-and-forget-it technology. It requires continuous data tuning, continuous monitoring, and a team whose entire job is reading the edge cases and updating the model.

Another session reinforced this from a different angle - contact center agents at most banks are toggling between 10 to 15 applications to answer a single customer question. The AI agents that will solve that problem are the ones grounded in the bank's own data - not generic LLMs, but tools trained on the bank's policies, products, and customer relationships.

 

The Vendor Reality Check:

One of the most memorable sessions was a frank assessment of the AI vendor landscape. A speaker who had led AI strategy at a major institution shared a finding from a large-scale vendor audit: of several thousand vendors currently claiming AI capabilities, only around 5% have genuine AI in the product. The rest are relabeling robotic process automation or standard automation logic as AI.

The practical guidance for bank technology buyers is to get specific. Ask how the vendor built their AI capability. Ask what LLM orchestration they are using. Ask whether they have full API and MCP coverage. Ask what their business looks like in three years as workflow automation gets commoditized. If they cannot answer those questions specifically, you have your answer.

 

Why This Matters and Where We See It Working:

The themes coming out of CBA Live were not new. They closely reflect the same challenges we see in ongoing conversations with banking customers - fragmented data environments, limited governance, and AI initiatives that struggle to move beyond pilot phases into production. 

This validates a pattern that continues to surface across the institutions we engage with daily: the limiting factor is not AI capability, but the underlying data and governance foundation required to support it.

Let's connect the themes we heard to how Databricks addresses them:

The Data Foundation Problem

Banks struggle to scale AI because customer, risk, and product data are fragmented and inconsistent. The Databricks Lakehouse centralizes batch and streaming data, while Unity Catalog adds one governance layer (permissions, lineage, and classification) so every team works from the same trusted view.

With Lakeflow, banks can reliably ingest and transform data into curated layers, rather than relying on brittle, point-to-point pipelines. Lakebase then extends this foundation to transactional workloads, bringing a fully managed Postgres engine into the same governed platform so that operational apps and AI agents can share data with analytics without creating a separate, opaque OLTP estate.

The Model Drift and Monitoring Problem

Under guidance like SR 11-7, regulators now expect full-lifecycle model risk management. Not just initial validation, but continuous monitoring, drift detection, and periodic re-validation for material models.

On Databricks, MLflow and the Model Registry track experiments and approved versions, while Model Monitoring and Delta Lake capture predictions, inputs, and outcomes over time. That makes SR 11-7-style validation and ongoing performance checks a standard part of the platform, rather than a patchwork of scripts and spreadsheets. For high-impact models such as those driving delinquency predictions or fraud segmentation, these capabilities are rapidly becoming table stakes rather than “advanced” features.
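The "ongoing performance checks" piece can be sketched as a rolling monitor that compares live accuracy against the validated baseline and flags the model for re-validation when it falls below tolerance. This is an illustrative plain-Python sketch of the monitoring logic, not the Databricks Model Monitoring API; the window size and tolerance are assumed values a model risk team would set:

```python
from collections import deque

class ModelMonitor:
    """Rolling performance check in the spirit of SR 11-7 ongoing
    monitoring: flag re-validation when live accuracy falls more than
    a tolerance below the validated baseline. Thresholds illustrative."""
    def __init__(self, baseline_accuracy, window=500, tolerance=0.05):
        self.baseline = baseline_accuracy
        self.tolerance = tolerance
        self.outcomes = deque(maxlen=window)  # True = correct prediction

    def record(self, predicted, actual):
        self.outcomes.append(predicted == actual)

    def needs_revalidation(self):
        if len(self.outcomes) < self.outcomes.maxlen:
            return False  # wait for a full observation window
        live = sum(self.outcomes) / len(self.outcomes)
        return live < self.baseline - self.tolerance

monitor = ModelMonitor(baseline_accuracy=0.85, window=100)
# Simulate a run where the model degrades to ~70% live accuracy
for i in range(100):
    monitor.record(predicted=1, actual=1 if i % 10 < 7 else 0)
```

In a production setup the `record` calls would be fed from logged predictions and realized outcomes in Delta tables, and a `needs_revalidation()` hit would open a model-risk ticket rather than just return a boolean.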

The Real-Time Personalization Problem

To engage customers “in the moment,” banks need fresh, low-latency features, not just overnight aggregates. The Databricks Online Feature Store serves pre-computed features (propensity, risk flags, segments) in milliseconds, while Lakebase provides the latest operational context, such as recent transactions, within the same governance boundary.

A typical flow: an event (card swipe, app login, call) triggers a decision service that reads features from the Online Feature Store, joins Lakebase context, and returns a next-best action consistently across channels. For front-line staff, Genie exposes the same governed data and metrics via natural language, so bankers and agents can ask questions like “What’s this customer’s 90-day deposit trend?” without tickets or ad-hoc extracts, while Unity Catalog enforces policies and lineage underneath.
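The event-to-action flow above can be sketched as a small decision function. The dictionaries here are in-memory stand-ins for a feature store and an operational context store - all names, features, and thresholds are hypothetical, and a real deployment would call the managed services instead:

```python
# Hypothetical stand-ins for precomputed features and live context.
FEATURE_STORE = {
    "cust-42": {"savings_propensity": 0.82, "churn_risk": 0.10},
}
OPERATIONAL_CONTEXT = {
    "cust-42": {"recent_large_deposit": True, "open_complaints": 0},
}

def next_best_action(event):
    """Event-driven decision: join precomputed features with the
    latest operational context and return one consistent action."""
    cust = event["customer_id"]
    feats = FEATURE_STORE.get(cust, {})
    ctx = OPERATIONAL_CONTEXT.get(cust, {})

    # Guardrail first: never market to a customer with an open complaint
    if ctx.get("open_complaints", 0) > 0:
        return "route_to_service_recovery"
    if ctx.get("recent_large_deposit") and feats.get("savings_propensity", 0) > 0.7:
        return "offer_high_yield_savings"
    if feats.get("churn_risk", 0) > 0.5:
        return "offer_retention_call"
    return "no_action"

# Same function answers a card swipe, an app login, or a call,
# which is what keeps the action consistent across channels.
action = next_best_action({"customer_id": "cust-42", "type": "app_login"})
```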

The Agentic AI Problem

Agentic AI in banking means agents that can take constrained actions, such as advancing a collections workflow, kicking off KYC steps, or orchestrating service calls under strict guardrails and oversight.

On Databricks, Agent Bricks orchestrates these agents and tool calls. Databricks Apps host the secure UIs and workflows they plug into. Lakehouse + Unity Catalog controls which data agents can see, with full lineage and audit trails. The Online Feature Store gives them real-time behavioral and risk signals, and Lakebase serves as their operational state store for low-latency reads and writes, all within the same security and governance perimeter.

That lets banks scale agentic workflows on a platform that logs every action and remains explainable and auditable.
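The "constrained actions under guardrails, with every action logged" pattern can be sketched in a few lines: an allowlist of actions with hard limits, and an audit trail that records every attempt, allowed or denied. Action names, limits, and the agent id are all illustrative, not Databricks or Agent Bricks APIs:

```python
from datetime import datetime, timezone

# Allowlist of actions an agent may take, each with a hard limit.
ALLOWED_ACTIONS = {
    "send_payment_reminder": {"max_per_day": 1},
    "offer_payment_plan": {"max_months": 12},
}
AUDIT_LOG = []

def agent_act(agent_id, action, params):
    """Execute an agent action only if it is allowlisted and within
    its guardrail; every attempt, allowed or not, is logged."""
    entry = {"ts": datetime.now(timezone.utc).isoformat(),
             "agent": agent_id, "action": action, "params": params}
    rule = ALLOWED_ACTIONS.get(action)
    if rule is None:
        entry["result"] = "denied:not_allowlisted"
    elif action == "offer_payment_plan" and params.get("months", 0) > rule["max_months"]:
        entry["result"] = "denied:exceeds_limit"
    else:
        entry["result"] = "executed"
    AUDIT_LOG.append(entry)
    return entry["result"]

agent_act("collections-agent-1", "offer_payment_plan", {"months": 6})   # within limit
agent_act("collections-agent-1", "offer_payment_plan", {"months": 24})  # over limit
agent_act("collections-agent-1", "close_account", {})                   # not allowlisted
```

The denied attempts landing in the same log as the executed ones is the point: the record of what the agent tried and was refused is exactly what makes the workflow explainable to an auditor.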

The Explainability and Compliance Problem 

Regulators care less about how “advanced” a model is and more about whether the bank can explain, govern, and evidence its use.

Databricks addresses this by making governance and lineage first-class.

Unity Catalog unifies permissions, lineage, and audit history across data, features, and model artifacts. Delta Lake and Databricks SQL provide versioned, reproducible pipelines, and the MLflow Model Registry and Model Monitoring capture model versions, approvals, and performance and drift over time.

That gives banks a complete, reconstructable record of how data flows, how models were built and validated, and how they influenced decisions, turning explainability and compliance into an enabler for faster, safer, more responsible AI deployment.

 

Final Take:

Banks don’t have an AI problem; they have a data platform problem. 

The pattern is clear: point solutions show early promise, but without a strong, governed data foundation, they stall. The institutions seeing real results are the ones that invested in the platform first, making every AI use case faster to deploy, easier to trust, and defensible to regulators. The platform is not a follow-on decision; it's the starting point.

Questions Worth Taking Back to Your Team:

  • Do we have a single governed source of truth, or are teams working off different versions of data?
  • How quickly do we detect when a model in production goes wrong?
  • Can we explain any AI-driven decision to a regulator today, end-to-end?

If the answers aren’t clear, the next investment isn’t another use case - it’s the foundation.

  • Learn how Databricks helps banks unify data, governance, and AI at scale
  • Explore real-world banking use cases and architectures
  • Connect with our team to discuss your data platform strategy

Disclaimer: We attended CBA Live 2026 in San Diego. The observations in this post are our own, drawn from sessions attended and conversations held throughout the conference.
