How the Databricks Data Intelligence Platform unifies siloed AML systems, ML risk scoring, and a fleet of AI agents into one governed workflow: from alert to filed SAR.
by Kateryna Savchyn, Pavithra Rao, Mimi Park and Emerson Bayuk
The anti-money laundering (AML) function in financial services has historically been organized around two responsibilities: clearing alerts of potential money-laundering activity and documenting the disposition of every case, including filing Suspicious Activity Reports (SARs) when warranted, all while sustaining program effectiveness and process auditability. That model is now under pressure. Evolving financial-crime typologies, regulatory expectations for real-time explainability, and the maturity of generative AI are reshaping what a modern AML practice looks like. AML leaders are increasingly expected to direct analyst time toward genuine financial-crime intelligence rather than the data-gathering, false-positive triage, and narrative drafting that dominate workloads today.
The constraint is rarely talent or intent. It is the structural drag imposed on every alert by fragmented systems, opaque vendor scoring, and manual evidence assembly. Until that drag is removed, AML programs, however well-funded, remain stuck in backlog-clearing mode.
The typical AML investigation cycle today is manual and error-prone. Analysts spend three to six hours per case extracting and correlating data across 10 or more siloed systems, including Know Your Customer (KYC), transaction monitoring, sanctions screening, case management, adverse media, beneficial ownership, internal CRM, branch logs, and regulatory knowledge bases, then stitching the results together in spreadsheets and Word templates. The majority of that time is spent on false positives: PwC estimates that 90 to 95 percent of all alerts generated by transaction-monitoring systems are non-actionable, yet each one consumes the same investigative effort as a true positive because nothing connects the evidence automatically. First-generation rules-based monitoring is increasingly outpaced by modern AI-driven fraud techniques.
The drag shows up in four places:
The cumulative effect is a backlog that grows faster than headcount can clear it. In the PwC EMEA AML Survey 2024, 44% of financial institutions cite the escalation of financial-crime regulations as the single most pressing factor complicating compliance operations — and the next decade's typologies (real-time payments, embedded finance, crypto-fiat bridges, synthetic identity at scale) will only widen the gap.
To move from backlog-clearing to investigation, AML teams need a platform that does not merely store alerts but reasons over them, and does so under the governance posture a regulator expects to see. The Databricks Data Intelligence Platform brings transaction monitoring, KYC, sanctions screening, regulatory knowledge, and AI agents together under Unity Catalog governance, with full lineage from raw transaction to filed SAR. Each component is composable rather than all-or-nothing: institutions can adopt the full stack end-to-end or layer individual pieces into existing workflows, which is particularly useful for teams just beginning to modernize. Six capabilities distinguish this approach from incumbent AML stacks:
Unity Catalog consolidates 10+ siloed systems into a single, governed lakehouse. Core banking, transaction-monitoring streams, KYC profiles, sanctions hits, case history, and the institution's library of AML policy documents are ingested via Lakeflow Connect into a Bronze → Silver → Gold medallion architecture, with Delta-enforced data quality, column masking for customer PII, and row-level security tied to team and role. Every downstream artifact (the risk score, the agent's evidence chain, the SAR report) is lineage-tracked back to its source row and ingestion timestamp. When the examiner asks what triggered the alert, what evidence supported the filing, or how the institution handled structurally similar cases, the answer is a reproducible query rather than an analyst's recollection. Governance, lineage, and quality enforcement are properties of the platform, not an overlay.
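To make the column-masking and row-level-security idea concrete, here is a minimal plain-Python sketch of the behavior. It is not Unity Catalog code: on the platform these policies are declared on the tables themselves. The `User` type, the field names, and the `governed_view` helper are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass

# Hypothetical customer records; field names are illustrative only.
CUSTOMERS = [
    {"customer_id": "C1", "ssn": "123-45-6789", "region": "EMEA", "risk": "high"},
    {"customer_id": "C2", "ssn": "987-65-4321", "region": "AMER", "risk": "low"},
]


@dataclass(frozen=True)
class User:
    name: str
    region: str        # team scope used for the row filter
    can_see_pii: bool  # role flag used for the column mask


def mask_ssn(ssn: str) -> str:
    """Hide everything except the last four digits."""
    return "***-**-" + ssn[-4:]


def governed_view(user: User, rows: list) -> list:
    """Return only the rows and column values this user is allowed to see."""
    visible = []
    for row in rows:
        if row["region"] != user.region:
            continue  # row-level security: other teams' rows are hidden
        out = dict(row)  # copy so the source data is never modified
        if not user.can_see_pii:
            out["ssn"] = mask_ssn(row["ssn"])  # column masking for PII
        visible.append(out)
    return visible


analyst = User("analyst_emea", region="EMEA", can_see_pii=False)
view = governed_view(analyst, CUSTOMERS)
```

The point of the sketch is that the policy travels with the data: the analyst queries one view, and what comes back depends on team and role rather than on which copy of the data was exported.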
Static rules engines are augmented, not replaced. The Databricks Data Intelligence Platform gives data science and financial-crime teams the foundation to develop, train, and serve state-of-the-art ML models tailored to the institution's own transaction history, customer base, and risk profile — feeding richer signals into both the alert queue and the in-investigation context. Models are registered in MLflow with champion/challenger aliases and full experiment tracking; Model Serving exposes the active model; Lakehouse Monitoring observes drift and performance in production; and inference tables capture analyst feedback that feeds challenger retraining. As challengers prove superior, teams promote them through MLflow's lifecycle management. Every alert can surface an explanation of the business rules and ML signals that triggered it, so the analyst opens a case already knowing why it landed in the queue. The result is a 75% reduction in false positives reaching the analyst queue — without rip-and-replace of the underlying transaction-monitoring rules engine.
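The champion/challenger promotion decision can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the platform's implementation: the `ModelVersion` type, the promotion rule (higher precision with no loss of recall), and the sample outcomes are all hypothetical, and the `aliases` dictionary merely stands in for the registry aliases that MLflow manages (for example via `MlflowClient.set_registered_model_alias`).

```python
from dataclasses import dataclass


@dataclass
class ModelVersion:
    version: int
    # Each outcome is (model_flagged, analyst_confirmed_suspicious),
    # i.e. a model decision paired with analyst feedback.
    outcomes: list


def precision_recall(outcomes):
    """Compute precision and recall from (flagged, confirmed) pairs."""
    tp = sum(1 for flagged, truth in outcomes if flagged and truth)
    fp = sum(1 for flagged, truth in outcomes if flagged and not truth)
    fn = sum(1 for flagged, truth in outcomes if not flagged and truth)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def should_promote(champion, challenger):
    """Hypothetical rule: promote only if the challenger raises precision
    (fewer false positives) without lowering recall (no missed cases)."""
    champ_p, champ_r = precision_recall(champion.outcomes)
    chall_p, chall_r = precision_recall(challenger.outcomes)
    return chall_p > champ_p and chall_r >= champ_r


# Made-up feedback data: both models catch the 2 true cases,
# but the challenger raises far fewer false alerts.
champion = ModelVersion(1, [(True, True)] * 2 + [(True, False)] * 8)
challenger = ModelVersion(2, [(True, True)] * 2 + [(True, False)] * 2 + [(False, False)] * 6)

aliases = {"champion": champion.version, "challenger": challenger.version}
if should_promote(champion, challenger):
    aliases["champion"] = challenger.version  # stand-in for an alias update
```

Requiring recall not to drop is a deliberate choice in this sketch: in AML, a model that reduces false positives by missing true suspicious activity is not an improvement.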
The core of the modernization is a multi-agent chat assistant, built on Agent Bricks, that orchestrates a fleet of specialized sub-agents during an investigation. Rather than logging into multiple systems to manually cross-correlate data, the analyst works from a single investigation page that surfaces past diligence notes, case notes, prior SAR filings, transaction patterns, and entity relationships. The agent fleet scans the full network of available data and returns an informed recommendation on how to handle the case, with the human firmly in the loop for the final decision: escalate to a specialist team, dismiss as a false positive, or proceed to SAR filing. The end-to-end effect: an investigation that previously required three to six hours of manual work compresses to minutes of agent-augmented review.
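The orchestration pattern described above can be sketched without any agent framework. The following is a simplified, hypothetical illustration rather than Agent Bricks code: the three sub-agents, the point-based scoring, and the thresholds are invented for the example, and real sub-agents would query governed data and call models rather than read a dictionary.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    source: str
    summary: str
    risk_points: int


# Hypothetical sub-agents: each inspects one evidence source for a case.
def kyc_agent(case):
    high = case["kyc_risk"] == "high"
    return Finding("kyc", f"KYC risk rating is {case['kyc_risk']}", 2 if high else 0)


def sanctions_agent(case):
    hit = case["sanctions_hit"]
    return Finding("sanctions", "sanctions match" if hit else "no sanctions match", 3 if hit else 0)


def history_agent(case):
    n = case["prior_sars"]
    return Finding("history", f"{n} prior SAR filing(s)", 2 if n > 0 else 0)


SUB_AGENTS = [kyc_agent, sanctions_agent, history_agent]


def orchestrate(case):
    """Run every sub-agent, keep the evidence chain, and suggest a disposition."""
    findings = [agent(case) for agent in SUB_AGENTS]
    score = sum(f.risk_points for f in findings)
    # Illustrative thresholds only; a real program would calibrate these.
    if score >= 5:
        recommendation = "proceed_to_sar"
    elif score >= 3:
        recommendation = "escalate"
    else:
        recommendation = "dismiss"
    return findings, recommendation


def final_decision(recommendation, analyst_choice):
    """The analyst's choice always wins; the recommendation is kept for audit."""
    return {"recommended": recommendation, "decided": analyst_choice}


case = {"customer_id": "C1", "kyc_risk": "high", "sanctions_hit": False, "prior_sars": 1}
findings, recommendation = orchestrate(case)
record = final_decision(recommendation, analyst_choice="escalate")
```

Keeping the findings alongside the recommendation matters as much as the recommendation itself: it is the evidence chain the analyst reviews and the examiner can later reproduce.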

When the analyst proceeds to SAR filing, the same agent fleet pre-populates the contextual metadata gathered during the investigation and drafts a custom overview and narrative for the report. The analyst fact-checks, customizes, and generates the PDF; the AI structures the document to follow the format specifications required by the institution before submission. Filed reports are pushed to the backend with a fully traceable audit record. SAR report building, which traditionally took hours, completes in minutes. Filing also closes the loop: the new SAR is immediately surfaced as additional context and evidence for cases that the rest of the AML team is analyzing in parallel.

A graph layer, surfaced through interactive visualizations in the analyst workbench, lets the analyst move from the investigation page into a full graph view, ask natural-language questions of the graph itself, or jump into any individual entity to explore counterparty relationships. This uncovers the hidden network patterns that rule-based systems miss: shell companies, layering structures, and circular fund flows.
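Circular fund flows are a good example of a pattern that is easy to state on a graph and hard to express as a per-transaction rule. As a minimal sketch, assuming transfers are available as (sender, receiver) pairs, a depth-first search can enumerate the simple cycles. The `find_cycles` helper and the sample accounts are hypothetical, and a production graph layer would also weigh amounts and timing.

```python
from collections import defaultdict


def find_cycles(transfers):
    """Return each simple cycle once, as a list of accounts that starts
    at the cycle's smallest account name."""
    graph = defaultdict(set)
    for src, dst in transfers:
        graph[src].add(dst)

    cycles = []

    def dfs(start, node, path, on_path):
        for nxt in sorted(graph.get(node, ())):
            if nxt == start:
                cycles.append(path[:])  # funds returned to the origin
            elif nxt > start and nxt not in on_path:
                # Only visit accounts "larger" than the start so that each
                # cycle is reported exactly once, from its smallest member.
                on_path.add(nxt)
                path.append(nxt)
                dfs(start, nxt, path, on_path)
                path.pop()
                on_path.remove(nxt)

    for start in sorted(graph):
        dfs(start, start, [start], {start})
    return cycles


# Made-up transfers: A -> B -> C -> A is a loop; C -> D -> E is not.
transfers = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D"), ("D", "E")]
cycles = find_cycles(transfers)
```

Here `cycles` contains the single loop through accounts A, B, and C, while the chain through D and E is ignored because the money never returns to where it started.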

AML leadership lands on an executive view that surfaces case-volume KPIs, hours spent, and alerts past due; trend lines for detection and aging; a process-flow visualization from detection through team assignment to resolution; and breakouts by scenario and criticality. A Team Performance view drills into incident throughput, due-date pressure, and average turnaround by detection type and team — making it straightforward to identify bottlenecks in the process and opportunities to rebalance the team to meet critical deadlines. Natural-language chat over the same governed data allows self-service deep dives into trends without waiting on an analytics team: Genie lets AML leaders ask, "Which advisor relationships have triggered the most structuring alerts in the last quarter, and what is the false-positive rate by team?" and receive an audit-ready answer in seconds.
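The Team Performance metrics named above (open cases, cases past due, average turnaround) reduce to simple aggregations over case records. A minimal sketch, assuming each case carries a team, an opened date, a due date, and an optional closed date; the record layout, the `team_kpis` helper, and the sample data are hypothetical.

```python
from collections import defaultdict
from datetime import date

# Made-up case records; "closed" is None while a case is still open.
cases = [
    {"team": "T1", "opened": date(2024, 1, 1), "due": date(2024, 1, 10), "closed": date(2024, 1, 5)},
    {"team": "T1", "opened": date(2024, 1, 3), "due": date(2024, 1, 8), "closed": None},
    {"team": "T2", "opened": date(2024, 1, 2), "due": date(2024, 1, 20), "closed": date(2024, 1, 10)},
    {"team": "T2", "opened": date(2024, 1, 4), "due": date(2024, 1, 30), "closed": date(2024, 1, 6)},
]


def team_kpis(cases, today):
    """Per team: open cases, open cases past due, and mean days to close."""
    open_count = defaultdict(int)
    past_due = defaultdict(int)
    turnaround_days = defaultdict(list)
    teams = set()
    for case in cases:
        team = case["team"]
        teams.add(team)
        if case["closed"] is None:
            open_count[team] += 1
            if case["due"] < today:
                past_due[team] += 1
        else:
            turnaround_days[team].append((case["closed"] - case["opened"]).days)
    result = {}
    for team in sorted(teams):
        days = turnaround_days[team]
        result[team] = {
            "open": open_count[team],
            "past_due": past_due[team],
            "avg_turnaround_days": sum(days) / len(days) if days else None,
        }
    return result


kpis = team_kpis(cases, today=date(2024, 1, 12))
```

With this sample data, team T1 has one open case that is already past due, which is the kind of signal a leader would use to rebalance workload before a deadline is missed.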

AML teams no longer have to choose between analyst productivity and regulatory defensibility. A governed Data Intelligence Platform, one where alerts, evidence, agents, and audit trails live in the same lineage-tracked environment, delivers both. The legacy posture of "more analysts, more vendors, more spreadsheets" is no longer competitive against institutions that have unified their compliance data and let AI agents carry the multi-source investigation load. The shift is not a future-state aspiration; it is an operational decision available today.
The solution is composed of five capabilities available on the Databricks Data Intelligence Platform:
The five layers can be deployed independently or as a complete stack. A bank already running its own transaction-monitoring engine can adopt only the scoring or reasoning layers to add ML risk scoring and AI-augmented investigation on top of existing alerts; a bank with mature case management but fragmented data can start with the ingestion and governance layer to consolidate sources first. Because every component shares the same Data Intelligence Platform and Unity Catalog governance, partial deployments accumulate toward the full architecture without re-platforming.

▸ Deploy the solution in your workspace
▸ Talk to us: Reach out to your Databricks account team to integrate this with your existing AML workflow today!