Announcing Lakebase Change Data Feed (CDF)

Opening the OLTP database to other engines

by Pranav Aurora, Cheng Chen and Hristo Stoyanov

  • Lakebase Change Data Feed (Public Preview) eliminates pipeline sprawl from operational databases. Turn on CDF once per Lakebase project to expose every table's changes through Unity Catalog Managed Tables for direct read access by any engine, model, or agent.
  • Native CDC governed end-to-end without sidecar infrastructure: no database connectors, replication state monitoring, or separate extraction jobs; downstream consumers like SDP streaming pipelines, DBSQL materialized views, and Agent Bricks embeddings all subscribe to the same isolated feed without impacting the primary workload.
  • Operational data now functions as the native Bronze layer in the medallion architecture. Lakebase Synced Tables already serve Gold data to applications; Lakebase CDF closes the loop with full Unity Catalog governance and lineage across the data lifecycle.

Moving data from your operational database has traditionally meant setting up and monitoring a pipeline from each source to each destination. For most teams, the result is brittle and ungoverned, and the human effort grows linearly, O(n), with the number of pipelines.

Today, we’re changing this approach. Available now in Public Preview, Lakebase features a Change Data Feed (CDF) that is stored and governed in Unity Catalog Managed Tables. Enable the feed once and allow all engines, models, and agents to read from it directly. 

Set up Lakebase CDF in just a few clicks.

Why is landing operational data into the lake still so hard?

While Lakeflow Connect has made ingesting data into the Lakehouse trivial, getting data out of the OLTP database remains a manual, high-friction process. Extracting Change Data Capture (CDC) forces teams to configure database connectors, babysit replication states, mitigate performance impacts, and track errors through disconnected tools. This model breaks down in agent-first development that relies on rapid data branching. Maintaining complex, ungoverned extraction pipelines for every new branch to every destination is unsustainable.

We solved this in the Lakehouse. Now we’re bringing it to Lakebase.

The Lakehouse eliminated extraction pipelines for analytics by storing data once in open formats (Apache Iceberg™, Delta Lake). It established Change Data Feed (CDF) as the standard for downstream replication, powering ETL, streaming workflows, and audit logs.

Lakebase CDF syncs row-level changes

You can now set up that CDF natively on Lakebase. It takes less than a minute to enable, applying to all tables within a project. From this single feed, you can build streaming pipelines with SDP, generate materialized views with DBSQL, or compute and store embeddings with Agent Bricks. Every downstream consumer subscribes to the exact same feed, completely isolated from your primary operational workload.
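To make "every consumer subscribes to the same feed" concrete, here is a minimal, self-contained sketch in plain Python of two independent consumers replaying one list of row-level changes. It does not call any Lakebase or Databricks API. The `Change` record, its field names, and the change-type strings are assumptions for illustration, modeled loosely on the change types used by Delta Lake's change data feed; this post does not specify the actual Lakebase CDF schema.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Any, Dict, Iterable

# Hypothetical change-record shape (an assumption, not the Lakebase CDF schema).
@dataclass(frozen=True)
class Change:
    version: int         # commit order of the change
    change_type: str     # "insert", "update_preimage", "update_postimage", or "delete"
    key: int             # primary key of the affected row
    row: Dict[str, Any]  # row image carried by the change


def apply_changes(state: Dict[int, Dict[str, Any]],
                  changes: Iterable[Change]) -> Dict[int, Dict[str, Any]]:
    """Consumer 1: replay changes in commit order to maintain a replica."""
    for change in sorted(changes, key=lambda c: c.version):
        if change.change_type in ("insert", "update_postimage"):
            state[change.key] = dict(change.row)   # upsert the new row image
        elif change.change_type == "delete":
            state.pop(change.key, None)            # drop the row if present
        # "update_preimage" only records the old values, so the replica skips it
    return state


def summarize(changes: Iterable[Change]) -> Counter:
    """Consumer 2: an audit-style count of changes by type."""
    return Counter(change.change_type for change in changes)


# One shared feed, read by both consumers without touching the source tables.
feed = [
    Change(1, "insert", 1, {"id": 1, "status": "new"}),
    Change(1, "insert", 2, {"id": 2, "status": "new"}),
    Change(2, "update_preimage", 1, {"id": 1, "status": "new"}),
    Change(2, "update_postimage", 1, {"id": 1, "status": "paid"}),
    Change(3, "delete", 2, {"id": 2, "status": "new"}),
]

replica = apply_changes({}, feed)
audit = summarize(feed)
print(replica)
print(dict(audit))
```

Both consumers read the same immutable list and keep their own state, which is the property that lets a streaming pipeline, a materialized view, and an embedding job share one feed without coordinating with each other or with the primary database.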

Operational databases belong in the medallion architecture

With Lakebase, your operational data is no longer isolated from the Lakehouse. Lakebase already offers Synced Tables, establishing the pattern of serving Gold datasets directly to applications. Lakebase CDF completes the architecture. Your operational database is now your native Bronze layer, eliminating the need for separate pipelines or extraction jobs to land data into the Lakehouse. Instead, you get full governance and lineage across the data lifecycle through Unity Catalog.

This is just the start. We are bringing the openness you love from the Lakehouse directly to Lakebase. Stay tuned for Data and AI Summit, and join our breakout session on this architecture.
