Delta Live Tables (DLT) is a declarative ETL framework for the Databricks Data Intelligence Platform that helps data teams simplify streaming and batch ETL cost-effectively. Simply define the transformations to perform on your data and let DLT pipelines automatically manage task orchestration, cluster management, monitoring, data quality and error handling.
Efficient data ingestion
Building production-ready ETL pipelines begins with ingestion. DLT powers easy, efficient ingestion for your entire team — from data engineers and Python developers to data scientists and SQL analysts. With DLT, load data from any data source supported by Apache Spark™ on Databricks.
- Use Auto Loader and streaming tables to incrementally land data into the Bronze layer for DLT pipelines or Databricks SQL queries
- Ingest from cloud storage, message buses and external systems
- Use change data capture (CDC) in DLT to update tables based on changes in source data (a minimal sketch of both patterns follows this list)
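To make that concrete, here is a minimal DLT Python sketch of both ingestion patterns: an Auto Loader streaming table and a CDC target driven by `apply_changes`. The storage path, table names and columns (`orders_bronze`, `customers_cdc_bronze`, `customer_id`, `operation`, `operation_ts`) are hypothetical placeholders, and `spark` is the session DLT provides inside a pipeline notebook.

```python
import dlt
from pyspark.sql.functions import col, expr

# Bronze: Auto Loader ("cloudFiles") incrementally discovers and loads new files
# from cloud storage into a streaming table.
@dlt.table(comment="Raw orders landed incrementally with Auto Loader")
def orders_bronze():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("/Volumes/demo/raw/orders/")  # hypothetical landing path
    )

# CDC: apply inserts, updates and deletes from a change feed to a target table.
dlt.create_streaming_table("customers_silver")

dlt.apply_changes(
    target="customers_silver",
    source="customers_cdc_bronze",            # hypothetical change-feed table
    keys=["customer_id"],
    sequence_by=col("operation_ts"),          # ordering column in the feed
    apply_as_deletes=expr("operation = 'DELETE'"),
    stored_as_scd_type=1,
)
```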
“I love Delta Live Tables because it goes beyond the capabilities of Auto Loader to make it even easier to read files. My jaw dropped when we were able to set up a streaming pipeline in 45 minutes.”
— Kahveh Saramout, Senior Data Engineer, Labelbox
Intelligent, cost-effective data transformation
With just a few lines of code, DLT determines the most efficient way to build and execute your streaming or batch data pipelines, optimizing for price/performance (nearly 4x the Databricks baseline) while minimizing complexity.
- Instantly implement a streamlined medallion architecture with streaming tables and materialized views
- Optimize data quality for maximum business value with features like expectations (see the sketch after this list)
- Refresh pipelines in continuous or triggered mode to fit your data freshness needs
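Here is a hedged sketch of that medallion flow in DLT Python, continuing the hypothetical `orders_bronze` table from the ingestion example; all table and column names are made up for illustration. The first definition is a streaming table with expectations attached; the second is a materialized view that DLT keeps current on every triggered or continuous update.

```python
import dlt
from pyspark.sql.functions import sum as sum_

# Silver: expectations declare data quality rules. Rows violating "valid_amount"
# are dropped; "valid_customer" violations are only recorded as metrics.
@dlt.table(comment="Cleaned orders")
@dlt.expect_or_drop("valid_amount", "amount > 0")
@dlt.expect("valid_customer", "customer_id IS NOT NULL")
def orders_silver():
    return dlt.read_stream("orders_bronze").select(
        "order_id", "customer_id", "amount", "order_ts"
    )

# Gold: a materialized view aggregating the Silver table; DLT refreshes it
# along with the rest of the pipeline.
@dlt.table(comment="Revenue per customer")
def revenue_by_customer():
    return (
        dlt.read("orders_silver")
        .groupBy("customer_id")
        .agg(sum_("amount").alias("total_amount"))
    )
```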
“Delta Live Tables has helped our teams save time and effort in managing data at the multitrillion-record scale and continuously improves our AI engineering capability … Databricks is disrupting the ETL and data warehouse markets.”
— Dan Jeavons, General Manager Data Science, Shell
Simple pipeline setup and maintenance
DLT pipelines simplify ETL development by automating away virtually all the inherent operational complexity. With DLT pipelines, engineers can focus on delivering high-quality data rather than operating and maintaining pipelines. DLT automatically handles:
- Task orchestration
- CI/CD and version control
- Autoscaling compute infrastructure for cost savings
- Monitoring via metrics in the event log (see the query sketch after this list)
- Error handling and failure recovery
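Because every pipeline update writes its metrics to the event log, monitoring reduces to an ordinary query. The sketch below assumes the `event_log` table-valued function available for Unity Catalog-enabled pipelines, an ambient `spark` session in a Databricks notebook, and a placeholder pipeline ID; it summarizes expectation pass/fail counts per dataset.

```python
# Summarize data quality expectation results from a DLT pipeline's event log.
# "<pipeline-id>" is a placeholder for a real pipeline ID.
quality = spark.sql("""
    SELECT
      expectation.dataset,
      expectation.name,
      SUM(expectation.passed_records) AS passing_records,
      SUM(expectation.failed_records) AS failing_records
    FROM (
      SELECT explode(
        from_json(
          get_json_object(details, '$.flow_progress.data_quality.expectations'),
          'array<struct<name: string, dataset: string, passed_records: int, failed_records: int>>'
        )
      ) AS expectation
      FROM event_log("<pipeline-id>")
      WHERE event_type = 'flow_progress'
    )
    GROUP BY expectation.dataset, expectation.name
""")
quality.show()
```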
“Complex architectures, such as dynamic schema management and stateful/stateless transformations, were challenging to implement with a classic multicloud data warehouse architecture. Both data scientists and data engineers can now perform such changes using scalable Delta Live Tables with no barriers to entry.”
— Sai Ravuru, Senior Manager of Data Science and Analytics, JetBlue
Next-gen stream processing engine
Spark Structured Streaming is the core technology that unlocks streaming DLT pipelines, providing a unified API for batch and stream processing. DLT pipelines leverage Spark Structured Streaming's inherent subsecond latency and record-breaking price/performance. Although you can manually build your own performant streaming pipelines with Spark Structured Streaming, DLT pipelines may provide faster time to value, better ongoing development velocity and lower TCO because they automatically manage the operational overhead.
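For comparison, a hand-rolled Structured Streaming job makes that operational overhead explicit: you choose and manage the checkpoint location, schema location, trigger, output mode and restart behavior yourself. A minimal sketch, with hypothetical paths and table names:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("orders-stream").getOrCreate()

# Read the same hypothetical landing zone with Auto Loader, but outside DLT.
raw = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("cloudFiles.schemaLocation", "/Volumes/demo/_schemas/orders")
    .load("/Volumes/demo/raw/orders/")
)

cleaned = raw.where("amount > 0")

# You own the checkpoint, trigger and retry/restart story here;
# in a DLT pipeline these are managed for you.
query = (
    cleaned.writeStream.format("delta")
    .option("checkpointLocation", "/Volumes/demo/_checkpoints/orders_silver")
    .trigger(availableNow=True)
    .outputMode("append")
    .toTable("demo.default.orders_silver")
)
query.awaitTermination()
```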
“We didn’t have to do anything to get DLT to scale. We give the system more data, and it copes. Out of the box, it’s given us the confidence that it will handle whatever we throw at it.”
— Dr. Chris Inkpen, Global Solutions Architect, Honeywell
Delta Live Tables pipelines vs. “build your own” Spark Structured Streaming pipelines
| | Spark Structured Streaming pipelines | DLT pipelines |
|---|---|---|
| Run on the Databricks Data Intelligence Platform | ✓ | ✓ |
| Powered by Spark Structured Streaming engine | ✓ | ✓ |
| Unity Catalog integration | ✓ | ✓ |
| Orchestrate with Databricks Workflows | ✓ | ✓ |
| Ingest from dozens of sources — from cloud storage to message buses | ✓ | ✓ |
| Dataflow orchestration | Manual | Automated |
| Data quality checks and assurance | Manual | Automated |
| Error handling and failure recovery | Manual | Automated |
| CI/CD and version control | Manual | Automated |
| Compute autoscaling | Basic | Enhanced |
Unified data governance and storage
Running DLT pipelines on Databricks means you benefit from the foundational components of the Data Intelligence Platform built on lakehouse architecture — Unity Catalog and Delta Lake. Your raw data is optimized with Delta Lake, the only open source storage framework designed from the ground up for both streaming and batch data. Unity Catalog gives you fine-grained, integrated governance for all your data and AI assets with one consistent model to discover, access and share data across clouds. Unity Catalog also provides native support for Delta Sharing, the industry’s first open protocol for simple and secure data sharing with other organizations.
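As a small illustration of that single governance model, access to a table that a DLT pipeline publishes into Unity Catalog is granted with standard SQL. The catalog, schema, table and group names below are hypothetical, and `spark` is the ambient session in a Databricks notebook.

```python
# Grant read access on a DLT-published table to an analyst group.
spark.sql("GRANT SELECT ON TABLE main.sales.orders_silver TO `data-analysts`")

# Members of the group can now query it like any other Unity Catalog table.
spark.sql("SELECT COUNT(*) FROM main.sales.orders_silver").show()
```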
“We are incredibly excited about the integration of Delta Live Tables with Unity Catalog. This integration will help us streamline and automate data governance for our DLT pipelines, helping us meet our sensitive data and security requirements as we ingest millions of events in real time. This opens up a world of potential and enhancements for our business use cases related to risk modeling and fraud detection.”
— Yue Zhang, Staff Software Engineer, Block