Delta Live Tables

Reliable data engineering made easy

Delta Live Tables (DLT) makes it easy to build and manage reliable batch and streaming data pipelines that deliver high-quality data on the Databricks Lakehouse Platform. DLT helps data engineering teams simplify ETL development and management with declarative pipeline development, automatic data testing, and deep visibility for monitoring and recovery.

Easily build and maintain data pipelines

With Delta Live Tables, easily define end-to-end data pipelines in SQL or Python. Simply specify the data source, the transformation logic and the destination state of the data, rather than manually stitching together siloed data processing jobs. DLT automatically maintains all data dependencies across the pipeline, and environment-independent data management lets you reuse the same ETL pipelines across environments. Run in batch or streaming mode and specify incremental or complete computation for each table.
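
As a sketch of what this looks like in practice, here is a minimal declarative pipeline using the DLT Python decorator API. The table names, columns and source path are hypothetical; DLT infers the dependency between the two tables from the dlt.read_stream call, so there is no orchestration code to write.

```python
import dlt
from pyspark.sql.functions import col

# Bronze: incrementally ingest raw JSON files with Auto Loader.
@dlt.table(comment="Raw orders ingested from cloud storage.")
def orders_raw():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("/data/orders/raw")  # hypothetical source path
    )

# Silver: DLT sees the dependency on orders_raw and runs this step
# after it, in batch or streaming mode as configured for the pipeline.
@dlt.table(comment="Orders with a valid ID, ready for downstream use.")
def orders_clean():
    return dlt.read_stream("orders_raw").where(col("order_id").isNotNull())
```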

Automatic data quality testing

Delta Live Tables helps ensure accurate and useful BI, data science and machine learning by delivering high-quality data to downstream users. Prevent bad data from flowing into tables with validation and integrity checks, and handle data quality errors with predefined policies (fail, drop, alert or quarantine data). In addition, you can monitor data quality trends over time to understand how your data is evolving and where changes may be necessary.
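
These policies are expressed as expectations on a table. A minimal sketch in Python, with hypothetical constraint names and columns: @dlt.expect records violations in pipeline metrics while keeping the rows, @dlt.expect_or_drop removes offending rows, and @dlt.expect_or_fail stops the update.

```python
import dlt

@dlt.table(comment="Orders validated before reaching downstream consumers.")
@dlt.expect("non_negative_amount", "amount >= 0")              # record violations, keep rows
@dlt.expect_or_drop("valid_order_id", "order_id IS NOT NULL")  # drop bad rows
@dlt.expect_or_fail("valid_timestamp", "order_ts IS NOT NULL") # fail the update
def orders_validated():
    return dlt.read_stream("orders_clean")
```

Quarantining is typically implemented as a second table that keeps only the rows failing the inverse of these constraints, so bad records stay inspectable without reaching consumers.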

Cost-effective streaming through efficient compute autoscaling

Delta Live Tables Enhanced Autoscaling is designed for streaming workloads that are spiky and unpredictable. It optimizes cluster utilization by scaling up only to the number of nodes needed to maintain end-to-end SLAs, and gracefully shuts down nodes when utilization is low to avoid unnecessary spend.
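
Enhanced Autoscaling is enabled in the pipeline settings. A minimal sketch, assuming a pipeline whose default cluster may scale between one and eight workers (the bounds are illustrative):

```json
{
  "clusters": [
    {
      "label": "default",
      "autoscale": {
        "min_workers": 1,
        "max_workers": 8,
        "mode": "ENHANCED"
      }
    }
  ]
}
```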

Deep visibility for pipeline monitoring and observability

Gain deep visibility into pipeline operations with tools to visually track operational stats and data lineage. Reduce downtime with automatic error handling and easy replay. Speed up maintenance with single-click deployment and upgrades.
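
The operational stats land in the pipeline event log, a Delta table kept under the pipeline's storage location. A minimal sketch of pulling recent errors from it, with a placeholder storage path:

```python
# The event log lives at <storage-location>/system/events.
events = spark.read.format("delta").load(
    "/pipelines/storage/system/events"  # placeholder path
)

# Surface recent error events for troubleshooting.
(events.where("level = 'ERROR'")
       .select("timestamp", "message", "details")
       .show(truncate=False))
```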

Shell trusts Delta Live Tables

“At Shell, we are aggregating all our sensor data into an integrated data store. Delta Live Tables has helped our teams save time and effort in managing data at the multi-trillion-record scale and continuously improving our AI engineering capability. With this capability augmenting the existing lakehouse architecture, Databricks is disrupting the ETL and data warehouse markets, which is important for companies like ours. We are excited to continue to work with Databricks as an innovation partner.”

– Dan Jeavons, General Manager – Data Science, Shell

Use cases

Unify batch and streaming ETL

Build and run both batch and streaming pipelines in one place with controllable and automated refresh settings, saving time and reducing operational complexity. For data streaming on the lakehouse, streaming ETL with Delta Live Tables is the best place to start.
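
The choice between incremental (streaming) and complete (batch) computation is made per table. A minimal sketch with hypothetical table and column names: dlt.read_stream processes only new records on each update, while dlt.read recomputes the table in full.

```python
import dlt

# Incremental: only records that arrived since the last update are processed.
@dlt.table
def events_clean():
    return dlt.read_stream("events_raw")

# Complete: recomputed in full on each update, suitable for small aggregates.
@dlt.table
def events_daily_summary():
    return dlt.read("events_clean").groupBy("event_date").count()
```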

Simplify data pipeline deployment and testing

With different copies of data isolated and updated through a single code base, data lineage information can be captured and used to keep data fresh anywhere, so the same set of query definitions can be run in development, staging and production.
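
One common way to achieve this is to parameterize the pipeline through its configuration, so each environment supplies its own values while the code stays identical. A sketch, assuming a pipeline configuration key named source_path (the key name is hypothetical):

```python
import dlt

@dlt.table(comment="Raw data read from an environment-specific location.")
def raw_data():
    # Each environment's pipeline settings define its own value for this key.
    source_path = spark.conf.get("source_path")
    return spark.read.format("json").load(source_path)
```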

Meet regulatory requirements

Capture all information about your table for analysis and auditing automatically with the event log. Understand how data flows through your organization and meet compliance requirements.
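
Lineage is recorded in the event log as flow_definition events, which can be queried for audits. A minimal sketch, again with a placeholder storage path:

```python
# Each flow_definition event describes how a table is derived from its inputs.
events = spark.read.format("delta").load(
    "/pipelines/storage/system/events"  # placeholder path
)

(events.where("event_type = 'flow_definition'")
       .select("timestamp", "details")
       .show(truncate=False))
```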

Resources

Whitepapers

Power of DLT

Webinars

Tackle Data Transformation Challenges

Demos

ETL Pipelines