Delta Live Tables
Reliable data engineering made easy
Delta Live Tables (DLT) makes it easy to build and manage reliable batch and streaming data pipelines that deliver high-quality data on the Databricks Lakehouse Platform. DLT helps data engineering teams simplify ETL development and management with declarative pipeline development, automatic data testing, and deep visibility for monitoring and recovery.
Easily build and maintain data pipelines
With Delta Live Tables, easily define end-to-end data pipelines in SQL or Python. Simply specify the data source, the transformation logic, and the destination state of the data — instead of manually stitching together siloed data processing jobs. Automatically maintain all data dependencies across the pipeline and reuse ETL pipelines with environment-independent data management. Run in batch or streaming mode and specify incremental or complete computation for each table.
Automatic data quality testing
Delta Live Tables helps to ensure accurate and useful BI, data science and machine learning with high-quality data for downstream users. Prevent bad data from flowing into tables through validation and integrity checks and avoid data quality errors with predefined error policies (fail, drop, alert or quarantine data). In addition, you can monitor data quality trends over time to get insight into how your data is evolving and where changes may be necessary.
Cost-effective streaming through efficient compute autoscaling
Delta Live Tables Enhanced Autoscaling is designed to handle streaming workloads which are spiky and unpredictable. It optimizes cluster utilization by only scaling up to the necessary number of nodes while maintaining end-to-end SLAs, and gracefully shuts down nodes when utilization is low to avoid unnecessary spend.
Deep visibility for pipeline monitoring and observability
Gain deep visibility into pipeline operations with tools to visually track operational stats and data lineage. Reduce downtime with automatic error handling and easy replay. Speed up maintenance with single-click deployment and upgrades.
Shell trusts Delta Live Tables
“At Shell, we are aggregating all our sensor data into an integrated data store. Delta Live Tables has helped our teams save time and effort in managing data at the multi-trillion-record scale and continuously improving our AI engineering capability. With this capability augmenting the existing lakehouse architecture, Databricks is disrupting the ETL and data warehouse markets, which is important for companies like ours. We are excited to continue to work with Databricks as an innovation partner.”
Unify batch and streaming ETL
Build and run both batch and streaming pipelines in one place with controllable and automated refresh settings, saving time and reducing operational complexity. For data streaming on the lakehouse, streaming ETL with Delta Live Tables is the best place to start.
Simplify data pipeline deployment and testing
With different copies of data isolated and updated through a single code base, data lineage information can be captured and used to keep data fresh anywhere. So the same set of query definitions can be run in development, staging and production.
Meet regulatory requirements
Capture all information about your table for analysis and auditing automatically with the event log. Understand how data flows through your organization and meet compliance requirements.