Unified Batch and Streaming Source and Sink:
A table in Delta Lake is both a batch table and a streaming source and sink. Streaming data ingest, batch historic backfill, and interactive queries all work out of the box.
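For illustration, a minimal PySpark sketch of one table serving all three roles (the paths, session configuration, and checkpoint location below are placeholder assumptions, not fixed names):

from pyspark.sql import SparkSession

# Spark session with the Delta Lake extensions enabled (assumed setup)
spark = (SparkSession.builder
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate())

# Batch read of the table
batch_df = spark.read.format("delta").load("/data/events")

# The same table read as a streaming source
stream_df = spark.readStream.format("delta").load("/data/events")

# And used as a streaming sink for a second Delta table
(stream_df.writeStream
    .format("delta")
    .option("checkpointLocation", "/checkpoints/events")
    .start("/data/events_mirror"))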
Schema Enforcement:
Delta Lake provides the ability to specify your schema and enforce it. This helps ensure that the data types are correct and required columns are present, preventing bad data from causing data corruption.
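A short sketch of enforcement in practice, reusing the Spark session above (the table path and column types are hypothetical): a write whose types do not match the table's schema is rejected rather than silently applied.

# Assume /data/events was created with columns (id: long, name: string)
bad_df = spark.createDataFrame([(1, 2.5)], ["id", "name"])  # name is a double here

# This append fails with a schema-mismatch error instead of corrupting the table
bad_df.write.format("delta").mode("append").save("/data/events")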
Schema Evolution:
Big data is continuously changing. Delta Lake enables you to make changes to a table's schema that are applied automatically, without the need for cumbersome DDL.
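One way to opt in is Delta's mergeSchema write option; a minimal sketch, assuming the incoming DataFrame new_df carries a column the table has not seen before:

# new_df has an extra column not yet in the table's schema;
# mergeSchema adds it to the schema as part of the write
(new_df.write
    .format("delta")
    .mode("append")
    .option("mergeSchema", "true")
    .save("/data/events"))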
100% Compatible with Apache Spark API:
Developers can use Delta Lake with their existing data pipelines with minimal change, as it is fully compatible with Apache Spark, the commonly used big data processing engine.
Instead of parquet…
dataframe.write
  .format("parquet")
  .save("/data")
…simply say delta
dataframe.write
  .format("delta")
  .save("/data")
To add your organization here, email our user list at [email protected].
Communicate with fellow Delta users and Delta engineers, ask questions, and share tips: join our Slack channel.