Engineering blog

Latency goes subsecond in Apache Spark Structured Streaming

Apache Spark Structured Streaming is the leading open source stream processing platform. It is also the core technology that powers streaming on the...
Engineering blog

Automatically Evolve Your Nested Column Schema, Stream From a Delta Table Version, and Check Your Constraints

We recently announced the release of Delta Lake 0.8.0, which introduces schema evolution and performance improvements in merge and operational metrics in...
Engineering blog

Easily Clone your Delta Lake for Testing, Sharing, and ML Reproducibility

September 15, 2020 by Burak Yavuz and Pranav Anand in Engineering Blog
Introducing Clones: an efficient way to make copies of large datasets for testing, sharing, and reproducing ML experiments. We are excited to introduce...