Engineering blog

Adaptive Query Execution in Structured Streaming

In Databricks Runtime, Adaptive Query Execution (AQE) is a performance feature that continuously re-optimizes batch queries using runtime statistics during query execution. Starting...
Engineering blog

Latency goes subsecond in Apache Spark Structured Streaming

Apache Spark Structured Streaming is the leading open source stream processing platform. It is also the core technology that powers streaming on the...
Engineering blog

How Collective Health uses Delta Live Tables and Structured Streaming for Data Integration

April 13, 2023 by Mragesh Khandelwal and Mahmoud Saleh in Customers
Collective Health is not an insurance company. We're a technology company that's fundamentally making health insurance work better for everyone— starting with the...
Engineering blog

Scalable Spark Structured Streaming for REST API Destinations

March 2, 2023 by Art Rask and Jay Palaniappan in Engineering Blog
Spark Structured Streaming is the widely used open source engine at the foundation of data streaming on the Databricks Lakehouse Platform. It can...
Engineering blog

Build Reliable and Cost Effective Streaming Data Pipelines With Delta Live Tables’ Enhanced Autoscaling

This year we announced the general availability of Delta Live Tables (DLT), the first ETL framework to use a simple, declarative approach...
Engineering blog

Python Arbitrary Stateful Processing in Structured Streaming

October 18, 2022 by Hyukjin Kwon and Jungtaek Lim in Engineering Blog
More and more customers are using Databricks for their real-time analytics and machine learning workloads to meet the ever increasing demand of their...
Engineering blog

State Rebalancing in Structured Streaming

In light of the accelerated growth and adoption of Apache Spark Structured Streaming, Databricks announced Project Lightspeed at Data + AI Summit 2022...
Engineering blog

Using Streaming Delta Live Tables and AWS DMS for Change Data Capture From MySQL

September 29, 2022 by Neil Patel in Platform Blog
In this article we will walk you through the steps to create an end-to-end CDC pipeline with Terraform using Delta Live Tables, AWS...
Engineering blog

Databricks at Current 2022

Current 2022, organized by Confluent, is the first-ever data streaming industry event – and it's coming up soon! No matter where you...
Engineering blog

Simplifying Streaming Data Ingestion into Delta Lake

September 12, 2022 by Sachin Patil in Engineering Blog
Most business decisions are time sensitive and require harnessing data in real time from different types of sources. Sourcing the right data at...