Deep Dive Into Streaming on Databricks
Overview
| Experience | In Person |
|---|---|
| Track | Data Engineering & Streaming |
| Industry | Enterprise Technology, Manufacturing, Financial Services |
| Technologies | Lakeflow |
| Skill Level | Intermediate |
A factory detects an equipment anomaly and needs the alert in seconds. A financial platform needs to process a transaction, enrich it with account history, and flag it for fraud—all before the customer closes the app. These workloads both involve streaming data, but they have very different requirements under the hood.
This session introduces the two core streaming capabilities on Databricks, Real-Time Mode and Zerobus Ingest, and shows how to match each to your workload.
Real-Time Mode on Apache Spark™ Structured Streaming is for when data needs to be acted upon, not just stored. With sub-second, event-driven computation and expressive stateful processing via TransformWithState, it's the right choice for operational workloads where latency directly affects business outcomes: live betting platforms, ad attribution pipelines, fraud detection, and real-time personalization.
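To make the idea of expressive stateful processing concrete, here is a minimal sketch in plain Python, not the actual Spark `TransformWithState` API: each event updates per-key state and can emit an alert the moment an anomalous value arrives. The `AnomalyDetector` class, its threshold, and the sample events are all hypothetical illustrations.

```python
# Conceptual sketch only (plain Python, NOT Spark's TransformWithState API):
# per-key stateful, event-driven processing. State lives with the key, and
# output is emitted as soon as the triggering event is processed.

from collections import defaultdict

class AnomalyDetector:
    """Hypothetical processor: tracks readings seen per machine and
    flags any reading above a fixed threshold."""

    def __init__(self, threshold: float):
        self.threshold = threshold
        self.state = defaultdict(int)  # per-key state: events seen so far

    def process(self, key: str, value: float):
        self.state[key] += 1
        if value > self.threshold:
            # Emit an alert immediately, without waiting for a batch.
            return {"machine": key, "reading": value, "seen": self.state[key]}
        return None

detector = AnomalyDetector(threshold=100.0)
events = [("press-1", 42.0), ("press-2", 130.5), ("press-1", 97.0)]
alerts = [a for a in (detector.process(k, v) for k, v in events) if a]
print(alerts)  # a single alert, for the anomalous press-2 reading
```

In the real Spark API, the engine manages this per-key state for you and checkpoints it for fault tolerance; the sketch only shows the programming model of keyed state plus immediate, event-driven output.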
Zerobus Ingest is a Kafka-free ingestion capability that lets applications and devices write event data directly into Delta Lake: no message broker, no compute overhead, and governed delivery out of the box. It's designed to move data fast and at scale. It does not process or transform data in flight, and for ingestion-heavy workloads, that's exactly the right tradeoff.
We'll contrast the two tools, walk through a workload decision framework, and show when to use each and how to chain them together for end-to-end real-time data flow on Databricks.
Session Speakers
Navneeth Nair
Staff Product Manager
Databricks
Victoria Bukta
Member of Product Staff
Databricks