For years, Apache Spark Structured Streaming has powered some of the world’s most demanding streaming workloads. For ultra-low-latency use cases, however, teams needed to maintain a separate, specialized engine (most commonly Apache Flink) alongside Spark, duplicating codebases, governance models, and operational overhead. Databricks now removes this burden for customers.
Today, we are excited to announce the General Availability of Real-Time Mode (RTM) in Spark Structured Streaming, bringing millisecond-level latency to the Spark APIs you already use. Whether you are detecting fraud in real time or generating fresh, real-time context to steer your AI agents, you can now use Spark to power all of these use cases.
RTM has already been adopted by teams at industry-leading organizations across financial services, e-commerce, media, and ad tech to power fraud detection, live personalization, ML feature computation, and ad attribution.
Coinbase, one of the world’s leading cryptocurrency exchanges, uses RTM to scale their high-frequency risk management and fraud detection engines—processing massive volumes of blockchain and exchange events with the sub-100ms latency necessary to secure millions of digital asset transactions.
“By leveraging Real-Time Mode in Spark Structured Streaming, we’ve achieved an 80%+ reduction in end-to-end latencies, hitting sub-100ms P99s, and streamlining our real-time ML strategy at massive scale. This performance allows us to compute over 250 ML features all powered by a unified Spark engine.”—Daniel Zhou, Senior Staff Machine Learning Platform Engineer, Coinbase
DraftKings, one of North America's largest sportsbook and fantasy sports platforms, uses Real-Time Mode to power feature computation for their fraud detection models — processing high-throughput betting event streams with the latency and reliability required for real-money wagering decisions.
“In live sports betting, fraud detection demands extreme velocity. The introduction of Real-Time Mode together with the transformWithState API in Spark Structured Streaming has been a game changer for us. We achieved substantial improvements in both latency and pipeline design, and for the first time, built unified feature pipelines for ML training and online inference, achieving ultra-low latencies that were simply not possible earlier.”—Maria Marinova, Sr. Lead Software Engineer, DraftKings
MakeMyTrip, one of India’s leading online travel platforms for hotels, flights, and experiences, adopted Real-Time Mode to power personalized search experiences, processing high-volume traveler searches to deliver recommendations in real time.
“In travel search, every millisecond counts. By leveraging Spark Real-Time Mode (RTM), we delivered personalized experiences with sub-50ms P50 latencies, driving a 7% uplift in click-through rates. RTM has also transformed our data operations, enabling a unified architecture where Spark handles everything from high-throughput ETL to ultra-low-latency pipelines. As we move into the era of AI agents, steering them effectively requires building real-time context from data streams. We are experimenting with Spark RTM to supply our agents with the richest, most recent context necessary to take the best possible decisions.” —Aditya Kumar, Associate Director of Engineering, MakeMyTrip
RTM can support any workload that benefits from turning data into decisions in milliseconds, such as fraud detection, live personalization, ML feature computation, and ad attribution.
RTM is an evolution of the Spark Structured Streaming engine that achieves sub-second performance in benchmarks of demanding customer feature-engineering workloads.
Structured Streaming’s default micro-batch mode (MBM) is like an airport shuttle bus that waits for a certain number of passengers to board before departing. RTM, by contrast, operates like a high-speed moving walkway: there is no waiting for the bus to fill up. RTM processes each event as it arrives, providing end-to-end millisecond latency without leaving the Spark ecosystem.

From seconds to milliseconds: RTM transforms the Spark engine by replacing periodic batching with a continuous data flow, eliminating the latency bottlenecks of traditional ETL.
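The latency difference between the two trigger styles can be sketched with a toy simulation (a plain-Python illustration of the batching effect, not Spark internals; all timings and function names are made up for this sketch):

```python
# Toy model of the two trigger styles (illustrative only; not Spark internals).
# Micro-batch: an event waits until the next batch boundary before it is processed.
# Real-time: each event is handled the moment it arrives.

def microbatch_latencies(arrival_times_ms, batch_interval_ms):
    """Latency each event observes when processed at the next batch boundary."""
    latencies = []
    for t in arrival_times_ms:
        # Next batch boundary strictly after the arrival time.
        boundary = ((t // batch_interval_ms) + 1) * batch_interval_ms
        latencies.append(boundary - t)
    return latencies

def realtime_latencies(arrival_times_ms, per_event_cost_ms=5):
    """Each event pays only its own (assumed) per-event processing cost."""
    return [per_event_cost_ms for _ in arrival_times_ms]

arrivals = [0, 120, 450, 980]  # event arrival times in ms
print(microbatch_latencies(arrivals, batch_interval_ms=1000))  # [1000, 880, 550, 20]
print(realtime_latencies(arrivals))                            # [5, 5, 5, 5]
```

The point of the sketch: under micro-batching, an event's latency depends on where it lands relative to the batch boundary, while per-event processing gives every event a uniformly small latency.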
RTM’s performance gains come from three key architectural innovations:
Together, they transform Spark into a high-performance, low-latency engine capable of handling the most demanding operational use cases.
To validate the performance of Spark RTM, we benchmarked it against Apache Flink, a popular specialized engine, using actual customer workloads performing feature computation. These feature computation patterns are representative of most low-latency ETL use cases, such as fraud detection, personalization, and operational analytics. When comparing Spark RTM with Flink, the results demonstrate that Spark's evolved architecture provides a latency profile comparable to specialized streaming frameworks. For more information on the data sets and queries referenced, see this GitHub repository.

One engine, up to 92% faster: RTM outpaces specialized engines like Flink, proving that millisecond-level operational analytics no longer requires a separate streaming engine. Source: Internal benchmarks based on customer feature computation patterns. Full queries available on GitHub.
While raw speed matters, Spark RTM’s greatest advantage over engines like Flink is the simplicity it offers builders. It allows teams to use the same Spark API for both batch training and real-time inference, effectively eliminating "logic drift" and codebase duplication. Spark RTM enables seamless scalability, where a single-line code change can shift a pipeline from hourly batches to sub-second streaming without manual infrastructure tuning. Ultimately, by reducing operational complexity and the need for multiple specialized systems, teams can develop and deploy real-time applications significantly faster with Spark RTM.
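The "no logic drift" point can be illustrated with a plain-Python sketch (all names here are hypothetical, not a Databricks API): when one feature definition is shared by the batch training path and the per-event inference path, the two paths cannot diverge.

```python
# Hypothetical sketch: a single feature definition shared by the offline
# (training) path and the online (per-event) path, so the two cannot drift.

def avg_stake_feature(amounts):
    """Feature: average stake over the observed window."""
    return sum(amounts) / len(amounts) if amounts else 0.0

def batch_training_feature(history):
    """Offline path: compute the feature over a full historical window."""
    return avg_stake_feature(history)

def streaming_inference_feature(window_state, new_amount):
    """Online path: update a rolling window, then reuse the same feature fn."""
    window_state.append(new_amount)
    return avg_stake_feature(window_state)

history = [10.0, 30.0, 20.0]
offline = batch_training_feature(history)

window = []
for amount in history:  # replay the same events one at a time
    online = streaming_inference_feature(window, amount)

# Both paths produce the identical value because they share one definition.
assert offline == online == 20.0
```

In Spark, the same idea applies at the query level: the transformation logic written once against the DataFrame API serves both the batch and streaming execution paths.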
Getting up and running with RTM is straightforward. If you’re already using Structured Streaming, you can enable it with a single configuration update; no rewrites required.
RTM is currently available on Classic compute, across both Dedicated and Standard access modes. RTM is supported on Databricks Runtime (DBR) 16.4 and above; however, we recommend DBR 18.1 for the latest features and optimizations. During cluster creation, add the following Spark configuration:
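A cluster-level setting of this shape enables the mode (the key below is the one documented for the Public Preview; verify the exact name against the Databricks documentation for your DBR version):

```
spark.databricks.streaming.realTimeMode.enabled true
```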
Since launching in Public Preview in August 2025, Databricks has continued to expand RTM’s capabilities, based on customer feedback.
Here is what's new with this GA release:
RTM extends Apache Spark Structured Streaming into a new class of workloads — operational, latency-sensitive applications that demand immediate response to streaming data. By bringing sub-second latency to the Spark APIs your team already uses, it eliminates the need to operate a separate specialized engine for your most time-critical pipelines. Whether you're building fraud detection pipelines, personalization engines, or ML feature computation systems, Real-Time Mode gives you the latency your application demands with the simplicity and ecosystem breadth of Spark.
Check out the following resources to get started with RTM today:
