Apache Spark Structured Streaming has long powered mission-critical data pipelines at scale, from streaming ETL to analytics and machine learning. But as operational use cases evolved, teams began demanding something more: sub-second latencies for applications such as fraud detection, personalization, anomaly detection, real-time alerting and reporting.
Historically, meeting these ultra-low latency requirements meant introducing specialized systems alongside Spark. With the introduction of Real-Time Mode in Spark Structured Streaming, that tradeoff is no longer necessary. In this blog, we explore how Spark simplifies real-time streaming architecture for common use cases such as feature engineering, eliminates long-standing operational complexity, and delivers industry-leading performance.
The ability to process and act on data in real time is now a core requirement. Modern applications, especially AI agents, rely on a continuous stream of fresh context to function. If the underlying data is incomplete or lagging, the user experience suffers. Real-time performance is needed not only for traditional use cases such as fraud detection, but also for everyday interactions where a user expects precise, up-to-date responses. In this environment, latency directly impacts revenue, customer trust, and competitive advantage.
Data teams building real-time streaming applications have historically had to manage two distinct data processing stacks: Apache Spark™ for large-scale analytics and specialized systems such as Apache Flink® or Kafka Streams for sub-second, latency sensitive applications. This fragmentation requires teams to maintain duplicated codebases, manage separate governance models, and hire specialized talent to tune and maintain engine-specific infrastructure.
Launched in public preview in August 2025, Real-Time Mode (RTM) for Apache Spark Structured Streaming is designed to eliminate this friction. By fundamentally evolving the Spark execution engine, we have removed the need for a second system. This shift allows engineers to address the entire spectrum of use cases—from high-throughput ETL to low-latency real-time apps—using the same Spark API they already know. This means less time managing infrastructure, and more time to focus on the business use case.
RTM introduces a new, optimized execution engine that enables Spark to deliver consistent sub-second latencies. To evaluate its performance, we conducted a side-by-side comparison between Spark RTM and Apache Flink, based on real-time feature computation workloads we commonly see in production. These feature computation patterns are representative of most low-latency ETL use cases, such as fraud detection, personalization, and operational analytics.
We evaluated three common feature patterns:
The results demonstrate that Spark's evolved architecture provides a latency profile comparable to specialized streaming frameworks.

This performance is enabled by three key technical innovations in RTM:
Together, these transform Spark into a high-performance, low-latency engine capable of handling the most demanding operational use cases.
While raw speed is essential, the true value of Real-Time Mode lies in eliminating the operational complexity that typically stalls ultra-low-latency pipelines. Spark RTM makes your architecture significantly simpler through three core advantages. To make this concrete, we describe them in the context of real-time machine learning applications.
Minimize "logic drift" between training and inference: Real-time ML, such as fraud detection, requires a seamless handoff between high-throughput batching (for model training) and low-latency streaming (for live inference). Spark is data scientists' preferred engine for model training, and forcing a switch from Spark to Flink for inference would create a business logic gap: one version of the logic lives in Spark for training and a completely different codebase lives in Flink for production. This replication of business logic is error prone and leads to logic drift, where your model is trained on one reality but scores on another. With Spark RTM, your transformation code remains identical, enabling you to productionize features faster and with greater accuracy.
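A minimal, pure-Python sketch of the idea (the feature definition and values are illustrative, not from the benchmark; in Spark, the same function would build the DataFrame transformation applied identically by the batch and streaming queries):

```python
# Sketch of avoiding logic drift: one feature definition shared by the batch
# training path and the streaming inference path. Events are plain Python
# values here purely for illustration.

def aggregate_spend(amounts: list[float], window: int = 5) -> float:
    """Total spend over the most recent `window` transactions (a velocity-style feature)."""
    return sum(amounts[-window:])

history = [10.0, 20.0, 5.0, 40.0, 25.0, 100.0]

# Training: feature computed over historical batches.
training_feature = aggregate_spend(history)

# Inference: the *same* function scores the live stream, so the model is
# trained and served on identical logic.
serving_feature = aggregate_spend(history)

assert training_feature == serving_feature == 190.0  # no drift: one definition
```

Because a single function defines the feature, there is no second codebase to keep in sync when the feature logic changes.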
Freshness on-demand with a single-line code change: Business requirements are rarely static. A feature pipeline that starts with a 1-minute SLA today might require sub-second latency tomorrow as the model's freshness needs evolve. Conversely, for many use cases, "going slower" (e.g., daily or hourly batches) is significantly more cost-effective when immediate freshness isn't required. Spark provides the room to grow and scale alongside your product. It enables you to easily pivot your feature engineering strategy with a single-line code change. For instance, you can set your trigger to AvailableNow to run a pipeline on a daily or hourly schedule. When business needs shift, you can transition to continuous, ultra-low-latency streaming simply by switching to real-time mode: .trigger(RealTimeTrigger.apply()). In contrast, achieving this in Flink is a manual process. It often requires you to tune parallelism and orchestrate the shutdown and restart of compute resources just to match a new processing frequency.
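As a sketch of this "freshness dial," the decision can be reduced to a single trigger argument. The helper below is hypothetical: availableNow and processingTime are standard Structured Streaming triggers, but the realTime keyword here is an assumption standing in for Real-Time Mode's preview trigger (shown above in Scala as RealTimeTrigger.apply()).

```python
# Hypothetical helper mapping a freshness SLA to writeStream.trigger(...) arguments.
# The realTime branch is illustrative of Real-Time Mode's trigger; the exact
# preview API may differ.

def trigger_for_sla(freshness_seconds: float) -> dict:
    if freshness_seconds >= 3600:
        # Scheduled batch: drain all available data, then stop (hourly/daily jobs).
        return {"availableNow": True}
    if freshness_seconds >= 1:
        # Micro-batch streaming at a fixed cadence.
        return {"processingTime": f"{int(freshness_seconds)} seconds"}
    # Sub-second SLA: switch to Real-Time Mode's continuous trigger.
    return {"realTime": True}

# Usage sketch: query = df.writeStream.trigger(**trigger_for_sla(0.5)).start(...)
```

The rest of the pipeline (source, transformations, sink) is unchanged; only the trigger argument moves along the latency/cost spectrum.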
Accelerate development: RTM is built on the same Spark API that your team already knows. This eliminates the friction of maintaining multiple systems, allowing you to move faster by building and scaling real-time applications within a single, consistent environment.
Early adopters are using RTM to power a range of low-latency applications across industries.
Fraud Detection: A leading digital asset platform computes dynamic risk features such as velocity checks and aggregate spend patterns from Kafka streams, updating their online feature store in under 200 milliseconds to block fraudulent transactions at the point of sale.
Personalized Experiences: An e-commerce platform computes real-time intent features based on a user's current session, allowing models to refresh recommendations the moment a user interacts with a product.
IoT monitoring: A transport and logistics company ingests live telemetry to drive anomaly detection, moving from reactive to proactive decision-making in milliseconds.
DraftKings, one of North America’s largest sportsbook and fantasy sports services, uses RTM to power feature computation for their fraud detection models.
“In live sports betting, fraud detection demands extreme velocity. The introduction of Real-Time Mode together with the transformWithState API in Spark Structured Streaming has been a game changer for us. We achieved substantial improvements in both latency and pipeline design, and for the first time, built unified feature pipelines for ML training and online inference, achieving ultra-low latencies that were simply not possible earlier.” —Maria Marinova, Sr. Lead Software Engineer, DraftKings
The era of choosing between "easy" and "fast" is over. Why manage two engines, two security models, and two sets of specialized skills when one engine now does it all? RTM delivers the sub-second speed your real-time applications demand, with the architectural simplicity your team deserves. By removing the "operating tax," you can finally focus on building value rather than managing infrastructure.
Ready to eliminate the complexity of your real-time stack?