Serving Near Real-Time Features at Scale
- Moscone South | Upper Mezzanine | 155
- 35 min
This presentation will first introduce the use case: generating price adjustments based on the network effect. The corresponding model relies on 108 near real-time features, which Flink pipelines compute from raw demand and supply events. Here is the simplified computation logic:
- The pipelines need to process raw real-time events, both demand and supply, at a rate of 300k/s
- Each event needs to be computed along geospatial, temporal and other dimensions
- Each event contributes to the computation for its original hexagon and 1K+ neighbours, due to the fan-out effect of Kring smoothing
- Each event contributes to aggregations over multiple window sizes of up to 32 minutes, sliding by 1 minute, or 63 windows in total
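To give a feel for the scale these figures imply, here is a rough back-of-envelope sketch. It assumes the window sizes double from 1 to 32 minutes (1, 2, 4, 8, 16, 32), which is one reading consistent with the stated total of 63 windows but is not spelled out in the abstract. It also takes 1,000 as a lower bound for "1K+" neighbours. The function names are illustrative only.

```python
# Back-of-envelope fan-out estimate using the figures quoted in the abstract.
EVENTS_PER_SEC = 300_000          # raw demand + supply events per second
NEIGHBOURS_PER_EVENT = 1_000      # lower bound for "1K+" (assumption)
SLIDE_MINUTES = 1                 # every window slides by 1 minute

# Assumption: window sizes double from 1 to 32 minutes. With a 1-minute
# slide, an event falls into (size / slide) overlapping windows per size.
WINDOW_SIZES_MINUTES = [1, 2, 4, 8, 16, 32]


def windows_per_event(sizes, slide):
    """Number of sliding windows a single event contributes to."""
    return sum(size // slide for size in sizes)


def updates_per_second(events_per_sec, neighbours, windows):
    """Naive count of aggregate updates if every event is fanned out to
    every neighbouring hexagon and every window it belongs to."""
    return events_per_sec * neighbours * windows


windows = windows_per_event(WINDOW_SIZES_MINUTES, SLIDE_MINUTES)
total = updates_per_second(EVENTS_PER_SEC, NEIGHBOURS_PER_EVENT, windows)

print(windows)         # 63
print(f"{total:,}")    # 18,900,000,000
```

Under these assumptions a naive implementation would perform on the order of tens of billions of aggregate updates per second, which helps explain why the unoptimized pipeline struggled.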
Next, the presentation will briefly go through the DAG of the Flink pipeline before optimization and the issues we faced: the pipeline could not run stably because of OOM errors and backpressure. It will then discuss how to optimize a streaming pipeline with a generic performance tuning framework that focuses on three areas (Network, CPU and Memory) and five domains (Parallelism, Partition, Remote Call, Algorithm and Garbage Collector). It will also show some example techniques applied to the pipelines by following this framework.
Then the presentation will discuss one particular optimization technique: Customized Sliding Window.
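The abstract does not describe how the customized sliding window works. As a hedged illustration of the general idea, the sketch below shows one common approach: keep one partial aggregate per 1-minute bucket, and combine the most recent buckets for each window size only when results are emitted. This is not necessarily the technique presented in the talk. The class name `BucketedSlidingCounter` and its methods are hypothetical.

```python
from collections import deque


class BucketedSlidingCounter:
    """Sliding-window counts built from per-slide (e.g. 1-minute) buckets.

    Each event updates only the current bucket. Windows of every size are
    derived at emit time by summing the most recent buckets, so an event is
    not copied into every overlapping window it belongs to.
    """

    def __init__(self, window_sizes):
        self.window_sizes = sorted(window_sizes)
        longest = self.window_sizes[-1]
        # Fixed-length ring of buckets; the last element is the current one.
        self.buckets = deque([0] * longest, maxlen=longest)

    def add(self, count=1):
        """Record an event in the current bucket."""
        self.buckets[-1] += count

    def emit_and_advance(self):
        """Return {window_size: count} for windows ending now, then open a
        new empty bucket (the oldest bucket is evicted automatically)."""
        recent = list(self.buckets)
        result = {size: sum(recent[-size:]) for size in self.window_sizes}
        self.buckets.append(0)
        return result


counter = BucketedSlidingCounter([1, 2, 4])
counter.add(3)
first = counter.emit_and_advance()    # minute 1 closes
counter.add(2)
second = counter.emit_and_advance()   # minute 2 closes
print(first)    # {1: 3, 2: 3, 4: 3}
print(second)   # {1: 2, 2: 5, 4: 5}
```

With this layout the per-event cost is a single bucket update, and the cost of combining buckets is paid once per slide interval rather than once per event.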
Powering machine learning models with near real-time features can be quite challenging because of computation logic complexity, write throughput, serving SLAs and more. In this talk, we introduce some of the problems we faced and our solutions to them, in the hope of helping peers with similar use cases.