Session

Kafka Forwarder: Simplifying Kafka Consumption at OpenAI

Overview

Experience: In Person
Type: Breakout
Track: Data Lakehouse Architecture and Implementation
Industry: Enterprise Technology
Technologies: Apache Spark, Delta Lake, Databricks SQL
Skill Level: Intermediate
Duration: 40 min

At OpenAI, Kafka fuels real-time data streaming at massive scale, but traditional consumers struggle under the burden of partition management, offset tracking, error handling, retries, Dead Letter Queues (DLQ), and dynamic scaling — all while racing to maintain ultra-high throughput. As deployments scale, complexity multiplies.

Enter Kafka Forwarder — a game-changing Kafka Consumer Proxy that flips the script on traditional Kafka consumption. By offloading client-side complexity and pushing messages to consumers, it ensures at-least-once delivery, automated retries, and seamless DLQ management via Databricks. The result? Scalable, reliable and effortless Kafka consumption that lets teams focus on what truly matters.
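To make the push model concrete, here is a minimal Python sketch of the delivery loop such a proxy might run. It is not OpenAI's implementation: the names `Message`, `Forwarder`, `deliver`, `max_attempts`, and the in-memory `dead_letters` list are all hypothetical, and no real Kafka client is involved. The point it illustrates is the ordering: an offset is recorded as committed only after the message has either been delivered or parked in the DLQ, which is what gives at-least-once semantics (a crash before the commit leads to redelivery, not loss).

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Message:
    partition: int
    offset: int
    value: bytes


@dataclass
class Forwarder:
    # Hypothetical push target; in practice this could be an HTTP or gRPC call.
    deliver: Callable[[Message], None]
    max_attempts: int = 3
    # Stand-in for a real dead letter queue.
    dead_letters: List[Message] = field(default_factory=list)
    # partition -> next offset to read, mimicking a committed offset.
    committed: Dict[int, int] = field(default_factory=dict)

    def forward(self, msg: Message) -> bool:
        """Push one message; return True if delivered, False if dead-lettered."""
        delivered = False
        for _ in range(self.max_attempts):
            try:
                self.deliver(msg)
                delivered = True
                break
            except Exception:
                # A real system would back off between attempts.
                continue
        if not delivered:
            self.dead_letters.append(msg)
        # Commit only after the message is delivered or parked in the DLQ.
        self.committed[msg.partition] = msg.offset + 1
        return delivered


# Demo consumer: offset 1 always fails, offset 2 fails once and then succeeds.
attempts: Dict[int, int] = {}


def flaky_deliver(msg: Message) -> None:
    attempts[msg.offset] = attempts.get(msg.offset, 0) + 1
    if msg.offset == 1:
        raise RuntimeError("consumer always rejects offset 1")
    if msg.offset == 2 and attempts[msg.offset] < 2:
        raise RuntimeError("transient failure")


forwarder = Forwarder(deliver=flaky_deliver)
results = [forwarder.forward(Message(0, i, b"payload")) for i in range(3)]
```

In this toy run, the message at offset 1 exhausts its retries and lands in `dead_letters`, while the message at offset 2 succeeds on its second attempt. The consumer only has to implement `deliver`; partition handling, retries, and dead-lettering stay inside the forwarder.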

Curious how OpenAI simplified self-service, high-scale Kafka consumption? Join us as we walk through the motivation, architecture and challenges behind Kafka Forwarder, and share how we structured the pipeline to seamlessly route DLQ data into Databricks for analysis.

Session Speakers

Jigar Bhati

Member of Technical Staff
OpenAI