Let's Save Tons of Money With Cloud-Native Data Ingestion!
Overview
Experience | In Person |
---|---|
Type | Breakout |
Track | Data Lakehouse Architecture and Implementation |
Industry | Enterprise Technology, Media and Entertainment |
Technologies | Delta Lake, Apache Iceberg |
Skill Level | Beginner |
Delta Lake is a fantastic technology for quickly querying massive data sets, but first you need those massive data sets! In this session, we will dive into the cloud-native architecture Scribd has adopted to ingest data from AWS Aurora, SQS, Kinesis Data Firehose and more. By using off-the-shelf open-source tools like kafka-delta-ingest, oxbow and Airbyte, Scribd has redefined its ingestion architecture to be more event-driven, reliable, and most importantly: cheaper. No jobs needed!
Attendees will learn how to use third-party tools in concert with a Databricks and Unity Catalog environment to provide a highly efficient and available data platform. This architecture will be presented in the context of AWS but can be adapted for Azure, Google Cloud Platform or even on-premises environments.
Session Speakers
Tyler Croy
Valued Employee
Scribd