Session

Delta and Databricks as a Performant Exabyte-Scale Application Backend

Overview

Tuesday, June 10, 5:20 pm

Experience: In Person
Type: Lightning Talk
Track: Data Lakehouse Architecture and Implementation
Industry: Enterprise Technology, Financial Services
Technologies: Apache Spark, Delta Lake
Skill Level: Intermediate
Duration: 20 min

The Delta Lake architecture promises to provide a single, highly functional, and high-scale copy of data that can be leveraged by a variety of tools to satisfy a broad range of use cases. To date, most use cases have focused on interactive data warehousing, ETL, model training, and streaming. Real-time access is generally delegated to costly and sometimes difficult-to-scale NoSQL, indexed storage, and domain-specific specialty solutions, which provide limited functionality compared to Spark on Delta Lake.

 

In this session, we will explore the Delta data-skipping and optimization model and discuss how Capital One leveraged it, along with Databricks Photon and Spark Connect, to implement a real-time web application backend. We’ll share how we built a highly functional, performant, and cost-effective security information and event management user experience (SIEM UX).

Session Speakers

Scott Schenkein

VP, Distinguished Engineer
Capital One Financial