SESSION

Power Up Your Lakehouse with Git Semantics & Delta Lake

OVERVIEW

EXPERIENCE: In Person
TYPE: Breakout
TRACK: Data Lakehouse Architecture
INDUSTRY: Enterprise Technology
TECHNOLOGIES: Delta Lake, Developer Experience
SKILL LEVEL: Intermediate
DURATION: 40 min

The lakehouse architecture has become the backbone of big data operations, but it doesn’t come without challenges. Data versioning (also known as time travel) is one of them, and it surfaces in many areas of DataOps: using the write/audit/publish pattern to test and verify changes before release, rolling back changes to a consistent, known-good state, creating reproducible workloads that span multiple tables (and code!), and building economical, ad hoc dev/test environments with zero data copies. Luckily, data engineering has made quite a bit of progress, and there are great OSS tools that can help overcome these challenges. In this talk, we’ll present how Delta Lake and lakeFS together can apply Git-like semantics to improve time travel for lakehouses. Delta Lake delivers a linear history through table snapshots, while lakeFS adds a layer of branching and merging capabilities, resulting in improved data quality and economics for your operations.

SESSION SPEAKERS

Oz Katz

CTO & Co-creator of lakeFS
lakeFS