Data + AI Summit 2022

Enabling BI in a Lakehouse Environment: How Spark and Delta Can Help With Automating a DWH Development

On Demand

Type

  • Session

Format

  • Hybrid

Track

  • Data Lakes, Data Warehouses and Data Lakehouses

Industry

  • Financial Services

Difficulty

  • Intermediate

Room

  • Moscone South | Upper Mezzanine | 160

Duration

  • 35 min

Overview

Traditional enterprise data warehouses typically struggle to handle large volumes of data and traffic, particularly semi-structured and unstructured data. Data lakes overcome these issues and have become the central hub for storing data. In this session, we outline how to enable Kimball-style BI data modelling in a Lakehouse environment.

We will present why and how we built a Spark-based framework to modernize and automate data warehouse development with the data lake as central storage, while assuring high data quality and scalability. The framework has proven itself in practice and has already been implemented in over 15 enterprise data warehouses across various companies in Europe.

We will present in depth how to implement data warehouse principles such as surrogate, foreign and business keys and SCD type 1 and type 2 dimensions in Spark with Delta Lake, while addressing the shortcomings of traditional data warehouses. Additionally, we will share our experience of how such a unified (proper) data modelling framework can help bridge BI with modern use cases such as machine learning and real-time analytics.
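
To make the surrogate key and SCD type 2 ideas concrete, here is a minimal sketch in plain Python. It is not the framework presented in the session and it does not use Spark or Delta Lake; it only illustrates the row-level logic that such a load has to perform. The names `DimRow` and `apply_scd2`, the integer surrogate key scheme, and the sample customer data are all assumptions made for illustration.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class DimRow:
    surrogate_key: int        # warehouse-generated key that fact tables reference
    business_key: str         # natural key coming from the source system
    attributes: tuple         # tracked attribute values
    valid_from: date
    valid_to: Optional[date]  # None while this version is the current one
    is_current: bool


def apply_scd2(dimension, incoming, load_date):
    """Apply one SCD type 2 load and return the new list of dimension rows.

    `incoming` maps business_key -> tuple of attribute values (hypothetical input shape).
    """
    result = []
    # Continue numbering surrogate keys after the highest one already in use.
    next_key = max((r.surrogate_key for r in dimension), default=0) + 1
    keys_with_current_row = set()

    for row in dimension:
        if row.is_current:
            keys_with_current_row.add(row.business_key)
        new_attrs = incoming.get(row.business_key)
        if row.is_current and new_attrs is not None and new_attrs != row.attributes:
            # Changed attributes: close the old version and open a new one.
            result.append(replace(row, valid_to=load_date, is_current=False))
            result.append(DimRow(next_key, row.business_key, new_attrs, load_date, None, True))
            next_key += 1
        else:
            # Unchanged or historical rows are carried over as they are.
            result.append(row)

    # Business keys without a current version get their first (or a fresh) row.
    for business_key, attrs in incoming.items():
        if business_key not in keys_with_current_row:
            result.append(DimRow(next_key, business_key, attrs, load_date, None, True))
            next_key += 1
    return result


# Usage: an initial load followed by a load in which customer C1 changes city.
dim = apply_scd2([], {"C1": ("Ghent",), "C2": ("Leuven",)}, date(2022, 1, 1))
dim = apply_scd2(dim, {"C1": ("Brussels",), "C2": ("Leuven",)}, date(2022, 6, 1))
for r in dim:
    print(r)
```

An SCD type 1 dimension would instead overwrite the attributes of the existing row and keep its surrogate key, so no history is retained. On a Delta table the same close-and-insert steps are commonly expressed as a merge operation, which this sketch deliberately leaves out.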

This session is a perfect fit for the Data + AI Summit given its underlying technologies, Spark and Delta Lake. We welcome the opportunity to share our original challenges, the steps we took, the framework we built, and the technical hurdles we faced along the way.

Session Speakers

Yoshi Coppens

Data Engineer

element61

Ivana Pejeva

Cloud Solution Architect

Microsoft
