Session

From Spaghetti Bowl Pipeline to DLT Efficiency

Overview

Experience: In Person
Type: Lightning Talk
Track: Data Engineering and Streaming
Industry: Health and Life Sciences
Technologies: Databricks Workflows, DLT, Unity Catalog
Skill Level: Beginner
Duration: 20 min

In today's data-driven world, the ability to manage and transform data efficiently is crucial for any organization. This presentation explores the process of converting a complex, messy workflow into a clean, simple DLT (Delta Live Tables) pipeline at Intermountain Health, a large integrated health system.

Alteryx is a powerful tool for data preparation and blending, but as workflows grow in complexity, they can become difficult to manage and maintain. DLT, by contrast, offers a more accessible, streamlined and scalable approach to data engineering, built on Apache Spark and Delta Lake.

We will begin by examining a typical legacy workflow, identifying common pain points such as tangled logic, performance bottlenecks and maintenance challenges. Next, we will demonstrate how to translate this workflow into a DLT pipeline, highlighting key steps such as data transformation, validation and delivery.

Session Speakers

Peter Jones

Analytics Engineer
Intermountain Healthcare