Sponsored By: Dagster Labs | Dev/Stage/Prod is the Wrong Pattern for Data Pipelines
OVERVIEW
EXPERIENCE | In Person |
---|---|
TYPE | Lightning Talk |
TRACK | Data Engineering and Streaming |
INDUSTRY | Enterprise Technology |
TECHNOLOGIES | Orchestration |
SKILL LEVEL | Intermediate |
DURATION | 20 min |
Fast testing and tight feedback loops have long been the foundation of highly productive development workflows in traditional software engineering. Replicating that productivity in data engineering requires new approaches, because developing data pipelines depends on having realistic data to flow through them.
In this talk, Ryan and Nick will discuss how Enigma leveraged Databricks and Dagster’s branch deployments to build a highly productive workflow for safely developing data pipelines on production data. Developers build pipelines in feature branches and push them to a branch deployment, where they run on top of cloud infrastructure and branched storage (such as LakeFS or Delta Tables). This approach results in better productivity, more trusted data, and lower costs.
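As a rough illustration of the branched-storage idea, the sketch below picks a write target based on whether a run is executing in a branch deployment. It is not taken from the talk. The schema name `analytics` and the function `target_schema` are hypothetical, and the environment variable names are assumed to be set by the orchestrator in branch deployments; check them against your own setup.

```python
import re
from typing import Mapping

PROD_SCHEMA = "analytics"  # hypothetical production schema name


def target_schema(env: Mapping[str, str]) -> str:
    """Return the schema a pipeline run should write to."""
    # Assumption: the orchestrator sets this flag to "1" inside a branch deployment.
    if env.get("DAGSTER_CLOUD_IS_BRANCH_DEPLOYMENT") != "1":
        return PROD_SCHEMA
    # Assumption: the branch name is exposed through this variable.
    branch = env.get("DAGSTER_CLOUD_GIT_BRANCH", "")
    # Keep only characters that are safe in a schema identifier.
    slug = re.sub(r"[^a-z0-9]+", "_", branch.lower()).strip("_") or "unknown"
    # Branch runs write to an isolated schema so production tables stay untouched.
    return f"{PROD_SCHEMA}_branch_{slug}"
```

In practice the function would be called as `target_schema(os.environ)`. Passing the environment in as an argument keeps the routing logic easy to unit test without a real deployment.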
SESSION SPEAKERS
Ryan Green
CTO
Enigma