Elevating Data Quality Standards With Databricks DQX
Overview
| Experience | In Person |
| --- | --- |
| Type | Breakout |
| Track | Data and AI Governance |
| Industry | Energy and Utilities, Manufacturing, Financial Services |
| Technologies | Apache Spark, Delta Lake, Unity Catalog |
| Skill Level | Beginner |
| Duration | 40 min |
Join us for an introductory session on Databricks DQX, a Python-based framework designed to validate the quality of PySpark DataFrames. Discover how DQX can empower you to proactively tackle data quality challenges, enhance pipeline reliability and make more informed business decisions with confidence.
Traditional data quality tools often fall short: they provide limited actionable insights, rely heavily on after-the-fact monitoring, and are restricted to batch processing. DQX overcomes these limitations by enabling quality checks at the point of data entry, supporting both batch and streaming validation, and delivering granular insights at the row and column level.
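The row- and column-level granularity described above can be sketched in plain Python. This is a conceptual illustration only: the check names, tuple structure, and `apply_checks` helper are invented for this sketch, and DQX itself operates on PySpark DataFrames with its own API.

```python
# Conceptual sketch (plain Python, no Spark): splitting records into valid
# and flagged sets, recording per-row, per-column check failures. The
# structure is illustrative and is NOT DQX's actual API.

def apply_checks(rows, checks):
    """Evaluate each check against each row; return (valid, flagged).

    rows   -- list of dicts, one per record
    checks -- list of (check_name, column, predicate) tuples
    """
    valid, flagged = [], []
    for row in rows:
        failures = [
            f"{name}({column})"
            for name, column, predicate in checks
            if not predicate(row.get(column))
        ]
        if failures:
            # Granular insight: which checks failed, and on which columns.
            flagged.append({**row, "_errors": failures})
        else:
            valid.append(row)
    return valid, flagged

# Hypothetical checks and sample records for illustration.
checks = [
    ("is_not_null", "customer_id", lambda v: v is not None),
    ("in_range", "amount", lambda v: v is not None and 0 <= v <= 10_000),
]

rows = [
    {"customer_id": "c1", "amount": 250},
    {"customer_id": None, "amount": 99},
    {"customer_id": "c3", "amount": -5},
]

valid, flagged = apply_checks(rows, checks)
```

Applied to the sample records, the first row passes both checks, while the other two are flagged with the specific failing check and column attached, which is the kind of actionable, row-level output the session contrasts with coarse aggregate monitoring.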
If you’re seeking a simple yet powerful data quality framework that integrates seamlessly with Databricks, this session is for you.
Session Speakers
Marcin Wojtyczka
Sr. Resident Solutions Architect
Databricks
Neha Milak
Resident Solutions Architect
Databricks