Session

Elevating Data Quality Standards With Databricks DQX

Overview

Experience: In Person
Type: Breakout
Track: Data and AI Governance
Industry: Energy and Utilities, Manufacturing, Financial Services
Technologies: Apache Spark, Delta Lake, Unity Catalog
Skill Level: Beginner
Duration: 40 min

Join us for an introductory session on Databricks DQX, a Python-based framework designed to validate the quality of PySpark DataFrames. Discover how DQX can empower you to proactively tackle data quality challenges, enhance pipeline reliability and make more informed business decisions with confidence.

Traditional data quality tools often fall short: they provide few actionable insights, rely heavily on after-the-fact monitoring, and are restricted to batch processing. DQX overcomes these limitations by enabling real-time quality checks at the point of data entry, supporting both batch and streaming data validation, and delivering granular insights at the row and column level.

If you’re seeking a simple yet powerful data quality framework that integrates seamlessly with Databricks, this session is for you.
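
To make the idea of row- and column-level checks concrete, here is a minimal conceptual sketch in plain Python. It is not the DQX API: plain dictionaries stand in for PySpark DataFrame rows, and every name in it (`CHECKS`, `split_by_quality`, the `_errors` field, the sample meter readings) is hypothetical and chosen only for illustration. It shows one way a pipeline might annotate each failing row with the check and column that failed, then route it to quarantine instead of rejecting the whole batch.

```python
from typing import Any, Callable

Row = dict[str, Any]
# Each check is (name, column, predicate); the predicate returns True
# when the column value is acceptable. All names here are hypothetical.
Check = tuple[str, str, Callable[[Any], bool]]

CHECKS: list[Check] = [
    ("meter_id_not_null", "meter_id", lambda v: v is not None),
    ("reading_non_negative", "reading_kwh", lambda v: v is not None and v >= 0),
]


def split_by_quality(rows: list[Row], checks: list[Check]) -> tuple[list[Row], list[Row]]:
    """Return (valid, quarantined); quarantined rows carry an `_errors` list."""
    valid: list[Row] = []
    quarantined: list[Row] = []
    for row in rows:
        # Collect every failed check so the report is per row and per column.
        errors = [
            {"check": name, "column": column}
            for name, column, is_ok in checks
            if not is_ok(row.get(column))
        ]
        if errors:
            quarantined.append({**row, "_errors": errors})
        else:
            valid.append(row)
    return valid, quarantined


# Made-up sample data for illustration only.
rows = [
    {"meter_id": "m-1", "reading_kwh": 12.5},
    {"meter_id": None, "reading_kwh": 3.0},
    {"meter_id": "m-3", "reading_kwh": -1.0},
]
valid, quarantined = split_by_quality(rows, CHECKS)
```

In this sketch the first row passes both checks, while the other two land in `quarantined` with an `_errors` entry naming the failed check and column. A real PySpark pipeline would express the same logic as column expressions over a DataFrame rather than a Python loop.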

Session Speakers

Marcin Wojtyczka

Sr. Resident Solutions Architect
Databricks

Neha Milak

RSA
Databricks