Data + AI Summit 2022
Watch on demand

How unsupervised machine learning can scale data quality monitoring in Databricks

On Demand

Type

  • Sponsored Session

Format

  • In-Person

Track

  • Sponsored Session

Difficulty

  • Intermediate

Room

  • Moscone South | Level 2 | 216

Duration

  • 35 min

Overview

Technologies like Databricks Delta Lake and Databricks SQL enable enterprises to store and query their data. But existing rule- and metric-based approaches to monitoring the quality of this data are tedious to set up and maintain, fail to catch unexpected issues, and generate false-positive alerts that lead to alert fatigue.

In this talk, Jeremy will describe a set of fully unsupervised machine learning algorithms for monitoring data quality at scale in Databricks. He will cover how the algorithms work, their strengths and weaknesses, and how they are tested and calibrated.

Participants will leave this talk with an understanding of unsupervised data quality monitoring, its strengths and weaknesses, and how to begin using it to monitor data in Databricks.
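
To make the contrast with hand-written rules concrete, here is a minimal sketch of one generic unsupervised check. It is not the algorithm presented in the talk, which this listing does not describe. It assumes a hypothetical daily metric (the null rate of a column) and flags a new value with a robust z-score based on the median and the median absolute deviation, so no threshold on the metric itself has to be written by hand. The function name, the sample values, and the 3.5 cutoff are all illustrative assumptions.

```python
from statistics import median


def is_anomalous(history, value, threshold=3.5):
    """Flag `value` if it deviates strongly from `history`.

    Illustrative only: a robust z-score built from the median and the
    median absolute deviation (MAD). The 3.5 cutoff is an assumed
    default, not a value taken from the talk.
    """
    med = median(history)
    mad = median([abs(x - med) for x in history])
    if mad == 0:
        # History is constant: any change at all counts as unusual.
        return value != med
    # 0.6745 rescales the MAD so the score is comparable to a standard
    # z-score when the data are roughly normally distributed.
    score = 0.6745 * (value - med) / mad
    return abs(score) > threshold


# Hypothetical daily null rates for one column over the past week.
null_rate_history = [0.010, 0.012, 0.011, 0.009, 0.010, 0.013, 0.011]

print(is_anomalous(null_rate_history, 0.012))  # a typical day
print(is_anomalous(null_rate_history, 0.25))   # a sudden jump in nulls
```

The check learns what is normal from the history alone, which is the property the abstract attributes to unsupervised monitoring. A single-metric check like this one would still miss issues that do not move the chosen metric.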

Session Speakers

Jeremy Stanley

Co-Founder & CTO

Anomalo
