Data + AI Summit 2022

Mapping Data Quality Concerns to Data Lake Zones

On Demand


  • Session
  • In-Person
  • Data Security and Governance
  • Intermediate
  • Moscone South | Upper Mezzanine | 152
  • 35 min


A common pattern in Data Lake and Lakehouse design is to structure data into zones, typically labeled Bronze, Silver, and Gold. Each zone suits different workloads and different consumers: for instance, machine learning workloads typically read from Bronze or Silver, while analytic dashboards often query Gold. This prompts the question: which zone is best suited for applying data quality rules and actions? Our answer: all of them.

In this session, we’ll expand on that answer by describing the purpose of each zone and mapping the categories of data quality relevant to each, based on that zone’s qualitative requirements. We’ll describe Data Enrichment, the practice of making observed anomalies available as inputs to downstream data pipelines, and offer recommendations for when to merely alert, when to quarantine data, when to halt pipelines, and when to apply automated corrective actions.
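
To make the four actions and the idea of Data Enrichment concrete, here is a minimal Python sketch. It is not taken from the session: the `Rule` class, the `apply_rules` function, the `dq_anomalies` field, and the three example rules are all hypothetical names and policies chosen for illustration. Which action belongs to which rule, and in which zone, is exactly what the session's recommendations address; the assignments below are arbitrary.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class Action(Enum):
    ALERT = "alert"            # record the violation, let the row continue
    QUARANTINE = "quarantine"  # divert the row away from the main output
    HALT = "halt"              # stop the whole pipeline run
    CORRECT = "correct"        # apply an automated fix, let the row continue


@dataclass
class Rule:
    # Hypothetical rule shape: a name, a violation test, and an action.
    name: str
    is_violated: Callable[[dict], bool]
    action: Action
    fix: Optional[Callable[[dict], dict]] = None  # only used for CORRECT


class PipelineHalted(Exception):
    """Raised when a HALT rule is violated."""


def apply_rules(records, rules):
    """Return (passed, quarantined, alerts) for a batch of dict records."""
    passed, quarantined, alerts = [], [], []
    for record in records:
        row = dict(record)
        # Enrichment: observed anomalies travel with the row so that
        # downstream pipelines can use them as inputs.
        row["dq_anomalies"] = []
        keep = True
        for rule in rules:
            if not rule.is_violated(row):
                continue
            row["dq_anomalies"].append(rule.name)
            if rule.action is Action.HALT:
                raise PipelineHalted(rule.name)
            if rule.action is Action.QUARANTINE:
                keep = False
            elif rule.action is Action.CORRECT:
                row = rule.fix(row)
            elif rule.action is Action.ALERT:
                alerts.append(rule.name)
        (passed if keep else quarantined).append(row)
    return passed, quarantined, alerts


# Illustrative rules only; the rule-to-action pairing is an assumption.
rules = [
    Rule("missing_id", lambda r: r.get("id") is None, Action.QUARANTINE),
    Rule(
        "lowercase_country",
        lambda r: isinstance(r.get("country"), str)
        and r["country"] != r["country"].upper(),
        Action.CORRECT,
        fix=lambda r: {**r, "country": r["country"].upper()},
    ),
    Rule("negative_amount", lambda r: r.get("amount", 0) < 0, Action.ALERT),
]

records = [
    {"id": 1, "country": "us", "amount": 10},
    {"id": None, "country": "DE", "amount": 5},
    {"id": 3, "country": "FR", "amount": -2},
]

passed, quarantined, alerts = apply_rules(records, rules)
```

In this sketch, corrected and alerted rows continue downstream with their `dq_anomalies` list attached, quarantined rows are set aside with the same annotation, and a violated `HALT` rule aborts the batch by raising `PipelineHalted`.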

Session Speakers

Stewart Bryson

Co-founder & Chief Customer Officer

