Data + AI Summit 2023
SAN FRANCISCO, JUNE 26-29
VIRTUAL, JUNE 28-29

Map Your Lakehouse Content with DiscoverX

Wednesday, June 28 @3:30 PM

Overview

An enterprise lakehouse contains many datasets that come from different sources and may belong to different business units. These datasets can span hundreds of tables, each with its own schema, and those schemas evolve over time. Cybersecurity is a good example: datasets from many different source systems all land in the lakehouse. In such a complex dataset ecosystem, answering simple questions like “Have we ever detected this IP address?” or “Which columns contain IP addresses?” can become impractical and expensive.

DiscoverX can automate the discovery of all columns that might contain specific patterns (e.g., IP addresses, MAC addresses, or fully qualified domain names) and automatically generate search and indexing queries that span multiple tables and columns.
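The general idea can be illustrated with a minimal sketch in plain Python. This is not DiscoverX's API: the pattern rules, the function names `classify_columns` and `build_search_query`, the 80% match threshold, and the sample table and column names are all assumptions made for illustration. The sketch classifies columns by testing sampled values against regular expressions, then generates one SQL query that searches every matching column.

```python
import re

# Hypothetical pattern rules for illustration; not DiscoverX's built-in rules.
_OCTET = r"(?:25[0-5]|2[0-4]\d|1?\d?\d)"
PATTERNS = {
    "ip_v4": re.compile(rf"^(?:{_OCTET}\.){{3}}{_OCTET}$"),
    "mac": re.compile(r"^(?:[0-9A-Fa-f]{2}[:-]){5}[0-9A-Fa-f]{2}$"),
}


def classify_columns(samples, threshold=0.8):
    """Map each pattern name to the (table, column) pairs that match it.

    `samples` maps (table, column) to a list of sampled values. A column
    matches a pattern when at least `threshold` of its non-null sampled
    values match the pattern's regular expression.
    """
    matches = {name: [] for name in PATTERNS}
    for (table, column), values in samples.items():
        non_null = [v for v in values if v is not None]
        if not non_null:
            continue
        for name, regex in PATTERNS.items():
            hits = sum(1 for v in non_null if regex.match(str(v)))
            if hits / len(non_null) >= threshold:
                matches[name].append((table, column))
    return matches


def build_search_query(columns, value):
    """Build one SQL statement that counts occurrences of `value` per column."""
    escaped = value.replace("'", "''")  # escape single quotes in the literal
    parts = [
        f"SELECT '{table}' AS table_name, '{column}' AS column_name, "
        f"COUNT(*) AS hits FROM {table} WHERE {column} = '{escaped}'"
        for table, column in columns
    ]
    return "\nUNION ALL\n".join(parts)


# Made-up samples standing in for values read from lakehouse tables.
SAMPLES = {
    ("security.firewall_logs", "src"): ["10.0.0.1", "192.168.1.20", "8.8.8.8"],
    ("security.firewall_logs", "action"): ["ALLOW", "DENY", "ALLOW"],
    ("network.dhcp_leases", "client_hw"): ["00:1A:2B:3C:4D:5E", "aa-bb-cc-dd-ee-ff", None],
    ("network.dhcp_leases", "assigned"): ["10.0.0.7", "10.0.0.8", "n/a"],
}

found = classify_columns(SAMPLES)
query = build_search_query(found["ip_v4"], "10.0.0.1")
print(query)
```

In this sketch, a column counts as a match only when most of its sampled values fit the pattern, which tolerates a few dirty values without flagging free-text columns. Lowering `threshold` widens the search at the cost of scanning more columns.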


Type

  • Breakout

Experience

  • In Person

Track

  • Data Governance, Databricks Experience (DBX)

Industry

  • Enterprise Technology, Professional Services

Difficulty

  • Intermediate

Duration

  • 40 min

Session Speakers

Erni Durdevic

Specialist Solutions Architect (Geospatial)

Databricks

David Tempelmann

Resident Solutions Architect

Databricks