Session

From Unstructured Data to Structured Insights: How YipitData Scales Data Enrichment with AI Agents

Overview

Experience: In Person
Track: Artificial Intelligence & Agents
Industry: Enterprise Technology, Retail & Consumer Goods, Financial Services
Technologies: Databricks Apps, Agent Bricks, Lakebase
Skill Level: Intermediate

YipitData analyzes billions of unstructured data points at petabyte scale to deliver high-fidelity insights to institutional investors and Fortune 500 companies. With hundreds of heterogeneous data sources, regex, classic ML and NLP techniques never met our accuracy hurdle, limiting our product breadth for years.

In this session, we reframe entity resolution as an agentic problem and share our production-grade enrichment platform built on Apache Spark™, Agent Bricks, Vector Search and Lakebase. This AI-native architecture continuously discovers and tags data at 90%–95% accuracy, reliably covering 60,000+ companies, a 20x improvement.

Data/ML leaders, engineers and practitioners will learn:

  • Modular pipeline design applicable to any classification scenario
  • Batch inference patterns in Spark that streamline infrastructure
  • Techniques for continuous, low-maintenance entity discovery

Join us and leave with a blueprint to turn enrichment bottlenecks into self-improving AI pipelines.

Session Speakers

Anup Segu

Chief Architect
YipitData

Edward Goo

Senior Director of Data Engineering
YipitData