Managing Data at Exabyte Scale for AI Model Training
Overview
| Experience | In Person |
|---|---|
| Track | Artificial Intelligence & Agents |
| Industry | Enterprise Technology |
| Technologies | AI/BI |
| Skill Level | Intermediate |
More and more enterprises are post-training AI models to tailor them to their domain and users. Success in model training depends critically on establishing the right data flywheel to improve iteration speed. From exploration to curation and from feature engineering to training, current data infrastructure solutions cannot support the full iteration loop. Model development teams wrestle with a hodgepodge of single-purpose tools, copying data across systems for different workflow steps. This is a key reason why so many enterprise AI model training efforts are delayed or fail. In this talk, we’ll cover how LanceDB solves this by managing all of your multimodal training data in a unified table and streamlining the entire path from exploration to GPU loading. Through customer case studies, we’ll outline this new architecture, share practical recipes for success, and show how it all fits into your existing enterprise data architecture.