SESSION

Best Practices for Data Prep for GenAI Development

OVERVIEW

EXPERIENCE: In Person
TYPE: Lightning Talk
TRACK: Generative AI
TECHNOLOGIES: Databricks Experience (DBX), Apache Spark, ETL, GenAI/LLMs
SKILL LEVEL: Intermediate
DURATION: 20 min

This session covers best practices for preparing data for generative AI development. Data preparation is a critical step because the quality and relevance of the training data directly affect a model's performance and accuracy. We will discuss the roles of data quality, data diversity, and data labeling, then cover preprocessing techniques such as cleaning, normalization, and transformation, and how to tune them for generative AI models. We will also share practical tips and guidelines for applying these practices in real-world projects. Whether you are a data scientist, machine learning engineer, or AI researcher, you should leave with concrete guidance for improving data preparation in generative AI work.
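
As a rough illustration of the preprocessing steps named in the abstract (cleaning, normalization, and transformation), the following is a minimal sketch in plain Python. It is not taken from the session: the function names `clean_text` and `prepare_corpus`, the `min_chars` threshold, and the choice of exact case-insensitive deduplication are all assumptions made for the example.

```python
import re
import unicodedata


def clean_text(raw: str) -> str:
    """Normalize Unicode, drop control characters, and collapse whitespace."""
    # NFKC folds compatibility characters (e.g. non-breaking spaces) into
    # their plain equivalents.
    text = unicodedata.normalize("NFKC", raw)
    # Remove control/format characters, keeping newlines and tabs so the
    # whitespace pass below can turn them into single spaces.
    text = "".join(
        ch for ch in text
        if ch in "\n\t" or unicodedata.category(ch)[0] != "C"
    )
    # Collapse any run of whitespace into one space.
    return re.sub(r"\s+", " ", text).strip()


def prepare_corpus(records, min_chars=20):
    """Clean each record, drop very short ones, and remove exact duplicates.

    `min_chars` is a hypothetical quality threshold chosen for illustration.
    """
    seen = set()
    prepared = []
    for raw in records:
        text = clean_text(raw)
        if len(text) < min_chars:
            continue  # too short to be useful training text
        key = text.casefold()  # case-insensitive duplicate detection
        if key in seen:
            continue
        seen.add(key)
        prepared.append(text)
    return prepared


if __name__ == "__main__":
    sample = [
        "  Data   preparation matters for GenAI.\n",
        "data preparation matters for genai.",
        "too short",
        "Quality\u00a0beats quantity in training data.",
    ]
    for line in prepare_corpus(sample):
        print(line)
```

At larger scale the same steps would more likely run as distributed transformations (for example in Apache Spark, which the session lists among its technologies), but the per-record logic would look similar.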

SESSION SPEAKERS

Brian Kihoon Lee

Senior Software Engineer
Databricks