Speed up LLM Development with Gretel



TYPE: Lightning Talk
TRACK: Generative AI

Most teams are data-poor: they have no data, no access to data, messy data, or insufficient data. As a result, access to high-quality data for training and fine-tuning Large Language Models (LLMs) is a critical challenge in building generative AI applications for the enterprise. Gretel Navigator removes this data bottleneck by generating high-quality data on demand and at scale via its multimodal synthetic data platform. This helps teams innovate safely and quickly, shortens the time to bring AI solutions to market, and substantially lowers the cost of AI development by offering a complementary or alternative approach to human annotation.


Using Gretel Navigator, developers can design and iterate on data from scratch, or augment existing data and turn it into a safe, high-utility resource. They no longer have to jump through hoops to wrangle messy data, resolve permissions, break down silos, or wait for time-consuming and expensive annotation to finish; instead, they can just start building. Underpinning the quality and ease of data generation is a compound AI system that leverages agentic workflows and a range of tools, including purpose-built data generation models, custom LLMs, models optimized for differential privacy, and a variety of data-processing and augmentation tools.


We’ll touch on how Databricks leveraged Gretel to generate synthetic data and dive into how Gretel uses its own tooling to continuously improve Gretel Navigator. We’ll also discuss Gretel’s recently released synthetic Text-to-SQL dataset. This open-source dataset quickly became the #1 trending dataset on Hugging Face, underscoring the market’s need for high-quality, accessible data and highlighting the potential of the synthetic pillar of data-centric AI to unlock new capabilities in LLM development.
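To make the Text-to-SQL idea concrete, here is a minimal sketch of what one record in such a dataset might look like and how it could be sanity-checked. The field names (`sql_prompt`, `sql_context`, `sql`) and the validation helper are illustrative assumptions, not the dataset's actual schema or Gretel's tooling:

```python
import sqlite3

# Hypothetical record shaped like a synthetic Text-to-SQL example:
# a natural-language prompt, the schema context (DDL), and the target query.
record = {
    "sql_prompt": "List the names of customers who placed more than one order.",
    "sql_context": (
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);"
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);"
    ),
    "sql": (
        "SELECT c.name FROM customers c "
        "JOIN orders o ON o.customer_id = c.id "
        "GROUP BY c.id HAVING COUNT(*) > 1;"
    ),
}

def validate_record(rec):
    """Check that the record's DDL context builds and its query executes."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(rec["sql_context"])  # build the schema
        conn.execute(rec["sql"])                # query must parse and run
        return True
    except sqlite3.Error:
        return False
    finally:
        conn.close()

print(validate_record(record))  # True: schema builds and query runs
```

Executing generated SQL against an in-memory database like this is one simple way to filter out syntactically invalid pairs when building or evaluating such a dataset.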


Yev Meyer

Chief Scientist