EnterprisePII is a first-of-its-kind dataset for evaluating whether large language models (LLMs) can detect business-sensitive information.

The challenge of detecting and redacting sensitive business data is a significant issue for enterprises that want to leverage generative AI capabilities. The risk of LLMs unintentionally leaking confidential information to the public, third parties, or unauthorized internal users has been well-documented and hinders enterprise adoption.

Traditional PII detection models rely on Named Entity Recognition (NER) and only identify Personally Identifiable Information (PII), like addresses, phone numbers, or personal details. However, they fall short in detecting crucial business-sensitive information such as revenue figures, customer accounts, salary details, project owners, and strategic or commercial relationship notes.
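To make the gap concrete, here is a minimal sketch of the kind of NER-based PII pass described above. It uses spaCy's off-the-shelf English pipeline, which is our choice for illustration only; the post does not prescribe a particular tool, and the sample text is invented.

```python
# Minimal sketch of a conventional NER-based PII pass (spaCy is an illustrative
# choice; the sample text is invented for this example).
import spacy

nlp = spacy.load("en_core_web_sm")  # generic English pipeline with a standard NER component

text = (
    "Meeting notes: Q3 revenue came in at $4.2M, below the $5M target. "
    "Contact Jane Smith at 555-0142 about the Acme contract renewal."
)

doc = nlp(text)
for ent in doc.ents:
    print(f"{ent.text!r:>30}  {ent.label_}")
```

A pipeline like this typically surfaces generic entity types such as PERSON, ORG, MONEY, and CARDINAL, but nothing in its output flags the missed revenue target or the contract-renewal discussion as confidential. Judging that kind of business sensitivity is exactly what EnterprisePII is designed to test.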

That’s why Patronus AI developed EnterprisePII, a first-of-its-kind dataset for evaluating whether LLMs can detect business-sensitive information. AI researchers and developers can now freely access and use EnterprisePII to test their LLMs' ability to identify confidential data typically found in business documents such as meeting notes, commercial contracts, marketing emails, and performance reviews.
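For a sense of what "testing an LLM's ability to identify confidential data" can look like in practice, here is a minimal, hypothetical evaluation loop. It frames each example as a yes/no question and scores the model's answers against labels; the records shown are invented placeholders rather than actual EnterprisePII examples, and `call_llm` is a stand-in for whichever model client you use.

```python
# Hypothetical evaluation sketch. The records below are invented placeholders,
# NOT drawn from EnterprisePII, and call_llm() stands in for any LLM client.
from typing import Callable

examples = [
    {"text": "Q3 revenue target is $12M; do not share outside the exec team.", "label": True},
    {"text": "The office kitchen will be repainted on Friday afternoon.", "label": False},
]

PROMPT = (
    "Does the following passage contain confidential or business-sensitive "
    "information? Answer with exactly 'yes' or 'no'.\n\n"
    "Passage: {text}\nAnswer:"
)

def evaluate(call_llm: Callable[[str], str]) -> float:
    """Return accuracy of the model's yes/no judgments over the placeholder examples."""
    correct = 0
    for ex in examples:
        answer = call_llm(PROMPT.format(text=ex["text"])).strip().lower()
        predicted = answer.startswith("yes")
        correct += int(predicted == ex["label"])
    return correct / len(examples)

# Usage with any client, e.g.:
#   accuracy = evaluate(lambda prompt: my_client.generate(prompt))
```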

In the coming weeks, MosaicML will incorporate the EnterprisePII dataset into LLM Foundry, an open-source code repository for training, fine-tuning, evaluating, and deploying LLMs. A version of the dataset compatible with our Composer library will also be included in our LLM Eval Gauntlet (a comprehensive method for assessing the quality of LLMs).

To learn more about Patronus AI and EnterprisePII, read their latest announcement.
