Data + AI Summit 2022

State-of-the-Art Natural Language Processing with Apache Spark NLP

On Demand


  • Session
  • In-Person
  • Data Science, Machine Learning and MLOps
  • Intermediate
  • Moscone South | Upper Mezzanine | 151
  • 35 min


This session teaches how and why to use the open-source Spark NLP library. Spark NLP provides state-of-the-art accuracy, speed, and scalability for language understanding by delivering production-grade implementations of recent research advances. It is the most widely used NLP library in the enterprise today, provides thousands of current, supported, pre-trained models for 200+ languages out of the box, and is the only open-source NLP library that can natively scale to use any Apache Spark cluster.

We’ll walk through Python code that runs common NLP tasks such as document classification, named entity recognition, sentiment analysis, spell checking, question answering, and translation. The discussion of each task covers the latest advances in deep learning and transfer learning used to tackle it. We’ll also cover new free tools for annotating data, no-code active learning and transfer learning, deploying NLP models as production-grade services, and sharing models you’ve trained.

Session Speakers

David Talby


John Snow Labs
