Data + AI Summit 2022

Introducing Zipline: An Open Source Feature Engineering Platform

On Demand


  • Session
  • Hybrid
  • Data Science, Machine Learning and MLOps
  • Intermediate
  • Moscone South | Upper Mezzanine | 156
  • 35 min


This talk introduces Zipline, a declarative feature engineering platform developed at Airbnb, which will be open-sourced in March.
Zipline automatically generates Spark pipelines to prepare features for training and inference.

The main focus of this talk is to enable the audience to deploy Zipline in their organizations. With that goal, we will walk through the API that must be implemented to fully leverage Zipline's capabilities.
More specifically, we will introduce the following user-facing concepts:
1. Data source types: Events and Entities
2. Transformations: GroupBy, Join with temporal and snapshot accuracies, sawtooth windows, bucketed features, etc.
3. Online-offline consistency interface for monitoring feature quality
4. The Zipline configuration repo
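
To make the distinction between temporal and snapshot accuracy in item 2 more concrete, here is a minimal plain-Python sketch. It does not use Zipline's API. The `Event` class, the function names, and the reading of "snapshot" as "computed as of the most recent midnight boundary" are all assumptions made for illustration only.

```python
from dataclasses import dataclass

DAY_MS = 24 * 60 * 60 * 1000  # milliseconds in one day


@dataclass(frozen=True)
class Event:
    """Hypothetical event record; not a Zipline type."""
    key: str      # entity the event belongs to, e.g. a user id
    ts: int       # event time in milliseconds since epoch
    value: float  # quantity to aggregate, e.g. a purchase amount


def windowed_sum(events, key, end_ts, window_ms):
    """Sum values for `key` with timestamps in [end_ts - window_ms, end_ts)."""
    start_ts = end_ts - window_ms
    return sum(e.value for e in events if e.key == key and start_ts <= e.ts < end_ts)


def temporal_feature(events, key, query_ts, window_ms):
    """Assumed 'temporal' accuracy: aggregate up to the exact query timestamp."""
    return windowed_sum(events, key, query_ts, window_ms)


def snapshot_feature(events, key, query_ts, window_ms):
    """Assumed 'snapshot' accuracy: aggregate up to the last midnight boundary."""
    snapshot_ts = (query_ts // DAY_MS) * DAY_MS
    return windowed_sum(events, key, snapshot_ts, window_ms)


events = [
    Event("user_1", 0 * DAY_MS + 1000, 10.0),
    Event("user_1", 1 * DAY_MS + 1000, 20.0),
    Event("user_1", 2 * DAY_MS + 5000, 40.0),  # arrives on the same day as the query
]
query_ts = 2 * DAY_MS + 10_000
window = 7 * DAY_MS

temporal = temporal_feature(events, "user_1", query_ts, window)
snapshot = snapshot_feature(events, "user_1", query_ts, window)
print(temporal, snapshot)
```

Under these assumptions, the temporal value includes the event from earlier on the query day, while the snapshot value only reflects events before that day's midnight boundary. The trade-off being illustrated is freshness against the cost of computing features at exact timestamps.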

For systems integration, we will go over the stream and key-value store APIs, along with reference implementations of those APIs.

Session Speakers

Nikhil Simha Raprolu

Staff software engineer

