Introducing Zipline: An Open Source Feature Engineering Platform
Type
- Session
Format
- Hybrid
Track
- Data Science, Machine Learning and MLOps
Difficulty
- Intermediate
Room
- Moscone South | Upper Mezzanine | 156
Duration
- 35 min
Overview
This talk will introduce Zipline, a declarative feature engineering platform developed at Airbnb, which will be open-sourced in March.
Zipline automatically generates Spark pipelines to prepare features for training and inference.
The main focus of this talk is to enable the audience to deploy Zipline in their organizations. With that goal, we will walk through the APIs that must be implemented to fully leverage Zipline's capabilities.
More specifically, we will introduce the following user-facing concepts:
1. Data Source types - Events, Entities
2. Transformations - GroupBy, Join with temporal and snapshot accuracies, sawtooth windows, bucketed features, etc.
3. Online-offline consistency interface for monitoring feature quality
4. The Zipline configuration repo
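To make the difference between temporal and snapshot accuracy concrete, here is a minimal plain-Python sketch. It is not Zipline's API: the names `Event`, `windowed_sum`, `temporal_feature`, and `snapshot_feature` are hypothetical, and the sketch assumes that temporal accuracy means aggregating up to the exact query timestamp, while snapshot accuracy means aggregating up to the most recent UTC midnight before the query.

```python
from dataclasses import dataclass
from typing import Iterable

DAY_MS = 24 * 60 * 60 * 1000  # milliseconds in one day


@dataclass(frozen=True)
class Event:
    key: str      # entity the event belongs to, e.g. a user id
    ts: int       # event time in epoch milliseconds
    value: float  # quantity to aggregate


def windowed_sum(events: Iterable[Event], key: str, as_of_ts: int, window_ms: int) -> float:
    """Sum values for `key` with as_of_ts - window_ms <= ts < as_of_ts."""
    start = as_of_ts - window_ms
    return sum(e.value for e in events if e.key == key and start <= e.ts < as_of_ts)


def temporal_feature(events: Iterable[Event], key: str, query_ts: int, window_ms: int) -> float:
    """Assumed 'temporal' accuracy: aggregate right up to the query timestamp."""
    return windowed_sum(events, key, query_ts, window_ms)


def snapshot_feature(events: Iterable[Event], key: str, query_ts: int, window_ms: int) -> float:
    """Assumed 'snapshot' accuracy: aggregate up to the last midnight before the query."""
    midnight = (query_ts // DAY_MS) * DAY_MS
    return windowed_sum(events, key, midnight, window_ms)


HOUR_MS = 60 * 60 * 1000
events = [
    Event("u1", 1 * HOUR_MS, 10.0),           # day 0, 01:00
    Event("u1", DAY_MS + 2 * HOUR_MS, 5.0),   # day 1, 02:00
]
query_ts = DAY_MS + 3 * HOUR_MS               # day 1, 03:00

temporal = temporal_feature(events, "u1", query_ts, 7 * DAY_MS)
snapshot = snapshot_feature(events, "u1", query_ts, 7 * DAY_MS)
```

In this example the temporal value includes the event from earlier the same day, while the snapshot value only sees events before midnight. A real platform would compute this at scale in Spark rather than by scanning a list.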
For systems integration, we will go over the stream and key-value store APIs, along with reference implementations of those APIs.
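As a rough illustration of what a key-value store integration point can look like, the sketch below defines an abstract interface and an in-memory reference implementation. Everything here is an assumption for illustration: `KVStore`, `InMemoryKVStore`, `put`, and `multi_get` are hypothetical names, not the interface Zipline defines.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Optional


class KVStore(ABC):
    """Hypothetical interface an online feature store backend would implement."""

    @abstractmethod
    def put(self, dataset: str, key: bytes, value: bytes) -> None:
        """Write one serialized feature value under `key` in `dataset`."""

    @abstractmethod
    def multi_get(self, dataset: str, keys: List[bytes]) -> List[Optional[bytes]]:
        """Read many keys at once; missing keys come back as None."""


class InMemoryKVStore(KVStore):
    """Dictionary-backed implementation, useful only for tests and demos."""

    def __init__(self) -> None:
        self._data: Dict[str, Dict[bytes, bytes]] = {}

    def put(self, dataset: str, key: bytes, value: bytes) -> None:
        self._data.setdefault(dataset, {})[key] = value

    def multi_get(self, dataset: str, keys: List[bytes]) -> List[Optional[bytes]]:
        table = self._data.get(dataset, {})
        return [table.get(k) for k in keys]


store = InMemoryKVStore()
store.put("user_features", b"u1", b"15.0")
fetched = store.multi_get("user_features", [b"u1", b"u2"])
```

Keeping the interface to batched reads and simple writes means the same serving code can sit on top of whichever store an organization already runs.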