Data + AI Summit 2022

Introducing Zipline: An Open Source Feature Engineering Platform

On Demand

Type

  • Session

Format

  • Hybrid

Track

  • Data Science, Machine Learning and MLOps

Difficulty

  • Intermediate

Room

  • Moscone South | Upper Mezzanine | 156

Duration

  • 35 min

Overview

This talk introduces Zipline, a declarative feature engineering platform developed at Airbnb, which will be open-sourced in March.
Zipline automatically generates Spark pipelines that prepare features for both training and inference.

The main goal of this talk is to enable the audience to deploy Zipline in their own organizations. To that end, we will walk through the APIs that must be implemented to take full advantage of Zipline's capabilities.
More specifically, we will introduce the following user-facing concepts:
1. Data source types: Events and Entities
2. Transformations: GroupBy, Join with temporal and snapshot accuracy, sawtooth windows, bucketed features, etc.
3. The online-offline consistency interface for monitoring feature quality
4. The Zipline configuration repository
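
To make the difference between temporal and snapshot accuracy concrete, here is a minimal plain-Python sketch. It does not use Zipline's API. It assumes that "temporal" means a windowed aggregate computed as of the exact query timestamp, and that "snapshot" means the same aggregate computed as of the most recent midnight. The event data and all function names are hypothetical.

```python
from datetime import datetime, timedelta

# Hypothetical event log: (user_id, event_time, purchase_amount).
EVENTS = [
    ("u1", datetime(2022, 3, 1, 9, 0), 10.0),
    ("u1", datetime(2022, 3, 2, 15, 0), 20.0),
    ("u1", datetime(2022, 3, 3, 8, 0), 5.0),
]


def windowed_sum(events, key, as_of, window):
    """Sum amounts for `key` where as_of - window <= event_time < as_of."""
    start = as_of - window
    return sum(amount for k, t, amount in events if k == key and start <= t < as_of)


def temporal_feature(events, key, query_time, window):
    """Assumed 'temporal' accuracy: aggregate as of the exact query time."""
    return windowed_sum(events, key, query_time, window)


def snapshot_feature(events, key, query_time, window):
    """Assumed 'snapshot' accuracy: aggregate as of the latest midnight."""
    midnight = query_time.replace(hour=0, minute=0, second=0, microsecond=0)
    return windowed_sum(events, key, midnight, window)


query_time = datetime(2022, 3, 3, 12, 0)
window = timedelta(days=2)

temporal = temporal_feature(EVENTS, "u1", query_time, window)
snapshot = snapshot_feature(EVENTS, "u1", query_time, window)
print(temporal, snapshot)
```

Under these assumptions the two values differ for the same query: the temporal feature sees the event from the morning of the query day, while the snapshot feature only sees events up to midnight. Never reading events at or after the cutoff is what keeps training data free of label leakage.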

For systems integration, we will go over the stream and key-value store APIs, along with reference implementations of those APIs.

Session Speakers

Nikhil Simha Raprolu

Staff software engineer

Airbnb
