In this talk, we will explore how Uber enables rapid experimentation with machine learning models and optimization algorithms through Uber’s Data Science Workbench (DSW). DSW covers a series of stages in the data scientist’s workflow, including data exploration, feature engineering, machine learning model training, testing, and production deployment. DSW provides interactive notebooks for multiple languages with on-demand resource allocation, and lets users share their work through community features.
It also supports notebooks and intelligent applications backed by Spark job servers. Deep learning applications based on TensorFlow and Torch can be brought into DSW smoothly, with resource management handled by the system. The DSW environment is customizable, so users can bring their own libraries and frameworks. Moreover, DSW supports Shiny and Python dashboards as well as many other in-house visualization and mapping tools.
In the second part of this talk, we will explore use cases in which custom machine learning models developed in DSW are productionized within the platform. Uber applies machine learning extensively to solve hard problems. Use cases include calculating the right prices for rides in over 600 cities and applying NLP technologies to customer feedback to offer safe rides and reduce support costs. We will look at the options evaluated for productionizing custom models (server-based and serverless). We will also look at how DSW integrates into Uber’s larger ML ecosystem, e.g. model/feature stores and other ML tools, to realize the vision of a complete ML platform for Uber.
Session hashtag: #MLSAIS11
Felix is the VP of Engineering at SafeGraph, bringing over 20 years of engineering and 7 years of data experience. He led teams in Uber's Data Platform and was pivotal in rebuilding their open-source program. Previously he spent time at Microsoft and startups. Felix is a strong proponent of open source; as a Member of the Apache Software Foundation, he works on Apache Spark (data) and Apache Zeppelin (notebook), and helps mentor 6 projects in the Apache Incubator, including the geospatial project Apache Sedona, while leading Apache Superset (visualization) to graduation.
Atul Gupte is a Product Manager on the Product Platform team at Uber. He holds a BS in Computer Science from the University of Illinois at Urbana-Champaign. At Uber, he helps drive product decisions to ensure Uber’s data science teams can achieve their full potential by providing access to foundational infrastructure, stable compute resources, and advanced tooling to power Uber’s global ambitions. Previously, at Zynga, he spent time building some of the world’s leading social games and also helped build out the company’s mobile advertising platform.