For companies that solve real-world problems and generate revenue from data science products, understanding why a model makes a certain prediction can be as crucial as achieving high prediction accuracy. However, as data scientists pursue higher accuracy with complex algorithms such as ensemble or deep learning models, the model itself becomes a black box, creating a trade-off between accuracy and the interpretability of the model’s output.
To address this problem, SHAP (SHapley Additive exPlanations), a unified framework, was developed to help users interpret the predictions of complex models. In this session, we will talk about how to apply SHAP to various modeling approaches (GLM, XGBoost, CNN) to explain how each feature contributes to a particular prediction and to extract intuitive insights from it. This talk is intended to introduce the concept of a general-purpose model explainer and to help practitioners understand SHAP and its applications.
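To give a flavor of what "how each feature contributes" means, here is a minimal sketch of exact Shapley values computed from scratch for a tiny model. It is an illustration only, not the SHAP library's implementation: the names `shapley_values` and `toy_model` are hypothetical, and a "missing" feature is simply replaced by a single baseline value, which is a simplifying assumption (in practice the expectation is typically taken over a background dataset).

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one prediction (illustrative helper, not a library API).

    Assumption: a feature outside the coalition takes its value from `baseline`.
    """
    n = len(x)

    def value(subset):
        # Evaluate the model with only the features in `subset` taken from x.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude feature i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for coalition in combinations(others, k):
                s = set(coalition)
                # Marginal contribution of feature i to this coalition.
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(z):
    # Hypothetical model: one additive term plus one interaction term.
    return 2.0 * z[0] + z[1] * z[2]


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

The contributions sum to the difference between the prediction for `x` and the prediction for the baseline, which is the additive property the framework's name refers to. The interaction term is split evenly between the two features involved. Enumerating every coalition is exponential in the number of features, so this only works for toy cases.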
My name is Layla Yang. I am a Solutions Architect at Databricks. Before Databricks, I started my career in the AdTech industry, focusing on building machine learning models and data products. I spent a few years at AdTech startups designing, building, and deploying automated predictive algorithms into production for real-time bidding (RTB), integrated with major ad exchanges and SSPs. My work also included MMM (media mix modeling), DMP user segmentation, and customer recommendation engines. Currently I work with startups in the NYC and Boston area to scale their existing data engineering and data science efforts using Apache Spark. I studied physics at university, and I love skiing.