A significant amount of effort is required to transform a machine learning (ML) model into a useful machine learning product. The incorporation of ML into real-world applications almost feels like “1% algorithm and 99% perspiration”. I will share with you my team's experience in building 3 ML products at Zendesk. I will also discuss some real-world problems and scaling complexities you may encounter when building these products at web scale. Close collaboration between different groups, including product, engineering and data science, is imperative to strike a balance between model performance, scalability and computational efficiency. The talk mainly focuses on scaling our model building infrastructure with the aim of building at least 50,000 models a day. This was achieved as part of our efforts to deliver an ML product called Content Cues.
In a nutshell, Content Cues summarizes text from customer support tickets to form insightful topics. It combines multiple ML algorithms including deep learning, clustering and other natural language processing approaches. These ML algorithms are then run every day on data from tens of thousands of eligible Zendesk customers.
My talk will cover the following topics:
- How we implement a horizontally scalable model building and model serving pipeline by combining AWS EMR, AWS Batch and Kubernetes
- How we tune the model building pipeline to optimize cost and efficiency without compromising resiliency
- Challenges in model monitoring, model versioning evolution and capturing of user feedback
Wai Chee is a technical lead for a machine learning product team at Zendesk. She is a polyglot developer who loves working with data and machine learning. She has ten years of experience in data processing, distributed systems, APIs, and web applications. She holds a PhD in computer vision. She is a mum to a cheeky miniature schnauzer. In her spare time she likes to explore dog training techniques and savour street food from all around the world.