Machine learning is everywhere, but translating a data scientist's model into an operational environment is challenging for many reasons. Models may need to be distributed to remote applications to generate predictions, and when retraining occurs, existing models may need to be updated or replaced. Monitoring and diagnosing such configurations requires tracking many variables, such as performance counters, model versions, and ML-algorithm-specific statistics.
In this talk we will demonstrate how we have attacked this problem for a specific use case: edge-based anomaly detection. We will show how Spark can be deployed in two types of environments: on edge nodes, where ML predictions can detect anomalies in real time, and on a cloud-based cluster, where new model coefficients can be computed on a larger collection of available data. To make this solution practically deployable, we have developed mechanisms to automatically update the edge prediction pipelines with new models, regularly retrain at the cloud instance, and gather metrics from all pipelines to monitor, diagnose, and detect issues with the entire workflow. Using SparkML and Spark Accumulators, we have developed an ML pipeline framework capable of automating such deployments and a distributed application monitoring framework to aid in live monitoring.
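The workflow described above can be illustrated with a small plain-Python sketch. It is not the speakers' framework and does not use Spark: the `Model`, `retrain`, and `EdgePipeline` names, the z-score detector, and the threshold of 3.0 are all assumptions chosen for illustration, and a `Counter` stands in for the role Spark Accumulators play in gathering metrics.

```python
from collections import Counter
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass(frozen=True)
class Model:
    """Hypothetical model: a mean/std pair used for z-score thresholding."""
    version: int
    mu: float
    sigma: float


def retrain(history, version):
    """'Cloud' side: recompute coefficients from the larger data collection."""
    # Fall back to 1.0 if the data has zero spread, to avoid dividing by zero.
    return Model(version=version, mu=mean(history), sigma=pstdev(history) or 1.0)


class EdgePipeline:
    """'Edge' side: scores readings as they arrive and counts what it sees."""

    def __init__(self, model, threshold=3.0):
        self.model = model
        self.threshold = threshold
        self.metrics = Counter()  # stand-in for Spark accumulators (assumption)

    def update_model(self, model):
        # Swap in a newer model without restarting the pipeline.
        if model.version > self.model.version:
            self.model = model
            self.metrics["model_updates"] += 1

    def predict(self, x):
        self.metrics["predictions"] += 1
        is_anomaly = abs(x - self.model.mu) / self.model.sigma > self.threshold
        if is_anomaly:
            self.metrics["anomalies"] += 1
        return is_anomaly


# The edge starts with model v1, then receives a retrained v2 from the cloud.
edge = EdgePipeline(retrain([10.0, 11.0, 9.0, 10.0], version=1))
flags = [edge.predict(x) for x in (10.5, 30.0)]
edge.update_model(retrain([20.0, 22.0, 18.0, 20.0, 30.0], version=2))
flags.append(edge.predict(30.0))
print(flags, dict(edge.metrics))
```

In this toy run the reading 30.0 is flagged under the first model but not after the retrained model arrives, and the counters record how many predictions, anomalies, and model updates the pipeline has seen, which is the kind of signal a monitoring layer would collect from each pipeline.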
The talk will describe the problems of operationalizing ML in an edge context, our approaches to solving them, and what we have learned. It will include a live demo of our approach using anomaly detection and other ML algorithms (clustering, etc.) in SparkML, driven by live data feeds. All datasets and outputs will be made publicly available.
Session hashtag: #MLSAIS18
Nisha Talagala is CTO/VP of Engineering at ParallelM, an early-stage startup focused on production machine learning and deep learning. Previously a Fellow/Lead Architect at SanDisk/Fusion-io, Nisha has more than 15 years of expertise in machine learning, software development, distributed systems, persistent memory, and flash. She has been technology lead for server flash at Intel and CTO of Gear6. Nisha has 53 patents, serves on industry/academic program committees, and is a frequent speaker at industry conferences, academic events, and meetups.
Vinay is a Senior Engineer at ParallelM, where he works on the distributed layer of the MLOps system intended to automate production ML. In the past he has helped accelerate big data systems like Cassandra and Redis by adopting the latest I/O technologies. He specializes in large-scale distributed systems and web technologies.