We present HTTP on Spark, a novel integration between Spark and the widely used Hypertext Transfer Protocol (HTTP). This library can be used to integrate any framework that is capable of communicating through HTTP into the Spark ecosystem. Furthermore, HTTP on Spark enables distributed and fault-tolerant microservice architectures that work with Spark’s dynamic allocation and Streaming capabilities. We build upon this work and release a library of idiomatic Spark bindings for a wide array of Microsoft Cognitive Services. These bindings allow users to easily add *any* cognitive service to their existing Spark and SparkML machine learning pipelines. Finally, we show how to use these services to create a large class of custom image classification and object detection systems that can learn without human-labeled training examples. We demonstrate the power of these new releases with an automated snow leopard detection system.
Anand is the GM and Chief of Staff for Microsoft AI. Previously, he was the Chief of Staff for the Microsoft Azure Data Group, covering Data Platforms and Machine Learning. Over the last decade, he ran product management and development teams for Azure Data Services, Visual Studio, and Windows Server User Experience at Microsoft. Anand holds a PhD in computational fluid mechanics and worked for several years as a researcher before joining Microsoft.
Mark is a software engineer on Microsoft’s Applied AI team and a machine learning PhD student at the MIT Computer Science and AI Lab. Mark leads Microsoft ML for Apache Spark (http://aka.ms/spark), a distributed machine learning and microservice orchestration library. He has applied this work to problems in wildlife conservation, accessibility, and art museum outreach. Mark is currently researching how information theory and abstract algebra can yield new deep learning architectures in Professor William T. Freeman’s lab.