Using Crowdsourced Images to Create Image Recognition Models with Analytics Zoo using BigDL

Volunteers around the world increasingly act as human sensors to collect millions of data points. A team from the World Bank trained deep learning models, using Apache Spark and BigDL, to confirm that photos gathered through a crowdsourced data collection pilot matched the goods for which observations were submitted.

In this talk, Maurice Nsabimana, a statistician at the World Bank, will demonstrate a collaborative project to design and train large-scale deep learning models using crowdsourced images from around the world. BigDL is a distributed deep learning library designed from the ground up to run natively on Apache Spark. It enables data engineers and scientists to write deep learning applications in Scala or Python as standard Spark programs, without having to explicitly manage distributed computations. Attendees of this session will learn how to get started with BigDL, which runs in any Apache Spark environment, whether on-premises or in the cloud.

Attendees will also learn how to write a deep learning application that leverages Spark to train image recognition models at scale.

Session hashtag: #DL8SAIS

About Maurice Nsabimana

Maurice Nsabimana works as a statistician focusing on national accounts and macroeconomic indicators in the World Bank's Development Data Group. Nsabimana has previously worked in the private sector, in civil society, and at a think tank. His research interests lie at the intersection of computational economics, machine learning, and public policy, and in the development of new, practical methods and information technologies that can be directly applied to strengthen local capacity. Nsabimana holds an M.A. in international affairs from the School of International and Public Affairs at Columbia University and a B.Sc. in Computer Science from Vesalius College in Brussels, Belgium.

About Jiao Wang

Jiao Wang is a software engineer on the Big Data Technology team at Intel who works in the area of big data analytics. She is engaged in developing and optimizing distributed deep learning frameworks on Apache Spark.