Bhoomika Sharma

Data Scientist, Megh Computing, Inc.

Bhoomika is a Data Scientist at Megh Computing, a company focused on delivering an efficient, scalable platform for real-time analytics using FPGA accelerators. She has been working with Spark for the past four years. Prior to Megh, Bhoomika was a Software Engineer at Monster, an online job portal, where she was responsible for machine learning and text analytics workloads such as the job recommendation engine. She is interested in developing machine learning applications using Spark ML.

Past sessions

The steady growth in computational power has led to a more mature ecosystem for image processing and video analytics. By using deep neural networks for image recognition and object detection, we can achieve better-than-human accuracy. Industrial sectors led by retail and finance want to take advantage of these latest developments in real-time analysis of video content for fraud detection, surveillance, and many other applications.

There are two key challenges in the real-world implementation of a video analytics solution:
1) Most video analytics use cases are effective only when response times are in milliseconds. The requirement to perform at very low latencies creates a need for software and hardware acceleration.
2) Such solutions need widespread deployment and are expected to have a low total cost of ownership (TCO).

To address these two key challenges, we propose a video analytics solution leveraging Spark Structured Streaming and a DL framework (such as Intel's Analytics-Zoo and TensorFlow), built on a heterogeneous CPU + FPGA hardware platform.

The proposed solution accelerates a video analytics pipeline by more than 3x compared to a CPU-only implementation, requires zero code changes on the application side, and reduces TCO by more than 2x. Our video analytics pipeline consists of video stream ingestion, H.264 decoding to image frames, image transformation, and image inference using a deep neural network. The FPGA-based solution offloads the entire pipeline computation to the FPGA, while the CPU-only solution implements the pipeline using OpenCV, Spark Structured Streaming, and Intel's Analytics-Zoo DL library.
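
The four pipeline stages can be illustrated with a minimal Python sketch that times each stage separately, which is one simple way to see where an accelerator would help most. Everything here is an assumption made for illustration: the stage functions (`ingest`, `decode_h264`, `transform`, `infer`) are hypothetical placeholders that operate on raw bytes, not real H.264 decoding, OpenCV, Spark, or Analytics-Zoo calls, and the sketch does not represent the implementation described in the session.

```python
import time
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical stand-ins for the four stages named in the abstract.
# None of these call a real decoder, image library, or model.
def ingest(packet: bytes) -> bytes:
    # Placeholder "ingestion": pass the packet through unchanged.
    return packet

def decode_h264(packet: bytes) -> List[int]:
    # Placeholder "decode": treat each byte as one pixel value.
    return list(packet)

def transform(frame: List[int]) -> List[float]:
    # Placeholder "transformation": scale pixel values into [0, 1].
    return [p / 255.0 for p in frame]

def infer(frame: List[float]) -> str:
    # Placeholder "inference": threshold on mean brightness.
    mean = sum(frame) / len(frame) if frame else 0.0
    return "bright" if mean >= 0.5 else "dark"

STAGES: List[Tuple[str, Callable]] = [
    ("ingest", ingest),
    ("decode", decode_h264),
    ("transform", transform),
    ("infer", infer),
]

def run_pipeline(packets: Iterable[bytes]) -> Tuple[List[str], Dict[str, float]]:
    """Run every packet through all stages and accumulate seconds per stage."""
    totals = {name: 0.0 for name, _ in STAGES}
    results = []
    for packet in packets:
        value = packet
        for name, stage in STAGES:
            start = time.perf_counter()
            value = stage(value)
            totals[name] += time.perf_counter() - start
        results.append(value)
    return results, totals

if __name__ == "__main__":
    labels, seconds = run_pipeline([bytes([255] * 4), bytes([0] * 4)])
    print(labels)
    for name, total in seconds.items():
        print(f"{name}: {total * 1e3:.3f} ms")
```

Swapping a placeholder for a real stage leaves the timing loop unchanged, so the same harness could compare two implementations of any single stage.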

Key takeaways:
1. Optimizing the performance of a Spark Streaming + DL pipeline.
2. Accelerating a video analytics pipeline using FPGAs to deliver high throughput at low latency and reduced TCO.
3. Performance data benchmarking the CPU-only and CPU + FPGA solutions.