CyberMLToolkit: Anomaly Detection as a Scalable Generic Service Over Apache Spark

Cybercrime is one of the greatest threats facing every company in the world today and a major problem for mankind in general. The damage due to cybercrime is estimated to reach around $6 trillion by 2021. Security professionals are struggling to cope with the threat, so powerful and easy-to-use tools are necessary to aid in this battle. For this purpose we created a security-focused anomaly detection framework that can identify anomalous access patterns. It is built on top of Apache Spark and can be applied in parallel over multiple tenants, which allows the model to be trained on the data of thousands of customers on a Databricks cluster in less than an hour. The model leverages proven technologies from recommendation engines to surface high-quality anomalies. We thoroughly evaluated the model's ability to identify actual anomalies, both by using synthetically generated data and by staging an actual attack and showing that the model clearly identifies the attack as anomalous behavior. We plan to open source this library as part of a cyber-ML toolkit we will be offering.
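To make the idea of scoring access patterns with a recommendation-style model more concrete, here is a minimal sketch. The abstract does not say which recommendation technique the framework uses, so this example swaps in item-based collaborative filtering with cosine similarity as a stand-in, and it runs in plain Python rather than on Spark. The record layout `(tenant, user, resource)`, the function names, and the sample data are all hypothetical and are not taken from the toolkit.

```python
from collections import defaultdict
from math import sqrt


def fit_tenant_models(access_log):
    """Build one independent model per tenant.

    Each model maps resource -> set of users who accessed it.
    (Hypothetical layout; on Spark, each tenant would be its own group.)
    """
    models = defaultdict(lambda: defaultdict(set))
    for tenant, user, resource in access_log:
        models[tenant][resource].add(user)
    return models


def cosine(users_a, users_b):
    """Cosine similarity of two resources viewed as binary user vectors."""
    if not users_a or not users_b:
        return 0.0
    return len(users_a & users_b) / sqrt(len(users_a) * len(users_b))


def anomaly_score(models, tenant, user, resource):
    """Return a score in [0, 1]; higher means a less typical access."""
    by_resource = models.get(tenant, {})
    # Ignore the user's own accesses so they do not vouch for themselves.
    target_users = by_resource.get(resource, set()) - {user}
    history = [r for r, users in by_resource.items()
               if user in users and r != resource]
    if not history:
        return 1.0  # no history in this tenant: nothing supports the access
    affinity = max(cosine(target_users, by_resource[r] - {user})
                   for r in history)
    return 1.0 - affinity


# Illustrative data: two tenants, scored independently of each other.
log = [
    ("t1", "alice", "build"), ("t1", "alice", "wiki"),
    ("t1", "bob", "build"), ("t1", "bob", "wiki"), ("t1", "bob", "repo"),
    ("t1", "carol", "build"), ("t1", "carol", "repo"),
    ("t1", "dave", "payroll"), ("t1", "dave", "hr"),
    ("t1", "erin", "payroll"), ("t1", "erin", "hr"),
    ("t2", "zed", "payroll"),
]
models = fit_tenant_models(log)

typical = anomaly_score(models, "t1", "alice", "repo")
suspicious = anomaly_score(models, "t1", "alice", "payroll")
print(typical, suspicious)
```

In this toy data, alice reaching `repo` scores low because her peers on `build` also use it, while alice reaching `payroll` scores high because no one with a similar access history does. Keeping a separate model per tenant mirrors the multi-tenant design described above: one tenant's access patterns never influence another tenant's scores.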


See More Spark + AI Summit Europe 2019 Videos

Roy Levin
About Roy Levin


Roy Levin received his Ph.D. from the Department of Computer Science at the Technion - Israel Institute of Technology in 2013. He is currently a Senior Researcher at Microsoft, where he is part of Azure Security Center (ASC). Roy has over 15 years of academic and industry experience in machine learning, data management, and information retrieval.