Jeshua has over 10 years of experience in machine learning and is an expert in building technology products powered by artificial intelligence. He’s a founding member of Abnormal Security, where he is head of machine learning. Prior to joining Abnormal, Jeshua played a critical role in building Twitter’s machine learning platform and in detecting and preventing abusive behavior on the Twitter product. Prior to Twitter, Jeshua built the AI engine that powered TellApart’s advertising product, bringing it to over $100 million ARR and through its acquisition by and integration into Twitter. His academic experience is in theoretical reinforcement learning and deep learning at the University of Michigan.
May 27, 2021 03:50 PM PT
Detecting advanced email attacks at scale is a challenging ML problem, particularly due to the rarity of attacks, the adversarial nature of the problem, and the scale of the data. In order to move quickly and adapt to the newest threats, we needed to build a Continuous Integration / Continuous Delivery pipeline for the entire ML detection stack. Our goal is to enable detection engineers and data scientists to change any part of the stack, including joined datasets for hydration, feature extraction code, and detection logic, and to develop and train ML models.
In this talk, we discuss why we decided to build this pipeline, how it is used to accelerate development and ensure quality, and dive into the nitty-gritty details of building such a system on top of an Apache Spark + Databricks stack.