Cutting the Edge in Fighting Cybercrime: Reverse-Engineering a Search Language to Cross-Compile it to PySpark
- Data Lakes, Data Warehouses and Data Lakehouses
- Financial Services
- Moscone South | Upper Mezzanine | 159
- 35 min
In this talk, we’ll learn how to implement (or, more accurately, reverse-engineer) a language in Scala and translate it into something that Apache Spark’s Catalyst engine understands. We’ll guide you through the technical journey, with examples from Databricks Notebooks and code blocks, of building Spark equivalents of a query language’s constructs and of implementing search query language features that are not available out of the box, such as IP CIDR matching or fuzzy matching across all columns. We’ll also show you how to use the same framework for PySpark code generation and use-case reconciliation.
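To give a flavor of what cross-compiling a search query into PySpark can look like, here is a minimal sketch in plain Python. It is not the framework presented in the talk (which is built in Scala on top of Catalyst); the `field=value` query syntax, the function names, the `events` table, and the `cidr_match_udf` helper referenced in the generated code are all assumptions made up for illustration. The generator only emits PySpark source text, so it runs without a Spark installation.

```python
import ipaddress
import shlex


def cidr_match(ip: str, cidr: str) -> bool:
    """Return True if `ip` lies inside the `cidr` block; False on unparseable input."""
    try:
        return ipaddress.ip_address(ip) in ipaddress.ip_network(cidr, strict=False)
    except ValueError:
        return False


def _is_cidr(value: str) -> bool:
    """Heuristic: the value contains a slash and parses as an IP network."""
    if "/" not in value:
        return False
    try:
        ipaddress.ip_network(value, strict=False)
        return True
    except ValueError:
        return False


def term_to_pyspark(term: str) -> str:
    """Translate one hypothetical `field=value` term into PySpark expression text."""
    field, sep, value = term.partition("=")
    if not sep or not field:
        raise ValueError(f"unsupported term: {term!r}")
    if _is_cidr(value):
        # Assumption: `cidr_match_udf` is `cidr_match` registered as a Spark UDF
        # elsewhere; plain equality cannot express CIDR membership.
        return f"cidr_match_udf(F.col({field!r}), F.lit({value!r}))"
    return f"(F.col({field!r}) == {value!r})"


def query_to_pyspark(query: str, table: str = "events") -> str:
    """Generate PySpark source for a query whose terms are implicitly AND-ed."""
    terms = shlex.split(query)
    if not terms:
        raise ValueError("empty query")
    predicate = " & ".join(term_to_pyspark(t) for t in terms)
    return f"spark.table({table!r}).where({predicate})"


if __name__ == "__main__":
    print(query_to_pyspark("src_ip=10.0.0.0/8 action=blocked"))
```

The generated text assumes the usual `from pyspark.sql import functions as F` import and an active `spark` session in the target notebook. A real translator would parse the query into a syntax tree and handle operators, quoting, and wildcards rather than splitting on whitespace.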
We’ll learn how HSBC’s business benefited from this cutting-edge innovation: less time and fewer resources spent migrating Cyber data processing, improved Cyber threat Incident Response (IR), and fast onboarding of HSBC Cyber Analysts onto Spark with the Cybersecurity Lakehouse platform.