


Download our Financial Services Guide to Data + AI Summit to help plan your Summit experience.


Every year, data leaders, practitioners and visionaries from across the globe and across industries join the Data + AI Summit to discuss the latest trends in big data. For data teams in the financial services industry, we're excited to announce a full agenda of Financial Services sessions. Leaders from Capital One, J.P. Morgan, HSBC, Nasdaq, TD Bank, S&P Global, Nationwide, Northwestern Mutual, BlockFi (crypto) and many more will share how they are using data and machine learning (ML) to digitally transform and make smarter decisions that minimize risk, accelerate innovation and drive sustainable value creation.

Financial Services Forum

Data is at the core of nearly every innovation in the financial services industry. Leaders across banking and capital markets, payment companies and fintechs, insurance and wealth management firms are harnessing the power of data and analytics.

Join us on Tuesday, June 28 at 3:30 PM PT for our Financial Services Forum, our most popular industry event at Data + AI Summit. During our capstone event, you'll have the opportunity to join sessions with thought leaders from some of the biggest global brands.

Featured Speakers:
Jack Berkowitz, Chief Data Officer, ADP
Junta Nakai, Global Industry Lead, Financial Services, Databricks
Paul Wellman, VP, Executive Product Owner, TD Bank
Arup Nanda, Managing Director, CTO Enterprise Cloud Data Ecosystem, J.P. Morgan
Geping Chen, Head of Data Engineering, Geico
Mona Soni, Chief Technology Officer, Sustainable1, S&P Global
Jeff Parkinson, VP, Core Data Engineering, Northwestern Mutual
Christopher Darringer and Shraddha Shah, Point72 Asset Management
Ken Priyadarshi, Global Strategy and Transactions CTO, EY

Financial Services Breakout Sessions

Here's an overview of some of our most highly anticipated Financial Services sessions at this year's summit:

HSBC: Cutting the Edge in Fighting Cybercrime — Reverse-Engineering a Search Language to Cross-Compile It to PySpark
Abigail Shriver, HSBC | Jude Ken-Kwofie, HSBC | Serge Smertin, Databricks

Traditional security information and event management (SIEM) tools do not scale well for data sources with 30TB per day, which led HSBC to create a Cybersecurity Lakehouse with Delta Lake and Apache Spark. In this talk, you'll learn how to implement (or reverse-engineer) a language with Scala and translate it into what Spark understands, the Catalyst engine.
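
To make the cross-compilation idea concrete, here is a minimal Python sketch. It is not HSBC's implementation: the talk describes a Scala-based translator that targets Catalyst, while this example only assumes a hypothetical toy search syntax (space-separated field=value terms, with * as a wildcard) and compiles it to a SQL predicate string. The function name to_spark_predicate and the grammar are illustrative assumptions.

```python
import re

# Hypothetical toy grammar (an assumption, not HSBC's search language):
# space-separated field=value terms, where * in a value is a wildcard.
_TERM = re.compile(r"^([A-Za-z_][A-Za-z0-9_]*)=([A-Za-z0-9.*-]+)$")


def to_spark_predicate(query: str) -> str:
    """Translate a toy search query into a SQL predicate string."""
    clauses = []
    for term in query.split():
        match = _TERM.match(term)
        if match is None:
            # Reject anything outside the toy grammar instead of guessing.
            raise ValueError(f"unsupported term: {term!r}")
        field, value = match.groups()
        if "*" in value:
            # Wildcards become a LIKE pattern.
            pattern = value.replace("*", "%")
            clauses.append(f"{field} LIKE '{pattern}'")
        else:
            clauses.append(f"{field} = '{value}'")
    if not clauses:
        raise ValueError("empty query")
    # All terms must hold, so they are joined with AND.
    return " AND ".join(clauses)


predicate = to_spark_predicate("status=500 host=web*")
print(predicate)  # status = '500' AND host LIKE 'web%'
```

With PySpark available, a string like this could be passed to DataFrame.filter, which accepts SQL expression strings. A production translator would more likely parse the query into a syntax tree and emit structured expressions than concatenate strings.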

Toward Dynamic Microstructure: The Role of ML in the Next Generation of Exchanges
Michael O'Rourke, SVP, Engineering & AI/ML, Nasdaq | Douglas Hamilton, AVP, Machine Intelligence Lab

What role will AI and ML play in ensuring the efficiency and transparency of the next generation of markets? In this session, Douglas and Michael will show how Nasdaq is building dynamic microstructures that reduce the inherent frictions associated with trading, and give insights into their application across industries.

FutureMetrics: Using Deep Learning to Create a Multivariate Time Series Forecasting
Matthew Wander, Data Scientist, TD Bank

Liquidity forecasting is one of the most essential activities at any bank. TD Bank, the largest of the Big Five based in Canada, has to provide liquidity for half a trillion dollars in products, and forecast it to remain within a $5BN regulatory buffer. The use case was to predict liquidity growth over short to moderate time horizons: 90 days to 18 months. Models must perform reliably in a strict regulatory framework, and accordingly, validating such a model to the required standards is a key area of focus for this talk.

Domain-Driven Data (3D) Lakehouse for Insurance
Kiran Karnati, AVP, Data Management, Enterprise Data Office, Nationwide Insurance

What is a 3D lakehouse? Your data lakehouse is only as strong as the weakest data pipeline flowing through it. In this talk, Kiran explains how the most successful lakehouse implementations are those with non-monolithic, modularized data domain products, implemented as a unified trusted data platform enabling business intelligence, AI/ML and downstream consumption use cases, all from the same platform.

Protecting Personally Identifiable Information (PII)/PHI Data in Data Lake via Column Level Encryption
Keyuri Shah, Lead Engineer, Northwestern Mutual Insurance

Data breaches are a concern for any company that collects data, including Northwestern Mutual. Every measure is taken to avoid identity theft and fraud for customers; however, these preventive methods are still not sufficient if the security perimeter around the data is not updated periodically. Multiple layers of encryption are the most common approach to avoiding breaches, but unauthorized internal access to this sensitive data still poses a threat.

How Robinhood Built a Streaming Lakehouse to Bring Data Freshness From 24 Hours to Less Than 15 Minutes
Balaji Varadarajan, Robinhood Markets | Vikrant Goel, Robinhood

Robinhood's data lake is the bedrock foundation that powers business analytics, product experimentation and other machine learning applications throughout the organization. Come join this session where the speakers share their journey of building a scalable streaming data lakehouse with Spark, Postgres and other leading open source technologies.

Building an Operational ML Organization From Zero for Cryptocurrency
Anthony Tellez, BlockFi | Brennan Lodge, BlockFi

BlockFi is a cryptocurrency platform that allows its clients to grow wealth through various financial products, including loans, trading and interest accounts. In this presentation, the speakers showcase their journey of adopting Databricks to build an operational nerve center for analytics across the company.

A Modern Approach to Big Data in Finance
Bill Dague, Nasdaq | Leonid Rosenfeld, Nasdaq

In this live demonstration of Delta Sharing combined with Nasdaq Data Fabric, the speakers address the unique challenges associated with working with big data for finance (volume of data, disparate storage, variable sharing protocols). Leveraging open source technologies, like Databricks Delta Sharing, in combination with a flexible data management stack allows Nasdaq to be more nimble in testing and deploying more strategies.

Running a Low-Cost, Versatile Data Management Ecosystem With Apache Spark™ at Core
Shariff Mohammed, Capital One

This presentation demonstrates how Capital One built an ETL data processing ecosystem entirely on the AWS cloud with Spark at its core. Data engineers need to be skilled in only one framework (Apache Spark), while pipeline code can be executed on AWS EC2 or EMR to optimize distributed computing. This presentation also demonstrates how a UI-based ETL tool built with Spark as a back end can run on the same infrastructure, which improves ease of development and maintenance.

Check out the full list of Financial Services talks at Summit.

Demos on Popular Data + AI Use Cases in Financial Services

Hyper-Personalization at Scale
Rapidly Deploy Data Into Value-at-Risk Models
Claims Automation
Metadata Ingestion Framework

Sign up for the Financial Services Experience at Summit!
