Data + AI Summit, the largest gathering of the open source data and AI community, returned as a hybrid event at the Moscone Center from June 27-30. From incredible breakout sessions by global financial services institutions (FSIs) like Northwestern Mutual, Nasdaq, and HSBC to mainstage keynotes from Intuit and Coinbase and live interactive demos from Databricks solution architects and partners like Avanade, Deloitte, and Confluent, we heard about innovation with data and AI in ways previously unimagined in financial services.
Financial Services Forum
For our financial services attendees, the most exciting part of Data + AI Summit was the Financial Services Forum – a two-hour event that brought together leaders from across global brands in banking, insurance, capital markets and fintech to share innovative data and AI use cases from their data journeys. With the theme of “The Future of Financial Services Is Open With Data and AI at Its Core,” Junta Nakai, Global Head of Financial Services and Sustainability Leader at Databricks, spoke of the need for FSIs to have a short and clear path to “AI in action” to preserve margins, find new revenue streams, and shorten development timelines, given today’s prolonged inflationary environment. He also gave an overview of the Databricks Lakehouse for Financial Services, a single platform that brings together all data and analytics workloads to power transformative innovations in modern financial services institutions.
In their opening keynote, TD Bank’s Paul Wellman (Executive Product Owner, Data as a Service) and Upal Hossain (AVP, Data as a Service), shared their data transformation journey and accelerated transition to the cloud, migrating over 100 million files and ~8,000 ETL jobs at petabyte scale with Delta Lake and the Azure cloud.
Later, Geping Chen, Head of Data Engineering from GEICO, talked about personalization and the use of Telematics IoT in auto insurance as the biggest upcoming trend in the financial services industry – among other exciting industry topics.
Attendees also learned best practices for achieving business outcomes with data + AI regarding people, process, and technology from Jack Berkowitz, Chief Data Officer, ADP; Jeff Parkinson, VP, Core Data Engineering, Northwestern Mutual; Ken Priyadarshi, AI Leader, EY Global; Gary Jones, Chief Data Engineer, and Mona Soni, Chief Technology Officer, S&P Global Sustainable1; Arup Nanda, Managing Director, CTO Enterprise Cloud Data Ecosystem, JP Morgan; and Christopher Darringer, Lead Engineer, and Shraddha Shah, Data Engineer, Point72 Asset Management.
Financial Services Breakout Sessions and Demos
Check out these financial services breakout sessions to hear from our customers about the business benefits, cost and productivity savings, and advanced analytics they’re now able to realize with Databricks:
- Nasdaq: A Modern Approach to Big Data
- HSBC: Cutting the Edge in Fighting Cybercrime
- Capital One: Running a Data Management System
- How Robinhood Built a Streaming Lakehouse
- Deloitte: AI Fueled Forecasting, The Next Generation of Financial Planning
5 Key Announcements That Will Transform the Financial Services Industry
1. Integrated data governance with Databricks Unity Catalog (GA expected in coming weeks after DAIS).
We announced the upcoming GA of Unity Catalog (UC), which allows customers to enable fine-grained access controls on data and meet their privacy requirements. Unity Catalog is the catalog and governance layer for Databricks Lakehouse and offers a range of capabilities, including:
- The best way to secure access to data in Databricks across all compute/language types
- The best way to allow for secure data sharing, powered by Delta Sharing on Databricks
- A single source of truth for data and access control across Databricks workspaces
- Easy to control with SQL based grants to give people/groups/principals access to data
- API-driven to help with easy automation and workflow processes
For financial institutions, UC provides the ability to centralize catalog management at the account level (i.e., across multiple clouds). Unity Catalog also makes it easy to automate discovery and lineage – automated lineage is something even the biggest players in the cataloging space still struggle with.
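To make the "SQL-based grants" idea concrete, here is a minimal conceptual sketch of fine-grained, principal-based access control modeled with a plain in-memory structure. The `Grants` class and all names in it are illustrative assumptions for this post, not the actual Unity Catalog API; in Databricks the equivalent would be SQL statements like `GRANT SELECT ON TABLE ... TO ...`.

```python
# Conceptual sketch of SQL-style grants on securable objects.
# The Grants class is hypothetical -- not the Unity Catalog API.
from collections import defaultdict

class Grants:
    def __init__(self):
        # (principal, securable) -> set of privileges
        self._acl = defaultdict(set)

    def grant(self, privilege, securable, principal):
        # Mirrors: GRANT SELECT ON TABLE main.finance.trades TO `analysts`
        self._acl[(principal, securable)].add(privilege)

    def revoke(self, privilege, securable, principal):
        # Mirrors: REVOKE SELECT ON TABLE ... FROM `analysts`
        self._acl[(principal, securable)].discard(privilege)

    def is_allowed(self, principal, privilege, securable):
        return privilege in self._acl[(principal, securable)]

grants = Grants()
grants.grant("SELECT", "main.finance.trades", "analysts")
print(grants.is_allowed("analysts", "SELECT", "main.finance.trades"))  # True
print(grants.is_allowed("analysts", "MODIFY", "main.finance.trades"))  # False
```

The point of centralizing this at the account level is that the same grant applies across every workspace and compute type, rather than being re-declared per cluster or per cloud.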
2. MLflow 2.0 – Personalization gets a boost from model serving and MLflow Pipelines, making model development and deployment fast and scalable.
MLflow Pipelines enable data scientists to create production-grade ML pipelines that combine modular ML code with software engineering best practices, making model development and deployment fast and scalable. The new model monitoring features will be especially impactful for FSIs: financial institutions commonly maintain a significant number of models, particularly at the global scale of retail and institutional businesses, and proper model drift monitoring becomes impossible without an automated framework.
MLflow Pipelines will help improve model governance frameworks because FSIs can now apply CI/CD practices around constructing and managing ML model infrastructure setup.
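To illustrate what automated drift monitoring checks, here is a minimal, framework-free sketch using the Population Stability Index (PSI), one common drift metric in financial services. The bin count, epsilon, and the conventional thresholds (roughly 0.1 for "watch", 0.25 for "act") are illustrative assumptions; this is not part of MLflow Pipelines itself.

```python
# Minimal sketch of model drift detection via the Population
# Stability Index (PSI). Higher PSI = more distribution shift.
import math

def psi(expected, actual, bins=10):
    """Compare two score distributions; returns a non-negative drift score."""
    lo = min(min(expected), min(actual))
    hi = max(max(expected), max(actual))
    width = (hi - lo) / bins or 1.0

    def frac(xs):
        counts = [0] * bins
        for x in xs:
            idx = min(int((x - lo) / width), bins - 1)
            counts[idx] += 1
        # Small epsilon avoids log/divide-by-zero on empty bins.
        return [(c + 1e-6) / (len(xs) + 1e-6 * bins) for c in counts]

    e, a = frac(expected), frac(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

baseline = [0.1 * i for i in range(100)]        # training-time scores
shifted  = [0.1 * i + 3.0 for i in range(100)]  # drifted production scores
print(psi(baseline, baseline) < 0.1)   # identical data: negligible drift
print(psi(baseline, shifted) > 0.25)   # shifted data: actionable drift
```

An automated framework runs a check like this per model on a schedule and alerts or retrains when the score crosses a threshold, which is exactly what becomes untenable to do by hand across hundreds of models.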
3. Delta Lake will be fully open source.
Delta Lake 2.0 lowers the barrier to entry for adopting a Lakehouse architecture. As organizations think about on-prem or Hadoop migrations to the Databricks Lakehouse, they can use a consistent foundation to make Lakehouse a simpler transition. Moving to the Lakehouse has never been easier and for workloads that are not yet on Databricks, organizations can show the total cost of ownership (TCO) savings. Other benefits include:
- With Delta Lake 2.0, users can now reap the benefits of better performance (3.5x better overall performance compared to other solutions) to save on computation costs
- Delta Lake 2.0 offers an unrivaled level of maturity and proven real-world performance (663% increase in contributor strength over the past 3 years)
- Delta Lake 2.0 will open source all APIs, including OPTIMIZE and ZORDER – FSIs are no longer forced to sacrifice performance or choose alternatives like Apache Iceberg due to limited functionality
- Anyone can now achieve a simplified architecture with Delta’s all-encompassing framework – no need to leverage third-party services for features like data sharing
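For readers unfamiliar with what ZORDER does, the idea behind Z-ordering can be sketched in a few lines: interleaving the bits of several column values produces a single sort key that keeps rows with nearby values in multiple columns physically close together, which is what enables data skipping. This toy function is only an illustration of the technique, not Delta Lake's implementation.

```python
# Toy illustration of Z-ordering: bit interleaving of column values.

def z_order_key(values, bits=16):
    """Interleave the bits of each non-negative integer into one key."""
    key = 0
    for bit in range(bits):
        for i, v in enumerate(values):
            key |= ((v >> bit) & 1) << (bit * len(values) + i)
    return key

rows = [(3, 7), (3, 8), (100, 2), (4, 7)]
# Sorting by the interleaved key clusters rows near (3-4, 7-8) together,
# so a query filtering on either column can skip distant files.
print(sorted(rows, key=z_order_key))  # → [(3, 7), (4, 7), (3, 8), (100, 2)]
```

Delta Lake applies this ordering when rewriting files during `OPTIMIZE ... ZORDER BY`, so multi-column range filters touch far fewer files.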
4. Next generation streaming–Project Lightspeed–is a game-changer for Financial Services, leveraging fresh data for insight generation.
Project Lightspeed is one of the most significant streaming announcements Databricks has made, building on streaming's long-standing role as a large and successful part of our business. It improves performance to achieve higher throughput, lower latency, and lower cost; improves ecosystem support for connectors; enhances functionality for processing data with new operators and APIs; and simplifies deployment, operations, monitoring, and troubleshooting.
For example, two major themes in Financial Services that are driving business today are personalization and regulatory reporting. The first category includes use cases from personalized insurance pricing to next-best-action. Regulatory reporting may include trade reporting or clearing and settlement. The key to unlocking the use cases above is the ability to stream data sources and process them in near real-time. Any FSI looking to make advances on these use cases will require these cutting-edge, native streaming capabilities.
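The core streaming pattern these use cases rely on can be sketched without any framework: consume an unbounded event source and emit aggregates per time window as windows close. In production this would be Spark Structured Streaming with its windowing operators; the tumbling-window size and the `(timestamp, symbol)` event shape below are assumptions for illustration.

```python
# Framework-free sketch of tumbling-window streaming aggregation,
# e.g. per-symbol trade counts for trade reporting.
from collections import defaultdict

def windowed_counts(events, window_seconds=60):
    """Yield (window_start, symbol, trade_count) per tumbling window."""
    counts = defaultdict(int)
    current_window = None
    for ts, symbol in events:  # events: iterable of (epoch_seconds, symbol)
        window = ts - ts % window_seconds
        if current_window is not None and window != current_window:
            # Window closed: emit its aggregates, then reset state.
            for sym, n in sorted(counts.items()):
                yield (current_window, sym, n)
            counts.clear()
        current_window = window
        counts[symbol] += 1
    for sym, n in sorted(counts.items()):  # flush the final window
        yield (current_window, sym, n)

trades = [(0, "AAPL"), (10, "MSFT"), (30, "AAPL"), (65, "AAPL")]
print(list(windowed_counts(trades)))
# → [(0, 'AAPL', 2), (0, 'MSFT', 1), (60, 'AAPL', 1)]
```

A real engine adds what this sketch omits, and what Project Lightspeed targets: out-of-order events, watermarks, fault-tolerant state, and low-latency delivery of results downstream.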
5. Databricks Marketplace helps data consumers turn data into insights quicker and supports data providers' growth as they distribute and monetize data assets.
Databricks Marketplace is an open marketplace for exchanging data assets such as datasets, notebooks, dashboards, and machine learning models. To accelerate insights, data consumers can discover, evaluate, and access more data products from third-party vendors than ever before.
Financial Services institutions can more seamlessly monetize data and build alternative revenue streams (e.g., monetizing unique datasets and models) for a broad audience. The Databricks Marketplace sets the stage for FSIs to treat data as an asset on the balance sheet.
Beyond these featured announcements, there were other exciting announcements like Databricks Serverless Model Endpoints, Project Enzyme, Delta Sharing and Data Cleanrooms. We encourage you to check out the Day 1 and Day 2 keynotes to learn more about our product announcements.