Session

Petabyte-Scale On-Chain Insights: Real-Time Intelligence for the Next-Gen Financial Backbone

Overview

Tuesday, June 10, 12:20 pm

Experience: In Person
Type: Lightning Talk
Track: Data Lakehouse Architecture and Implementation
Industry: Enterprise Technology, Financial Services
Technologies: Apache Spark, Delta Lake, Unity Catalog
Skill Level: Beginner
Duration: 20 min

We’ll explore how CipherOwl Inc. built a near real-time, multi-chain data lakehouse to power anti-money laundering (AML) monitoring at petabyte scale. We will walk through the end-to-end architecture, which combines open-source technologies and AI-driven analytics to handle massive on-chain data volumes, complemented by off-chain intelligence to meet rigorous AML requirements.

At the core of our solution is ChainStorage, an open-source project started by Coinbase that provides robust blockchain data ingestion and block-level serving. We enhanced it with Apache Spark™ and Arrow™ for high-throughput processing and efficient data serialization, backed by Delta Lake and Kafka. For the serving layer, we use StarRocks to deliver fast SQL analytics over large datasets. Finally, our system incorporates machine learning and AI agents for continuous data curation and near real-time insights, which are crucial for tackling on-chain AML challenges.

Session Speakers

Leo Liang

CEO / Co-Founder
CipherOwl Inc.