Ensuring Correct Distributed Writes to Delta Lake in Rust with Formal Verification
- Ingegneria dei dati
- Media and Entertainment
- Moscone South | Upper Mezzanine | 155
- 35 min
Delta-rs is a delta lake implementation in pure Rust. One very liberating feeling working with the Rust language is once the compiler accepts your code, it guarantees the program will not run into any memory bug at runtime. This means the mental load of writing highly concurrent and parallel native programs in Rust is a lot lower compared to other system level programming languages like C and C++.
However, one can still trick the compiler by introducing logical bugs in the code. For data systems like Delta Lake, logical bugs could lead to serious data corruption and quality issues. To guard against this, the delta-rs community decided to introduce formal verification methods in our development workflow. We first started with the widely adopted TLA+ project, then switched to a new and raising model checking tool called stateright.
In this talk I will cover what are the common formal verification tools used in distributed system designs and implementations. Then I will discuss our learnings from adopting TLA+ and stateright in delta-rs. How we leverage them to discover nuance bugs and verify our fixes. I will also explain why I think stateright is a much better and more practical formal verification tool for projects like delta-rs compared to the more popular ones like TLA+.
Combining both Rust and formal verification in software development is a very powerful combo because it means you can build softwares that are guaranteed to be free of both memory and logical bugs.