Batches, Streams, and Everything in Between: Unifying Batch and Stream Storage with Apache Pulsar and Lakehouse Architectures
- Data Lakes, Data Warehouses and Data Lakehouses
- Moscone South | Level 2 | 202
- 35 min
Delta Lake and Lakehouse architectures have been instrumental in providing a better foundation for handling streaming data and data deltas through an open industry standard. The rapid growth of the ecosystem is a testament to the success of this approach. However, challenges remain in building a data platform that lets teams process all data as streams, regardless of the data's age, while also viewing all streams as tables without exporting data out of the streaming system.
In this talk, we will take a hands-on look at how Apache Pulsar is building its core storage engine on the concepts of Lakehouse architectures, allowing teams to build data platforms that manage data over its entire lifecycle and enabling data to be consumed as either a stream or a table. With these capabilities, we will show how Pulsar + Delta Lake empowers teams, regardless of toolset, to focus on driving value from data, not just managing it.