Data + AI Summit 2022

Moving to the Lakehouse: Fast & Efficient Ingestion with Auto Loader

On Demand


  • Session
  • Virtual
  • Data Lakes, Data Warehouses and Data Lakehouses
  • Intermediate




Auto Loader, the most popular tool for incremental data ingestion from cloud storage into the Databricks Lakehouse, is used in our biggest customers’ ingestion workflows. It is our all-in-one solution for exactly-once processing, offering efficient file discovery, schema inference and evolution, and fault tolerance.

In this talk, we delve into key features of Auto Loader, including:

• Avro schema inference

• Rescued column

• Semi-structured data support

• Incremental listing

• Asynchronous backfilling

• Native listing

• File-level tracking and observability
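
Several of the features above surface as reader options on the `cloudFiles` streaming source. The following is a minimal sketch rather than the configuration shown in the talk: it assumes a Databricks runtime where a `spark` session is available, and the helper names (`auto_loader_options`, `start_ingestion`), paths, and table name are hypothetical. The option keys follow the Auto Loader documentation as far as we know, so check them against your runtime version.

```python
def auto_loader_options(schema_location, file_format="json"):
    """Build reader options for an Auto Loader (cloudFiles) stream.

    Helper name and defaults are illustrative assumptions.
    """
    return {
        # Format of the incoming files (json, csv, avro, parquet, ...).
        "cloudFiles.format": file_format,
        # Where the inferred schema and its later versions are stored.
        "cloudFiles.schemaLocation": schema_location,
        # Infer column types instead of reading every column as a string.
        "cloudFiles.inferColumnTypes": "true",
        # Add newly seen columns to the schema as the data evolves.
        "cloudFiles.schemaEvolutionMode": "addNewColumns",
        # Let Auto Loader decide when incremental listing applies.
        "cloudFiles.useIncrementalListing": "auto",
        # Periodically run an asynchronous backfill to catch missed files.
        "cloudFiles.backfillInterval": "1 day",
    }


def start_ingestion(spark, source_path, target_table, checkpoint_path):
    """Start an incremental ingestion stream into a Delta table.

    Requires a Databricks runtime; `spark` is the active SparkSession.
    """
    reader = spark.readStream.format("cloudFiles")
    for key, value in auto_loader_options(checkpoint_path).items():
        reader = reader.option(key, value)

    return (
        reader.load(source_path)
        .writeStream
        # The checkpoint records which files were already processed.
        .option("checkpointLocation", checkpoint_path)
        # Process everything currently available, then stop.
        .trigger(availableNow=True)
        .toTable(target_table)
    )
```

On Databricks, a call might look like `start_ingestion(spark, "/mnt/raw/events", "bronze.events", "/mnt/checkpoints/events")`, where all three locations are placeholders. With schema inference enabled, values that do not match the inferred schema are expected to land in the rescued data column (named `_rescued_data` by default) instead of being dropped.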

Auto Loader is also used in other Databricks features such as Delta Live Tables. We will discuss the architecture, provide a demo, and feature an Auto Loader customer speaking about their experience migrating to Auto Loader.

Session Speakers

Benyue Liu


Eric Maynard

