Moving to the Lakehouse: Fast & Efficient Ingestion with Auto Loader
- Data Lakes, Data Warehouses and Data Lakehouses
Auto Loader, the most popular tool for incremental data ingestion from cloud storage into the Databricks Lakehouse, powers the ingestion workflows of our biggest customers. It is our all-in-one solution for exactly-once processing, offering efficient file discovery, schema inference and evolution, and fault tolerance.
In this talk, we delve into key features of Auto Loader, including:
• Avro schema inference
• Rescued column
• Semi-structured data support
• Incremental listing
• Asynchronous backfilling
• Native listing
• File-level tracking and observability
Auto Loader is also used in other Databricks features such as Delta Live Tables. We will discuss the architecture, provide a demo, and hear from a customer about their experience migrating to Auto Loader.