Predictive Optimization (PO) improves the performance of Unity Catalog managed tables by automatically optimizing their data layouts, which speeds up queries and lowers storage costs. Since its General Availability, more than 2,400 customers have used PO to get optimized data layouts out of the box. So far, PO has compacted ~14 PB of data and vacuumed more than 130 PB, which shows that it can manage and optimize very large data volumes.
Explore how Predictive Optimization within the lakehouse architecture can effectively reduce your storage costs by 2x and enhance query performance by as much as 20x.
Predictive Optimization in Databricks automates table management by leveraging Unity Catalog and the Data Intelligence Platform. It currently runs optimizations for Unity Catalog managed tables, including the compaction and vacuum operations mentioned above.
Previously, these optimization functions were limited to closed file formats in traditional data warehouses. As the first managed solution to offer table maintenance for open table formats, Predictive Optimization eliminates the need for manual, repetitive table optimization tasks. Because it is tailored to the lakehouse architecture, PO lets data teams focus on deriving actionable insights from their data rather than on the overhead of table optimization.
Our AI-driven performance enhancements analyze query patterns alongside data layout, table properties, and performance factors to determine the most impactful optimizations. Predictive Optimization carefully assesses each operation, only running those that deliver cost-effective benefits.
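To make the idea of running only cost-effective operations concrete, here is a minimal sketch in Python. It is not how Predictive Optimization is implemented: the `CandidateOperation` class, the `select_operations` function, the benefit and cost estimates, and the table names are all hypothetical and exist only for illustration. The sketch assumes that each candidate operation comes with an estimated benefit and an estimated cost in the same units, and it keeps only the operations whose benefit outweighs their cost.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateOperation:
    """A hypothetical maintenance operation with made-up estimates."""
    table: str
    operation: str            # e.g. "compaction" or "vacuum"
    estimated_benefit: float  # expected savings, in arbitrary cost units
    estimated_cost: float     # expected compute cost, in the same units


def select_operations(candidates, min_ratio=1.0):
    """Keep operations whose benefit/cost ratio exceeds min_ratio, best first."""
    worthwhile = [
        c for c in candidates
        if c.estimated_cost > 0
        and c.estimated_benefit / c.estimated_cost > min_ratio
    ]
    return sorted(
        worthwhile,
        key=lambda c: c.estimated_benefit / c.estimated_cost,
        reverse=True,
    )


# Illustrative numbers only; they are not measurements from any real workload.
candidates = [
    CandidateOperation("sales.orders", "compaction", estimated_benefit=12.0, estimated_cost=3.0),
    CandidateOperation("sales.orders", "vacuum", estimated_benefit=5.0, estimated_cost=1.0),
    CandidateOperation("sales.returns", "compaction", estimated_benefit=0.5, estimated_cost=2.0),
]

selected = select_operations(candidates)
for op in selected:
    print(f"run {op.operation} on {op.table}")
```

In this toy version, the compaction on `sales.returns` is skipped because its estimated cost exceeds its estimated benefit, while the two operations on `sales.orders` are kept and ordered by their benefit-to-cost ratio.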
Let’s look at a typical customer workload. After customers ingest data into their tables, PO learns from the query patterns on that data and applies optimizations to both tables.
Read on to see the impact that Predictive Optimization has on these workloads.

