ML teams are under pressure to move faster, but fragmented experiment data makes that impossible. When experiment tracking is scattered across workspaces and APIs, even simple questions become hard to answer: Which models are improving? Where are we wasting GPU cycles? How many runs failed this week?
Without unified visibility, ML leaders can’t see performance trends or spot regressions early. The result: slower iteration, higher costs, and models that take longer to reach production.
That’s why we developed MLflow System Tables.
Users can now query MLflow experiment and run tracking data from the system.mlflow.* tables in Unity Catalog, enabling large-scale analysis of experiment data across all workspaces in a region.
Previously, MLflow data lived only behind workspace-scoped APIs. To analyze it at scale, users had to iterate through workspaces and experiments, making many round-trip calls to the MLflow API. With system tables, all of your experiment metadata across workspaces can be queried directly in Unity Catalog, as shown in the sketch below.
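For example, one of the questions above, how many runs failed this week, becomes a single query. The sketch below assumes the runs_latest table exposes columns such as workspace_id, status, and a timestamp start_time; check the actual schema in your environment before relying on these names.

```sql
-- Count failed runs from the past 7 days, broken out by workspace.
-- Column names (workspace_id, status, start_time) are assumptions about the
-- runs_latest schema; verify them with DESCRIBE TABLE before use.
SELECT
  workspace_id,
  COUNT(*) AS failed_runs
FROM system.mlflow.runs_latest
WHERE status = 'FAILED'
  AND start_time >= current_timestamp() - INTERVAL 7 DAYS
GROUP BY workspace_id
ORDER BY failed_runs DESC;
```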
Instead of spending time developing custom solutions to wrangle your data, you can focus on the important part: building better models.
MLflow system tables reflect the data already available in the MLflow UI, presenting it in a structured, queryable form.
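You can browse what's available directly from SQL before writing any analysis queries; the commands below are standard catalog inspection statements, so no column names need to be assumed.

```sql
-- List the MLflow system tables available in your region
SHOW TABLES IN system.mlflow;

-- Inspect the columns of a table before querying it
DESCRIBE TABLE system.mlflow.runs_latest;
```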
ML teams often struggle to understand whether experiments are running successfully across multiple workspaces. Tracking success rates or failure trends means manually checking individual MLflow experiments — a slow, error-prone process that hides instability patterns until it’s too late. Using the runs_latest table, teams can now monitor success ratios across all experiments and set SQL-based alerts to detect when reliability drops below a defined threshold (for example, 90%). This turns manual checks into automated oversight.
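Here is a minimal sketch of such a reliability query. It assumes runs_latest exposes experiment_id, status, and start_time columns and that MLflow's standard terminal statuses (FINISHED, FAILED, KILLED) apply; adjust both to match your schema and conventions.

```sql
-- Per-experiment success ratio over the last 7 days, surfacing experiments below 90%.
-- Column names and status values are assumptions about the runs_latest schema.
SELECT
  experiment_id,
  COUNT(*) AS terminal_runs,
  AVG(CASE WHEN status = 'FINISHED' THEN 1.0 ELSE 0.0 END) AS success_ratio
FROM system.mlflow.runs_latest
WHERE start_time >= current_timestamp() - INTERVAL 7 DAYS
  AND status IN ('FINISHED', 'FAILED', 'KILLED')   -- only count runs that have ended
GROUP BY experiment_id
HAVING AVG(CASE WHEN status = 'FINISHED' THEN 1.0 ELSE 0.0 END) < 0.9
ORDER BY success_ratio ASC;
```

Wired into a SQL alert, a query like this can notify the team as soon as any experiment's reliability drops below the threshold.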
Teams can catch failed runs and unstable pipelines hours earlier, saving valuable engineering time and reducing wasted training compute. Reliability metrics can even feed into unified ML observability dashboards that track model performance alongside data quality and infrastructure KPIs.
To kickstart monitoring, we provide a starter dashboard that visualizes experiment and run details; you can import it into your workspace and then tailor it to your needs. The dashboard is organized into tabs covering different views of your experiments and runs.
It is also often challenging to analyze metrics such as resource utilization and model performance across many experiments, because the data is scattered. System metrics like GPU utilization and model evaluation metrics live in separate runs, making it hard to see where resources are being wasted or models are underperforming.
By combining the runs_latest and run_metrics_history tables, you can track key metrics across workspaces. The example below computes per-experiment aggregates across all runs, enabling high-level monitoring of system metrics like GPU utilization alongside model metrics.
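A sketch of that query follows. The join key (run_id), the metric columns (metric_name, metric_value), and the metric keys themselves (MLflow's system/gpu_utilization_percentage and a hypothetical val_accuracy) are assumptions about the table schemas and your logging setup; substitute whatever your runs actually record.

```sql
-- Per-experiment averages of a system metric and a model metric across all runs.
-- Column and metric names are assumptions; adjust to your schemas and logged metrics.
SELECT
  r.experiment_id,
  COUNT(DISTINCT r.run_id) AS runs,
  AVG(CASE WHEN m.metric_name = 'system/gpu_utilization_percentage'
           THEN m.metric_value END) AS avg_gpu_utilization,
  AVG(CASE WHEN m.metric_name = 'val_accuracy'
           THEN m.metric_value END) AS avg_val_accuracy
FROM system.mlflow.runs_latest AS r
JOIN system.mlflow.run_metrics_history AS m
  ON r.run_id = m.run_id
GROUP BY r.experiment_id
ORDER BY avg_gpu_utilization ASC;
```

Sorting by GPU utilization puts the experiments with the most idle hardware at the top of the result.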
With this unified view, data scientists can detect anomalies, evaluate training performance, or even join evaluation metrics with online served model data in inference tables for deeper insights. Teams gain visibility into whether compute resources are being used effectively and can catch unusual model behavior earlier, leading to tighter feedback loops, more efficient use of infrastructure, and high-quality models in production.
Finally, while SQL queries are powerful, they're not always straightforward for everyone who could benefit from understanding ML data. With an AI/BI Genie space, you can add the MLflow system tables as data sources and start getting insights into your model performance. Genie translates natural-language questions into equivalent SQL queries and generates relevant visualizations, making quick exploration accessible to all users. You can prompt it with follow-up questions for deeper analysis.
With all the lakehouse tooling available on top of system tables, it's easier than ever to extract insights from your experiment run tracking data. The MLflow System Tables Public Preview is available in all regions and contains data starting from September 2nd. To begin, your account admin needs to grant you read access to the tables using UC tooling such as group privileges or row-level permissions on a dynamic view; a sketch of this step follows below. (For more details, please see the official docs.) Afterwards, there are two easy ways to get started: import the starter dashboard described above, or add the tables to an AI/BI Genie space.
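As a sketch of the access-granting step mentioned above (the group name, view location, and workspace filter below are hypothetical, and the exact grants your admin uses may differ):

```sql
-- Grant a group read access to the MLflow system tables.
-- The group name `ml-platform-team` is hypothetical.
GRANT USE SCHEMA ON SCHEMA system.mlflow TO `ml-platform-team`;
GRANT SELECT ON SCHEMA system.mlflow TO `ml-platform-team`;

-- Or expose only a subset of the data through a dynamic view with row-level filtering.
-- The view's catalog/schema and the workspace filter are hypothetical.
CREATE OR REPLACE VIEW main.ml_analytics.team_mlflow_runs AS
SELECT *
FROM system.mlflow.runs_latest
WHERE is_account_group_member('ml-platform-team')  -- row-level permission check
  AND workspace_id = '1234567890';                 -- filter to a specific workspace
```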
We highly recommend exploring everything system tables unlock for your MLflow data, and we look forward to your feedback!
