Announcing Inference Tables: Simplified Monitoring and Diagnostics for AI models

Published: October 5, 2023

by Ahmed Bilal, Steven Chen, Kasey Uhlenhuth, Joshua Hartman, Alkis Polyzotis and Viswesh Periyasamy

Have you ever deployed an AI model, only to discover it's delivering unexpected results in a real-world setting? Monitoring models is as crucial as their deployment. That's why we're excited to introduce Inference Tables to simplify monitoring and diagnostics for AI models. Inference Tables enable you to continuously capture input and predictions from Databricks Model Serving endpoints and log them into a Unity Catalog Delta Table. You can then leverage existing data tools, including Lakehouse Monitoring, to monitor, debug, and optimize your AI models.

Inference Tables are a great example of the value you get when doing AI on a Lakehouse platform. Without any additional complexity and cost, you can enable monitoring on every deployed model. This allows you to detect issues early and take immediate action, such as by retraining, to consistently achieve the best business outcomes from your AI models.

What is an Inference Table?

Inference Tables are a Unity Catalog-managed Delta Table that stores online prediction requests and responses from model serving endpoints. Once enabled, Inference Tables become a powerful tool for various use cases, including monitoring, debugging, creation of training corpora, and compliance audits. Instead of creating separate tools for each purpose, Inference Tables provide a table representation of a model so that consistent governance and monitoring tools can be used across all stages of ML.

You can simply enable Inference Tables on both existing and new model-serving endpoints with a single click. Prediction requests and responses will begin logging into the Unity Catalog Governed table, making it easy to discover, govern, and share these Tables with teams.

What can you do with Inference Tables?

Monitor Model Quality with Lakehouse Monitoring

You can continuously monitor your model performance and data drift using Lakehouse Monitoring—the first monitoring solution designed for both AI and data assets. Lakehouse Monitoring automatically generates data and ML quality dashboards which can be easily shared with stakeholders. Additionally, you can enable alerts to know when you need to retrain your model based on fundamental shifts in data or reductions in model performance. Thanks to this simplification, quality no longer has to remain an afterthought and can be enabled on all endpoints.

Debug Production Issues

Inference Tables simplify the debugging process by logging all the important data you need: HTTP status codes, model execution times, and other endpoint-specific details. The best part? There's no need to learn new tools. You can debug endpoints by analyzing the Inference Table with queries in Databricks SQL or use Python in a notebook for advanced post-analysis.

You can also replay past traffic to new models with the historical data in Inference Tables. With this capability, you can easily generate a side-by-side comparison of model latencies. You can further tie this data into our LLM evaluation suite to evaluate different LLMs for safety and quality.

Create Training Corpus to improve AI models

With your data in the Unity Catalog, you can easily join Inference Tables with ground truth labels to create a robust training corpus for improving your models. If you don't have labels, you can generate them through our partnership with Labelbox and easily combine them with Inference Tables. You can then fine-tune these models or even automate retraining using Databricks Workflows. This continuous feedback loop ensures you always obtain the best predictions from your model.

Next Steps

Dive deeper into the Inference Tables documentation (AWS | Azure).
Learn more about Databricks' approach to Monitoring here

What's next?

November 20, 2024/4 min read

Introducing Predictive Optimization for Statistics

November 21, 2024/3 min read