What is explainable AI (XAI)?

by Databricks Staff

  • Explainable AI (XAI) helps teams understand how models reach decisions, making AI outputs easier to trust, debug, audit and defend.
  • The right XAI method depends on the model type, use case and audience, from SHAP and LIME for feature-level explanations to Grad-CAM, saliency maps and attention visualization for image and language models.
  • XAI methods are useful but imperfect. Most explanations are approximations, so teams should validate results, combine methods and connect explanations to governance workflows through tools like MLflow and Unity Catalog.

Explainable AI, or XAI, refers to techniques that help people understand how an AI system arrived at a specific output. It is especially relevant to machine learning and deep learning, where models learn patterns from data instead of following human-written rules.

As models become more powerful, their decisions can become harder to trace. Deep learning models may contain billions of parameters, making it difficult to understand why they approved a transaction, flagged fraud, denied a loan or detected an abnormality in an MRI. This is often called the “black box” problem.

XAI helps open that box by giving teams ways to evaluate whether a model is:

  • Accurate: Is the model making the right prediction?
  • Fair: Is the model treating groups consistently?
  • Reliable: Can teams understand and trust the outcome?
  • Compliant: Can the organization explain decisions when required?

As AI takes on more consequential decisions, understanding why a model reached an answer matters as much as the answer itself. This article covers the main XAI methods, the techniques data and AI teams rely on and how to choose between them.

Why explainable AI methods matter

Decisions in domains such as lending, hiring, healthcare, fraud detection or insurance can have major consequences for individuals. People have a right to know why an application was rejected, a transaction was flagged or a particular treatment was recommended, especially if AI was involved. Lack of transparency isn’t just inconvenient; in many contexts it can be a liability. Here are four practical reasons why XAI methods matter:

  • Trust. Users, regulators and business leaders are more likely to adopt AI when they understand how it works. A model that produces a decision without explanation is harder to accept and harder to defend.
  • Debugging. Developers can find and fix model errors, bias or drift faster when they can see what's driving outputs. Without visibility into model behavior, troubleshooting is guesswork.
  • Compliance. Regulations such as the EU AI Act and GDPR Article 22 increasingly require organizations to explain automated decisions that affect individuals. Article 86 of the EU AI Act specifically requires deployers to provide "clear and meaningful explanations" of AI-driven decisions in certain high-risk contexts.
  • Fairness. XAI methods help surface hidden biases related to race, gender, age or geography before an AI system can affect real people.

Model behavior can also shift over time as real-world data changes. Explainability supports ongoing monitoring.

How XAI methods work

XAI methods generally fall into two categories: models that are explainable by design, and methods that explain a model after the fact. In the first category, the model's structure is simple enough to read directly. Examples include decision trees, linear regressions or rule-based systems.

In the second, the model is too complex to read directly, so a separate technique is applied after training to probe what the model is doing. Example techniques might include running experiments on an already-trained model, approximating the model with something simpler or tracing which inputs had the most influence on a specific output.

In either case, the analysis doesn't change the model; it interrogates it.

The basic workflow looks like this:

  1. Choose what to explain: Pick the model and the specific prediction, output or behavior that needs explanation.
  2. Select the right XAI method: Choose based on model type — tree model, neural network or transformer — and whether you need to explain a single prediction or overall behavior.
  3. Run the explanation: Apply the method to analyze how the model reached its output.
  4. Review the result: Interpret the output, such as a feature importance score, a visual heatmap, a simpler proxy model or a what-if scenario.
  5. Match the explanation to the audience: A developer needs feature scores and technical detail. An auditor needs documentation and traceability. An end user needs plain language.

Key terms: interpretability vs. explainability, global vs. local

Before diving into specific methods, it helps to define four terms that come up frequently in XAI discussions.

| Term | What it means | Example |
|---|---|---|
| Interpretable model | A model that's simple enough for a human to follow on its own, with no extra tool needed. | A decision tree or linear regression whose logic you can read directly. |
| Explainable model | A complex model paired with a separate technique that explains the model’s behavior after it has been trained. | A deep neural network analyzed with SHAP or LIME. |
| Global explanation | Describes how a model behaves overall, across all inputs. | "Income and credit score are the top two drivers across all loan decisions." |
| Local explanation | Describes why a model made one specific prediction. | "This applicant was denied because their debt-to-income ratio was too high." |

What are the main types of explainable AI methods?

XAI methods are typically grouped by how they generate explanations. The three descriptions that follow cover the major techniques currently in use, as well as the trade-offs you have to consider regarding transparency, accuracy and practical fit.

Intrinsically interpretable models

Intrinsically interpretable models are transparent by design. The structure of the model itself reveals how it makes decisions, so no additional tool or technique is required to analyze the model’s logic. Examples include decision trees, which follow a flowchart of yes/no rules you can walk through by hand, and linear and logistic regression, which assign a numerical weight to each input so you can see exactly how each feature contributes to the output. Generalized additive models and rule-based systems work similarly.
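To make "read the weights directly" concrete, here is a minimal sketch of a logistic regression explained from its own coefficients. The feature names, weights and applicant values are hypothetical and chosen only for illustration; they do not come from a real model.

```python
import math

# Hypothetical weights for a toy loan-approval logistic regression.
# Feature names and values are illustrative, not taken from a real model.
WEIGHTS = {"income_k": 0.04, "debt_to_income": -5.0, "years_employed": 0.15}
INTERCEPT = -1.0


def explain_logistic(features):
    """Return the approval probability and each feature's log-odds contribution."""
    contributions = {name: WEIGHTS[name] * value for name, value in features.items()}
    log_odds = INTERCEPT + sum(contributions.values())
    probability = 1.0 / (1.0 + math.exp(-log_odds))
    return probability, contributions


applicant = {"income_k": 60.0, "debt_to_income": 0.45, "years_employed": 4.0}
probability, contributions = explain_logistic(applicant)

print(f"approval probability: {probability:.3f}")
# Largest absolute contribution first, so the main drivers are easy to spot.
for name, value in sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True):
    print(f"{name:>16}: {value:+.2f} log-odds")
```

Because each contribution is simply weight times value, the explanation is the model itself rather than an approximation of it.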

The trade-off here is accuracy. Interpretable models are easy to explain but often less accurate than complex models for hard problems like image recognition or understanding language. However, for highly regulated industries where every decision must be defensible, they're often the default choice.

Post-hoc explanation methods

Post-hoc methods are applied after a model is trained. When most people say XAI, this is what they mean. SHAP, LIME and counterfactual explanations all qualify.

Post-hoc methods are usually the only option for deep learning models, large language models (LLMs) and other complex systems where the underlying math is too complex to read directly. The trade-off, however, is that post-hoc explanations are approximations, not exact internal calculations.

Visualization-based methods

This category refers to methods that produce a visual output showing which part of the input drove the model's decision. Examples include saliency maps and Grad-CAM, which both highlight which pixels in an image mattered most. Attention visualizations highlight which words in a sentence the model focused on. For image and text models, a heatmap or highlight is often more intuitive than a list of numbers, making these methods especially useful when communicating results to nontechnical stakeholders. Like post-hoc methods, visualization outputs should be treated as informative signals, not definitive proof.

Common explainable AI methods

The table below summarizes the most widely used XAI methods, followed by more detailed descriptions of the five techniques practitioners use most frequently.

| Method | Scope | Model-agnostic? | Output | Best for |
|---|---|---|---|---|
| SHAP | Local + global | Yes | Numeric contribution of each feature to a prediction | Tabular models, tree-based models, broad use |
| LIME | Local | Yes | A simple surrogate model explaining one prediction | Quick local explanations across model types |
| LRP | Local | No (needs neural net internals) | Relevance scores traced back through network layers | Deep neural networks, image models |
| Integrated gradients | Local | No (needs model gradients) | Pixel- or token-level attribution | Neural networks, images and text |
| Saliency maps / Grad-CAM | Local | No | Heatmap over an image showing influential regions | Computer vision models |
| Counterfactual explanations | Local | Yes | "What would need to change for a different outcome?" | Decisions affecting individuals (loans, hiring) |
| Partial dependence plots (PDP) | Global | Yes | Chart showing how one feature affects predictions on average | Understanding overall model behavior |
| Permutation feature importance | Global | Yes | Ranked list of which features matter most overall | Model debugging, feature selection |
| Anchors | Local | Yes | "If-then" rules that lock in a prediction | Rule-style explanations for end users |
| TCAV | Global | No | How much a high-level concept influences predictions | Image models, concept-level audits |
| Attention visualization | Local | No (needs transformer internals) | Highlighting which tokens the model focused on | LLMs, transformers, NLP models |

SHapley Additive exPlanations

The XAI method known as SHapley Additive exPlanations (SHAP) assigns each input feature a numeric score showing how much it moved a prediction up or down compared to a baseline. Ask SHAP why a loan was denied and it might tell you that the applicant's debt-to-income ratio reduced the approval probability by 22 percentage points while their employment history added 8. The method is rooted in Shapley values from cooperative game theory, a principled way of distributing credit fairly among contributors, which gives SHAP a stronger theoretical foundation than most alternatives.

Key strengths of SHAP are that it is model-agnostic and it produces both local (single prediction) and global (overall model) explanations. It is also the primary explainability tool supported by Databricks AutoML and MLflow autologging. The trade-off is compute cost. SHAP can be slow on large datasets or complex models, and should be budgeted for accordingly.
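The Shapley idea behind SHAP can be sketched from scratch for a tiny model. The example below enumerates every coalition of features and replaces "absent" features with baseline values. This is an illustrative assumption about how to represent a missing feature, not a description of how the SHAP library works internally. The model, feature names and values are hypothetical.

```python
from itertools import combinations
from math import factorial


def model(x):
    """Toy scoring model with an interaction term (hypothetical)."""
    return 2.0 * x["income"] - 3.0 * x["debt"] + 1.0 * x["income"] * x["tenure"]


def shapley_values(predict, instance, baseline):
    """Exact Shapley values: features outside a coalition take baseline values."""
    names = list(instance)
    n = len(names)

    def value(coalition):
        mixed = {k: (instance[k] if k in coalition else baseline[k]) for k in names}
        return predict(mixed)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {name}) - value(s))
        phi[name] = total
    return phi


instance = {"income": 1.0, "debt": 2.0, "tenure": 3.0}
baseline = {"income": 0.0, "debt": 0.0, "tenure": 0.0}
phi = shapley_values(model, instance, baseline)
for name, value in phi.items():
    print(f"{name:>7}: {value:+.2f}")
# The contributions add up to the gap between this prediction and the baseline.
print(f"{sum(phi.values()):+.2f} vs {model(instance) - model(baseline):+.2f}")
```

In this toy case the interaction term is split evenly between income and tenure, which is the "fair credit" property in action. Exact enumeration doubles in cost with every added feature, which is why the compute trade-off above matters in practice.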

Local Interpretable Model-agnostic Explanations

The Local Interpretable Model-agnostic Explanations (LIME) method takes one prediction you want to understand, then builds a smaller, easy-to-read model that approximates how the original model behaves around that prediction. To do this, LIME tweaks the input slightly, many times over, and observes how the model's output changes. It uses those results to fit a simplified surrogate, typically a linear model, that approximates the original model near that input. The output is a ranked list of features and their directional influence on the prediction.

LIME works on any model type and produces one-off explanations quickly. The trade-off is that the explanations can be unstable. Because LIME uses random perturbations, running it twice on the same prediction can produce meaningfully different results, which can be a real concern in high-stakes contexts or where auditing is required.
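The perturb-and-fit loop can be sketched in a few dozen lines. This is a simplified, LIME-style illustration rather than the LIME library itself: the black-box function, the Gaussian perturbation scale and the kernel width are all assumptions made for the example.

```python
import math
import random


def black_box(x):
    """Hypothetical nonlinear model standing in for the system being explained."""
    return 1.0 / (1.0 + math.exp(-(3.0 * x[0] - 2.0 * x[1] ** 2)))


def solve(a, b):
    """Solve the linear system a @ w = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def lime_style_explanation(predict, instance, n_samples=2000, scale=0.3, width=0.5, seed=0):
    """Fit a proximity-weighted linear surrogate around one instance."""
    rng = random.Random(seed)
    d = len(instance)
    xtx = [[0.0] * (d + 1) for _ in range(d + 1)]
    xty = [0.0] * (d + 1)
    for _ in range(n_samples):
        offset = [rng.gauss(0.0, scale) for _ in range(d)]
        sample = [v + o for v, o in zip(instance, offset)]
        dist2 = sum(o * o for o in offset)
        weight = math.exp(-dist2 / (width ** 2))  # closer samples count more
        row = [1.0] + offset  # intercept plus features centered on the instance
        y = predict(sample)
        # Accumulate the weighted normal equations for least squares.
        for i in range(d + 1):
            xty[i] += weight * row[i] * y
            for j in range(d + 1):
                xtx[i][j] += weight * row[i] * row[j]
    coef = solve(xtx, xty)
    return {"intercept": coef[0], "weights": coef[1:]}


explanation = lime_style_explanation(black_box, [0.5, 1.0])
for i, w in enumerate(explanation["weights"]):
    print(f"feature {i}: {w:+.3f}")
```

The surrogate's weights describe the model only near the chosen input, and changing `seed` shifts them slightly, which is the instability described above.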

Counterfactual explanations

A counterfactual explanation answers a direct question: What would have needed to change for the model to make a different decision? For example, the statement "If your annual income were $10,000 higher, this application would have been approved" is a counterfactual.

This type of XAI resonates with nontechnical audiences because it is actionable. Counterfactuals fit naturally with how people already think about cause and effect, and they give people something to do with the information. They also work well within regulatory frameworks that include a right to an explanation, such as GDPR Article 22. The trade-off is typically practical. A counterfactual is only useful if the suggested change is realistic and within the person's control. "If you were 10 years younger" is not an actionable explanation.
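A minimal sketch of a counterfactual search is shown below. It tries increasingly large changes to one actionable feature at a time and stops at the first change that flips the decision. The scoring rule, threshold, step sizes and applicant are hypothetical, and real counterfactual tools typically search over several features at once.

```python
def approve(applicant):
    """Hypothetical scoring rule standing in for a trained model."""
    score = (applicant["income"] / 1000
             - applicant["debt_to_income_pct"]
             + 2 * applicant["years_employed"])
    return score >= 28


# Only features the applicant can realistically change, each with a step size
# and a maximum number of steps to try. A negative step means "reduce this".
# Age is deliberately left out because it is not actionable.
ACTIONABLE = {
    "income": (1000, 30),
    "debt_to_income_pct": (-1, 30),
}


def single_feature_counterfactuals(predict, applicant, actionable):
    """Find, per actionable feature, the smallest change that flips the decision."""
    original = predict(applicant)
    results = {}
    for feature, (step, max_steps) in actionable.items():
        for k in range(1, max_steps + 1):
            candidate = dict(applicant, **{feature: applicant[feature] + k * step})
            if predict(candidate) != original:
                results[feature] = candidate[feature]
                break
    return results


applicant = {"income": 52000, "debt_to_income_pct": 40, "years_employed": 3, "age": 41}
counterfactuals = single_feature_counterfactuals(approve, applicant, ACTIONABLE)
for feature, new_value in counterfactuals.items():
    print(f"Approved if {feature} were {new_value} instead of {applicant[feature]}")
```

Restricting the search to an explicit list of actionable features is one simple way to avoid suggestions like "be 10 years younger."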

Saliency maps and Grad-CAM

Saliency maps and Grad-CAM are visual XAI techniques for image-based models. They produce a heatmap overlaid on the original image showing which pixels or regions the model focused on when making its prediction. In a medical imaging context, a Grad-CAM output on an X-ray classification might show the model focused on a certain region of the lung, which is exactly what a radiologist needs to see before trusting the result.

These methods are widely used in computer vision, medical imaging, autonomous systems and industrial quality control. However, research has shown that saliency maps can look convincing while not accurately reflecting what the model is doing. Treat them as one signal, not a definitive output.
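The core idea of a saliency map, measuring how sensitive the output is to each pixel, can be sketched without a deep learning framework. The example below uses finite differences on a tiny hypothetical "classifier"; real saliency maps use backpropagated gradients, and Grad-CAM additionally relies on convolutional feature maps, neither of which is shown here.

```python
import math


def classifier_score(image):
    """Hypothetical 'model': responds to brightness in the 2x2 center of a 4x4 image."""
    center = sum(image[r][c] for r in (1, 2) for c in (1, 2))
    return math.tanh(0.5 * center)


def saliency_map(score_fn, image, eps=1e-4):
    """Approximate |d score / d pixel| for every pixel with central differences."""
    saliency = []
    for r, row in enumerate(image):
        out_row = []
        for c, _ in enumerate(row):
            up = [list(x) for x in image]
            down = [list(x) for x in image]
            up[r][c] += eps
            down[r][c] -= eps
            out_row.append(abs(score_fn(up) - score_fn(down)) / (2 * eps))
        saliency.append(out_row)
    return saliency


image = [
    [0.1, 0.0, 0.2, 0.1],
    [0.0, 0.9, 0.8, 0.1],
    [0.2, 0.7, 0.9, 0.0],
    [0.1, 0.0, 0.1, 0.2],
]
heatmap = saliency_map(classifier_score, image)
for row in heatmap:
    print(" ".join(f"{v:.2f}" for v in row))
```

Only the four center pixels receive nonzero saliency, because they are the only ones this toy model reads.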

Attention visualization for LLMs and transformers

Transformer models provide the architecture behind most modern LLMs, and have built-in attention mechanisms that weight how much each input token contributes to each output token. Attention visualizations turn those weights into a highlight map over the text, showing which input words the model relied on most when generating a specific response.

The visualizations are readable without specialized expertise, which makes them one of the more accessible explainability tools for LLMs. However, they aren’t always a faithful explanation of the final output. Research has found that tokens with high attention weights aren't always the ones that actually drove the model's decision.
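As a rough sketch of where those weights come from, the snippet below computes scaled dot-product attention for a single query over four tokens and renders the result as a text bar chart. The query and key vectors are made up for the example; in a real transformer they are learned, and there are many heads and layers to choose from.

```python
import math


def softmax(scores):
    """Numerically stable softmax."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(query, keys):
    """Scaled dot-product attention weights of one query over all keys."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    return softmax(scores)


# Hypothetical 2-dimensional key vectors; real models learn these.
tokens = ["the", "loan", "was", "denied"]
keys = [[0.1, 0.0], [0.9, 0.4], [0.0, 0.2], [1.0, 1.0]]
query = [1.0, 1.0]  # hypothetical query vector for the token being generated

weights = attention_weights(query, keys)
for token, weight in zip(tokens, weights):
    print(f"{token:>8} {'#' * round(weight * 40)} {weight:.2f}")
```

The weights sum to 1 across the input tokens, so the bars show relative emphasis, not proof of which token caused the output.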

How to choose the right XAI method

Choosing the right XAI method depends on the model, the audience and the question you're trying to answer. The following framework can help guide your decision:

  1. Identify the question: Are you explaining one specific prediction (local), overall model behavior (global) or what would need to change to get a different outcome (counterfactual)?
  2. Match to model type: For tree-based models on tabular data, start with SHAP. For images, use Grad-CAM or integrated gradients. For text and LLMs, use attention visualization or LIME. For neural networks broadly, consider LRP or integrated gradients. Also see the table of methods above.
  3. Consider the audience: Developers need feature scores and technical detail. Auditors need traceability and documentation. End users need plain-language counterfactuals or simple rules.
  4. Factor in compute cost: SHAP and integrated gradients can be expensive on large datasets. Budget resources accordingly before committing to a method at production scale.
  5. Combine methods: No single XAI method tells the full story. Pairing a global method (like permutation importance) with a local one (like SHAP or LIME) gives a fuller picture of both overall behavior and individual predictions.
  6. Validate the explanation: Sanity-check outputs against domain knowledge. If an explanation contradicts what subject matter experts expect, investigate before trusting it.
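Permutation feature importance, the global method mentioned in step 5, is simple enough to sketch directly: shuffle one feature's column, re-score the model and record how much accuracy drops. The model and data below are synthetic and hypothetical, with one feature (`zip_digit`) that the model deliberately ignores.

```python
import random


def model(row):
    """Hypothetical trained model: uses income and debt, ignores zip_digit."""
    return 1 if row["income"] - 2 * row["debt"] > 0 else 0


def accuracy(predict, rows, labels):
    """Fraction of rows where the prediction matches the label."""
    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(rows)


def permutation_importance(predict, rows, labels, n_repeats=10, seed=0):
    """Mean drop in accuracy when one feature's column is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    importance = {}
    for feature in rows[0]:
        drops = []
        for _ in range(n_repeats):
            column = [r[feature] for r in rows]
            rng.shuffle(column)
            shuffled = [dict(r, **{feature: v}) for r, v in zip(rows, column)]
            drops.append(baseline - accuracy(predict, shuffled, labels))
        importance[feature] = sum(drops) / n_repeats
    return importance


data_rng = random.Random(42)
rows = [
    {"income": data_rng.uniform(0, 100), "debt": data_rng.uniform(0, 50), "zip_digit": data_rng.randint(0, 9)}
    for _ in range(500)
]
labels = [model(r) for r in rows]  # synthetic labels the toy model predicts perfectly

importance = permutation_importance(model, rows, labels)
for feature, drop in sorted(importance.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{feature:>10}: accuracy drop {drop:.3f}")
```

A feature the model never uses shows zero importance, which doubles as a quick sanity check in the spirit of step 6.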

Limitations of explainable AI methods

XAI methods are powerful, but they're not perfect. Anyone deploying them in production should understand the limitations.

Explanations are approximations, not definitive

Most post-hoc methods, such as SHAP, LIME or saliency maps, approximate what the model is doing rather than revealing the exact internal computation. Two different methods applied to the same prediction may produce different explanations. Treat XAI outputs as evidence, not proof.

Computational cost

As mentioned, methods like SHAP and integrated gradients can be slow on large datasets or complex models. Running full explanations on every prediction in a high-volume production system may not be feasible, and selectively applying them raises questions about representativeness. Budget compute cost as well as modeling costs when considering which XAI method to choose.
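One common way to trade exactness for speed is to estimate Shapley-style attributions by sampling random feature orderings instead of enumerating every coalition. The sketch below illustrates that general idea under stated assumptions: the model is a hypothetical additive function of 20 features, absent features are set to baseline values, and the number of sampled orderings is arbitrary. It is not how any particular library budgets its computation.

```python
import random


def model(x):
    """Hypothetical additive model over many features."""
    return sum((i + 1) * v for i, v in enumerate(x))


def sampled_shapley(predict, instance, baseline, n_permutations=200, seed=0):
    """Estimate Shapley values by averaging marginal contributions
    over random feature orderings."""
    rng = random.Random(seed)
    n = len(instance)
    phi = [0.0] * n
    for _ in range(n_permutations):
        order = list(range(n))
        rng.shuffle(order)
        current = list(baseline)
        previous = predict(current)
        for i in order:
            # Switch feature i from its baseline value to its actual value.
            current[i] = instance[i]
            value = predict(current)
            phi[i] += value - previous
            previous = value
    return [total / n_permutations for total in phi]


n_features = 20
instance = [1.0] * n_features
baseline = [0.0] * n_features
phi = sampled_shapley(model, instance, baseline)

model_calls = 200 * (n_features + 1)
print(f"model calls used by sampling: {model_calls}")
print(f"coalitions for exact enumeration: {2 ** n_features}")
```

For this additive toy model the sampled estimate matches the exact answer; for models with feature interactions it becomes a noisy estimate whose quality depends on how many orderings you can afford.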

Stability and reliability

Some methods, especially LIME, produce different results on repeated runs for the same prediction because the perturbation process relies on random sampling. This instability is a real concern for auditable or regulated contexts. Adversarial attacks can also manipulate post-hoc explanations to obscure actual model behavior. While research into countermeasures is ongoing, such attacks are another reason not to treat explanations as tamper-proof.
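A simple way to quantify this instability is to rerun the same explanation with different random seeds and measure how much each feature's score moves. The sketch below does that for a crude, hypothetical sampling-based explainer (not LIME itself); the model, perturbation scale and sample counts are assumptions for illustration.

```python
import random
import statistics


def black_box(x):
    """Hypothetical model being explained."""
    return 3.0 * x[0] - 2.0 * x[1] ** 2


def perturbation_explanation(predict, instance, n_samples, seed, scale=0.5):
    """Crude stand-in for a sampling-based explainer: per-feature slope
    estimated from random single-feature perturbations."""
    rng = random.Random(seed)
    reference = predict(instance)
    slopes = []
    for i in range(len(instance)):
        estimates = []
        for _ in range(n_samples):
            delta = rng.gauss(0.0, scale) or scale  # guard against a zero step
            moved = list(instance)
            moved[i] += delta
            estimates.append((predict(moved) - reference) / delta)
        slopes.append(statistics.fmean(estimates))
    return slopes


def stability(explain, seeds):
    """Per-feature standard deviation of the explanation across seeds."""
    runs = [explain(seed) for seed in seeds]
    return [statistics.stdev(values) for values in zip(*runs)]


instance = [1.0, 1.0]
seeds = range(10)
few = stability(lambda s: perturbation_explanation(black_box, instance, 20, s), seeds)
many = stability(lambda s: perturbation_explanation(black_box, instance, 2000, s), seeds)
print("spread with 20 samples:  ", [f"{v:.3f}" for v in few])
print("spread with 2000 samples:", [f"{v:.3f}" for v in many])
```

In this sketch the linear feature is stable regardless of seed, while the nonlinear one only settles down as the sample count grows. Recording the seed and sample count alongside each explanation is one practical way to make results reproducible for audits.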

The accuracy-explainability trade-off

The most interpretable models are often the least accurate on complex problems, and the most accurate models are often the hardest to explain. This isn't an engineering problem that can simply be solved; it's a design choice that has to be made deliberately. Organizations need to assess their priorities. Do they want a less accurate but fully transparent model, or a more accurate black-box model with XAI tooling layered on top? The answer should be driven by the importance of the decision. High-stakes domains such as healthcare, lending or criminal justice often warrant prioritizing explainability even at some cost to raw accuracy.

Real-world applications of explainable AI methods

XAI methods are already in production across regulated and high-stakes industries. Here's how different methods tend to be used across industries:

  • Financial services: Banks use SHAP and counterfactuals to explain loan denials, satisfy fair lending regulations and give applicants actionable next steps.
  • Healthcare: Hospitals use saliency maps and Grad-CAM to verify that medical imaging models are focused on clinically relevant regions, not artifacts in the data.
  • Insurance: Underwriting models use global explanations to demonstrate to regulators which factors drive pricing decisions.
  • Fraud detection: Analysts use local explanations to investigate flagged transactions, reduce false positives and communicate findings to compliance teams.
  • HR and hiring: Counterfactual explanations help organizations audit hiring models for bias and meet emerging AI hiring regulations.
  • Customer churn and marketing: Feature importance methods help teams understand which behaviors predict churn so they can act on the underlying drivers, not just the output.

How Databricks supports explainable AI

MLflow, the open source ML lifecycle platform created by Databricks, supports model tracking, versioning and logging explanation artifacts alongside the model itself. For supported model flavors, MLflow autologging can capture SHAP values and feature importance scores, which keeps explanations attached to the specific model version and training run that produced them. Databricks AutoML also auto-generates SHAP plots and Shapley value notebooks for the models it produces, giving teams a starting point for explainability without manual setup.

Unity Catalog provides the governance layer that makes explanations auditable over time. This layer includes model lineage, versioning, centralized access control and audit logs that let teams trace which data trained which model and who accessed it. Together, MLflow and Unity Catalog give data and AI teams the infrastructure to build explainability into the model lifecycle rather than bolting it on at the end.

Frequently asked questions

Are XAI explanations always accurate?

No. Most XAI methods, especially post hoc techniques like SHAP and LIME, produce approximations of model behavior, not exact reconstructions of internal computation. Two methods applied to the same prediction may yield different explanations. Treat XAI outputs as evidence, not conclusive proof. Validating explanations against domain expertise and combining multiple methods gives a more reliable picture.

What is the difference between XAI and interpretable AI?

Interpretable AI refers to models that are transparent by design and whose structure is simple enough to follow directly. Explainable AI is broader and includes interpretable models, as well as complex black-box models paired with separate techniques that explain their behavior after the fact. An interpretable model doesn't need XAI tools, but an explainable model does.

What is the difference between global and local explanations?

A global explanation describes how the model behaves across all inputs, such as which features matter most overall or what patterns drive predictions in general. A local explanation describes why the model made one specific prediction for one specific input. Both types are useful, and the best XAI practice uses global methods to understand the model and local methods to explain individual decisions.

What's the difference between XAI and responsible AI?

Responsible AI is the broader discipline, which covers fairness, safety, privacy, transparency and accountability across the full AI lifecycle. Explainable AI is the set of methods that make model behavior transparent and auditable. So, explainability is necessary for responsible AI but not sufficient on its own. A model can be explainable and still be biased, unsafe or misused.

Can XAI methods be used on generative AI?

Yes, though the techniques differ from those used on traditional ML models. For LLMs and other transformer-based systems, attention visualization is the most widely used approach. LIME can also be applied to text inputs. That said, generative AI presents harder explainability challenges than tabular or image models because outputs are more varied, context windows are longer and the relationship between input tokens and generated text is more complex. Explainability for generative AI is an active area of research, and current methods should be treated as partial signals rather than complete explanations.

Get started with governed XAI on Databricks

XAI methods give data and AI teams the tools to build systems people can understand, trust and audit. Choosing the right method depends on the model, the audience and the importance of the output decision, but the underlying goal is the same: make AI behavior visible enough to act on with confidence.

Learn more about how Databricks supports responsible, governed AI in our enterprise data governance framework or the Databricks AI governance framework.
