How to use AI functions in Databricks SQL together with AI/BI Dashboards
by Alex Lichen and Ina Felsheim
Analyzing data from free-form text reviews can provide critical insights into customer feedback. What do customers think about your business? What particular aspects can be improved?
Traditionally, understanding large volumes of unstructured text data like user reviews has required a significant investment: ML engineers had to train and deploy classification and/or named entity recognition models purpose-built for each task. With AI Functions, however, Databricks makes it possible for anyone comfortable working in SQL to get answers to these questions. No bespoke modeling is required; a few lines of SQL are enough.

In this blog post, we’ll follow a step-by-step guide for a SQL analyst to understand trends in free-form text reviews. We’ll leverage the Databricks Marketplace to import sample data, then use ai_query() to build a review analysis pipeline. We’ll then display and share that analysis using an AI/BI dashboard. The final result is an interactive way for users to understand customer opinions from product reviews.
Let’s walk through the process of mining opinions in Databricks SQL.
For this example, we’ll use a sample Amazon reviews dataset from BrightData, one of the partners on the Databricks Marketplace. To access this data, navigate to the Databricks Marketplace via the left-hand navigation menu.

Databricks Marketplace is an open marketplace for all your data, analytics and AI, powered by the open source Delta Sharing standard. The Databricks Marketplace expands your opportunity to deliver innovation and advance all your analytics and AI initiatives. You can leverage these assets directly within your Databricks environment.
To find the dataset, go to the 'Search for products' search box and type in 'amazon - best seller products'. Click the first tile to see details about the dataset and provider.

From there, click the blue 'Get instant access' button in the top-right corner and accept the terms and conditions.

Instant access will add a shared dataset to Unity Catalog. Within Unity Catalog, click the 'amazon_reviews' table to see some example reviews.

Databricks AI Functions provide an easy way to perform analytical tasks that would be difficult or impossible using traditional SQL alone. The following options are currently supported with out-of-the-box functionality:
These functions are helpful in quickly testing out AI capabilities on top of text data. For instance, we can call ai_analyze_sentiment to find the review sentiment.
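As a minimal sketch, a call to ai_analyze_sentiment could look like the query below. The three-level name `my_catalog.my_schema.amazon_reviews` and the column `review_text` are assumptions made for illustration; substitute the catalog, schema, and column names that appear in Unity Catalog after you get access to the dataset.

```sql
-- Hypothetical names: my_catalog.my_schema and review_text are placeholders,
-- not taken from the dataset itself.
SELECT
  review_text,
  ai_analyze_sentiment(review_text) AS sentiment
FROM my_catalog.my_schema.amazon_reviews
WHERE review_text IS NOT NULL
LIMIT 10;  -- keep the sample small while testing
```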
This will assign a positive, negative, neutral, or mixed sentiment to the text. For example, here is a review that was classified as positive:
For more customized use cases, Databricks recommends using ai_query(). ai_query() lets you specify your own model, prompt, and structured output format. It enables high-throughput performance with no need for complex configuration. Let's use ai_query(). First, we'll take a look at sample data below.
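One simple way to preview a few reviews is a plain SELECT with a LIMIT. As before, the table path and the `review_text` column are assumed names, so adjust them to match the shared table.

```sql
-- Hypothetical names: replace my_catalog.my_schema and review_text
-- with the actual catalog, schema, and column in your workspace.
SELECT review_text
FROM my_catalog.my_schema.amazon_reviews
WHERE review_text IS NOT NULL
LIMIT 5;
```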
Next, we should decide what we want in the output. Ideally, we could extract the critical opinions, record the sentiment, and classify the mention into a relevant category. The response might look like this:
To produce this output, we’ll construct a SQL query that uses ai_query() to call an LLM. We’ll create a custom prompt that instructs the LLM to classify reviews into our desired categories, and we’ll include a sample review and output in the prompt to improve the quality of the response. We'll also specify a responseFormat, which defines the structured output we want returned.
Here's the query:
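A minimal sketch of such a query is shown here under several assumptions: the endpoint name `databricks-meta-llama-3-3-70b-instruct` must be available in your workspace, the table path and `review_text` column are placeholders, and the category list, the sample review, and the field names in the response schema (`mention`, `sentiment`, `category`) are illustrative choices rather than the ones used for the original results. The responseFormat here uses the DDL-style string form; a JSON schema string is the other documented option.

```sql
-- Sketch only: endpoint, table path, column, categories, and schema fields
-- are assumptions chosen for illustration.
SELECT
  review_text,
  ai_query(
    'databricks-meta-llama-3-3-70b-instruct',
    CONCAT(
      'Extract each distinct opinion from the customer review below. ',
      'For every opinion, return the mention, its sentiment ',
      '(positive, negative, neutral, or mixed), and one category from: ',
      'quality, price, shipping, usability, other. ',
      -- One-shot example embedded in the prompt to steer the output
      'Example review: "Great sound, but the box arrived crushed." ',
      'Example output: {"opinions": [',
      '{"mention": "Great sound", "sentiment": "positive", "category": "quality"}, ',
      '{"mention": "box arrived crushed", "sentiment": "negative", "category": "shipping"}]} ',
      'Review: ', review_text
    ),
    -- Structured output: an array of (mention, sentiment, category) records
    responseFormat => 'STRUCT<review_analysis:STRUCT<opinions:ARRAY<STRUCT<mention:STRING, sentiment:STRING, category:STRING>>>>'
  ) AS analysis
FROM my_catalog.my_schema.amazon_reviews
WHERE review_text IS NOT NULL
LIMIT 10;  -- test on a handful of rows before scaling up
```

Keeping the LIMIT small while iterating on the prompt avoids spending model calls on the full table before the output shape looks right.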
Copy and paste this into the SQL editor and run the query to see the results.

Now that we know this works, we can make a few improvements.
To implement these improvements, copy and paste the following code into a new query.