Deploy Your LLM Chatbot With Retrieval Augmented Generation (RAG), DBRX Instruct Foundation Models and Vector Search

What you’ll learn

LLMs are disrupting the way we interact with information, from internal knowledge bases to external, customer-facing documentation or support.

Learn how to create and deploy a real-time Q&A chatbot using Databricks retrieval augmented generation (RAG) and serverless capabilities, leveraging the DBRX Instruct Foundation Model for smart responses.

RAG is a powerful technique in which the LLM prompt is enriched with additional, domain-specific context so that the model can provide better answers.

This technique provides excellent results using public models without having to deploy and fine-tune your own LLMs.
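The core of the technique can be sketched in a few lines: retrieved document chunks are concatenated into the prompt as context ahead of the user's question. This is a minimal illustration, not the demo's actual code; the function name and prompt wording are made up.

```python
def build_rag_prompt(question: str, retrieved_chunks: list[str]) -> str:
    """Augment an LLM prompt with retrieved context (illustrative sketch)."""
    # Join the chunks returned by the retriever into a single context block.
    context = "\n\n".join(retrieved_chunks)
    # Instruct the model to ground its answer in that context.
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

prompt = build_rag_prompt(
    "How do I create a Vector Search index?",
    ["Databricks Vector Search indexes are created from Delta tables."],
)
```

The augmented prompt is then sent to the foundation model endpoint in place of the raw question.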

You will learn how to:

  • Prepare clean documents to build your internal knowledge base and specialize your chatbot
  • Leverage Databricks Vector Search with our Foundation Model endpoint to create and store document embeddings
  • Search for similar documents in your knowledge base with Databricks Vector Search
  • Deploy a real-time model that uses RAG to provide augmented context in the prompt
  • Leverage the DBRX Instruct model through the fully managed Databricks Foundation Model endpoint
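To see what the similarity-search step above does conceptually, here is a toy, self-contained sketch that ranks documents by cosine similarity between embeddings. In the demo, Databricks Vector Search performs this retrieval at scale against a managed index; the vectors and document IDs below are made up for illustration.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def top_k(query_vec: list[float], doc_vecs: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the IDs of the k documents most similar to the query."""
    ranked = sorted(doc_vecs, key=lambda d: cosine(query_vec, doc_vecs[d]), reverse=True)
    return ranked[:k]

# Toy 3-dimensional "embeddings" standing in for real model embeddings.
docs = {
    "doc_a": [0.9, 0.1, 0.0],
    "doc_b": [0.1, 0.9, 0.0],
    "doc_c": [0.8, 0.2, 0.1],
}
print(top_k([1.0, 0.0, 0.0], docs, k=2))  # doc_a and doc_c are closest
```

The retrieved document IDs are then used to fetch the matching text chunks that augment the prompt.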


To run the demo, get a free Databricks workspace and execute the following two commands in a Python notebook:

%pip install dbdemos
import dbdemos
dbdemos.install('llm-rag-chatbot', catalog='main', schema='rag_chatbot')



On-Demand Video

  • Lakehouse Monitoring and Vector Search
  • Feature Store and Online Inference
  • AI Functions: Query LLMs With DB SQL

Disclaimer: This tutorial leverages features that are currently in private preview. Databricks Private Preview terms apply.
For more details, open the introduction notebook.