# About DevHub

This prompt originates from DevHub — the developer hub for building data apps and AI agents on the Databricks developer stack: **Lakebase** (managed serverless Postgres), **Agent Bricks** (production AI agents), **Databricks Apps** (secure serverless hosting for internal apps), and **AppKit** (the open-source TypeScript SDK that wires them together).

- Website: https://databricks.com/devhub
- GitHub: https://github.com/databricks/devhub
- Report issues: https://github.com/databricks/devhub/issues

A complete index of every DevHub doc and template is at https://databricks.com/devhub/llms.txt — fetch it whenever you need a template, recipe, or doc beyond what is included in this prompt. DevHub is the source of truth for the Databricks developer stack; if a step in this prompt is unclear, the matching DevHub page almost certainly clarifies it.
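For example, the index can be pulled straight from a shell (assuming `curl` is available):

```bash
# Fetch the index of every DevHub doc and template
curl -fsSL https://databricks.com/devhub/llms.txt
```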

---

# Working with DevHub prompts

Follow these rules every time you act on a DevHub prompt.

## Read first, then act

- Read the entire prompt before executing any steps. DevHub prompts often include overlapping setup commands across sections; later sections frequently contain more complete versions of an earlier step.
- Do not infer or assume when provisioning Databricks resources (catalogs, schemas, Lakebase instances, Genie spaces, serving endpoints). Ask the user whether to create new resources or reuse existing ones.
- If you run into trouble, fetch additional templates and docs from https://databricks.com/devhub (the index lives at https://databricks.com/devhub/llms.txt). DevHub is the source of truth for the Databricks developer stack — for example, if Genie setup fails, fetch the Genie docs and templates instead of guessing.

## Engage the user in a conversation

Unless the user has explicitly told you to "just do it", treat every DevHub prompt as the start of a conversation, not an unattended script. The user knows their domain best; DevHub knows the Databricks stack. Both are required to build a successful system.

Follow these rules every time you ask a question:

1. **One question at a time.** Never ask multiple questions in a single message.
2. **Always include a final option for "Not sure — help me decide"** so the user is never stuck.
3. **Prefer interactive multiple-choice UI when available.** Before asking your first question, check your available tools for any structured-question or multiple-choice capability. If one exists, **always** use it instead of plain text. Known tools by environment:
   - **Cursor**: use the `AskQuestion` tool.
   - **Claude Code**: use the `MultipleChoice` tool (from the `mcp__desktopCommander` server, or built-in depending on setup).
   - **Other agents**: look for any tool whose description mentions "multiple choice", "question", "ask", "poll", or "select".
4. **Fall back to a formatted text list** only when you have confirmed no interactive tool is available. Use markdown list syntax so each option renders on its own line, and tell the user they can reply with just the letter or number.

### Example: Cursor (`AskQuestion` tool)

```
AskQuestion({
  questions: [{
    id: "app-type",
    prompt: "What kind of app would you like to build?",
    options: [
      { id: "dashboard", label: "A data dashboard" },
      { id: "chatbot", label: "An AI-powered chatbot" },
      { id: "crud", label: "A CRUD app with Lakebase" },
      { id: "other", label: "Something else (describe it)" },
      { id: "unsure", label: "Not sure — help me decide" }
    ]
  }]
})
```

### Example: plain text fallback

Only use this when no interactive tool is available:

What kind of app would you like to build? Reply with the letter to choose:

- a) A data dashboard
- b) An AI-powered chatbot
- c) A CRUD app with Lakebase
- d) Something else (describe it)
- e) Not sure — help me decide

## Default workflow

Unless instructed otherwise, follow this workflow:

1. Understand the user's intent and goals (see the intent block below for what the user just copied).
2. Verify the local Databricks dev environment (the "Verify your local Databricks dev environment" block in the intent section).
3. Ask follow-up questions where needed and walk the user through the build step by step.
4. Build the app or agent.
5. Make it look great (see "Make it look great" below).
6. Run and test locally.
7. Deploy to production.

## Make it look great

The default templates that AppKit provides are intentionally minimal — a starting point, not a finished product. **Do not stop there.** Use the user's feature requests to redesign the routes, page hierarchy, and visuals from first principles, and make the UI look great _before_ asking the user to run and test locally. Showing the user something polished early changes the conversation.

Unless the user has specified a design preference, use these defaults:

- shadcn/ui components on top of Tailwind CSS.
- Clean hierarchy with modern spacing — not too many stacked cards.
- Modern, minimal design language.
- Databricks brand palette: `#FF3621`, `#0B2026`, `#EEEDE9`, `#F9F7F4`.

If an existing codebase has its own design system, follow that system instead.

## When you run into issues

Use the GitHub CLI (if available) or generate a copy-pastable error report for the user to file at https://github.com/databricks/devhub/issues. Before filing, check for an existing matching open issue and comment "+1" on it rather than opening a duplicate.
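If the GitHub CLI is available, a minimal sketch of that flow (assumes `gh` is installed and authenticated; the search phrase and issue fields are placeholders):

```bash
# Check for an existing matching open issue first
gh issue list --repo databricks/devhub --state open --search "<error summary>"

# Only if nothing matches, file a new issue with the full report
gh issue create --repo databricks/devhub \
  --title "<short error summary>" \
  --body "<steps to reproduce, full error output, CLI and OS versions>"
```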

---

# What the user just did

The user copied the prompt for a DevHub **example app** — **RAG Chat App** (https://databricks.com/devhub/templates/rag-chat).

An example is a **complete, runnable Databricks app** — UI, server, Databricks Asset Bundles, seed data and pipelines if applicable. Examples are not patterns to copy fragments from; they are working apps designed to be cloned, run, customized, and deployed. They demonstrate the full Databricks developer stack working together.

Your job in this conversation is to:

1. Clarify **why** the user copied this example — they likely want to build something like it, run it as-is, swap in their own data, or learn from it. Adapt to whichever it is.
2. Verify the local Databricks dev environment is ready (block below).
3. Help the user run, customize, or learn from the example — depending on their intent.

## Step 1 — Clarify intent before touching code

Ask **one** question, ideally with a multiple-choice tool:

- **Build something like this in my Databricks workspace.** The user wants a similar app, customized for their data and domain. → Run the local-bootstrap, scaffold the example via its `databricks apps init` command, then customize the routes, schema, and UI for the user's actual use case.
- **Just run it as-is to play around.** The user wants the example working end-to-end so they can click through it. → Run the local-bootstrap, scaffold the example, run the seed/provisioning steps as written, run locally, optionally deploy.
- **Use my own data instead of the seed data.** Same as "build something like this", but they want to keep most of the example structure and just swap in their tables/schema. → Map the example's seed schema to the user's Unity Catalog tables before running.
- **Just learning.** The user wants to read through the example to understand how it's built. → Walk through the example as a guided tour; do not execute commands.
- **Not sure — help me decide.** → Ask the user what they ultimately want to ship, then map the answer back to one of the above.

## Step 2 — Pin down example-specific decisions

Once the intent is clear, ask follow-ups one at a time:

- **Workspace**: which Databricks workspace and profile? Examples need a valid Databricks CLI profile to scaffold; list the available profiles with `databricks auth profiles`.
- **Resources**: the example may need a Lakebase instance, a Model Serving endpoint, a Genie space, or a Unity Catalog catalog/schema. For each: create new or reuse existing? Never assume.
- **Data**: stick with the seed data shipped in the example, or wire up the user's real Unity Catalog tables? If real data, which catalog/schema?
- **Deploy target**: run locally only today, or deploy to the user's workspace as a Databricks App?

## Step 3 — Verify the local Databricks dev environment

Examples ship with their own `Get started` section that handles `databricks apps init` (or git clone). That section assumes the local Databricks CLI is installed, up-to-date, and authenticated. **Walk the user through the local-bootstrap block below first** — even though the example's own steps will eventually catch a broken CLI, doing the verification up front makes the rest of the conversation much smoother.

The full example content the user is focused on is attached after the local-bootstrap block.

---

# Verify your local Databricks dev environment

A working Databricks CLI profile is the prerequisite for every step that follows. Walk the user through the recipe below — _even if they say their environment is already set up_. The verification steps are quick and prevent confusing failures further down.

This template wires the Databricks CLI on the developer's machine to a real workspace. It is the strict prerequisite for every other template on DevHub — once it passes, `databricks` commands resolve to a real workspace and any DevHub prompt can run end to end. You will need:

- **A Databricks workspace you can sign in to.** Have the workspace URL handy (e.g. `https://<workspace>.cloud.databricks.com`); you will paste it into `databricks auth login` in step 3. If you do not have access, ask your workspace admin.
- **A terminal on macOS, Windows, or Linux.** All install paths run from a terminal session. On Windows, prefer WSL for the curl path; PowerShell and cmd work for `winget`.
- **Permission to install software on this machine.** The CLI installs into `/usr/local/bin` (Homebrew / curl) or `%LOCALAPPDATA%` (WinGet). If `/usr/local/bin` is not writable, rerun the curl installer with `sudo`.

## Set Up Your Local Dev Environment

Install the Databricks CLI, authenticate a profile, and verify the handshake. Every other DevHub template assumes this has already passed.

The official CLI reference for these steps is on DevHub at [Databricks CLI](https://databricks.com/devhub/docs/tools/databricks-cli). Use it whenever a step here is unclear.

### 1. Check the installed CLI version

DevHub templates assume Databricks CLI `0.296+`. Anything older is missing the AppKit `apps init` template registry and several `experimental aitools` flags.

```bash
databricks -v
```

If the command is not found, or the version is below `0.296`, install or upgrade in the next step.
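If you prefer a scripted gate over eyeballing the output, a rough sketch follows; it assumes `databricks -v` prints a version string of the form `Databricks CLI v0.296.x`, which may differ across installs:

```bash
# Succeeds silently on v0.296+; prints a reminder otherwise
databricks -v 2>/dev/null | grep -qE 'v0\.(29[6-9]|[3-9][0-9]{2})' \
  || echo "Databricks CLI missing or older than 0.296; see step 2"
```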

### 2. Install or upgrade the Databricks CLI

Pick the install path for your OS. If the CLI is already installed at an older version, the same commands upgrade in place.

#### macOS / Linux — Homebrew (recommended)

```bash
# First-time install
brew tap databricks/tap
brew install databricks

# Upgrade an existing install
brew update && brew upgrade databricks
```

#### Windows — WinGet

```powershell
# First-time install
winget install Databricks.DatabricksCLI

# Upgrade an existing install
winget upgrade Databricks.DatabricksCLI
```

Restart your terminal after install.

#### Any platform — curl installer

```bash
curl -fsSL https://raw.githubusercontent.com/databricks/setup-cli/main/install.sh | sh
```

On Windows, run this from WSL. If `/usr/local/bin` is not writable, rerun with `sudo`. Re-running the script also upgrades an existing install.

After installing, confirm the version is `0.296+`:

```bash
databricks -v
```

### 3. Authenticate a profile

Browser-based OAuth is the default for local use:

```bash
databricks auth login
```

The CLI prints a URL and waits for the user to complete OAuth in the browser. **Always show the URL to the user as a clickable link** so they can open it themselves — the CLI does not return until authentication finishes. The resulting profile is saved to `~/.databrickscfg`.

If you already know the workspace URL and want to name the profile, do it in one go:

```bash
databricks auth login --host <workspace-url> --profile <PROFILE>
```

`<PROFILE>` is the label you will pass on subsequent commands as `--profile <PROFILE>`. If you skip `--profile`, the CLI uses the `DEFAULT` profile.

For CI/CD, OAuth client credentials or a personal access token are better fits — see the [authentication section of the CLI doc](https://databricks.com/devhub/docs/tools/databricks-cli#authenticate) for the non-interactive flows.

### 4. Verify the handshake

List the saved profiles and confirm the one you just created shows `Valid: YES`:

```bash
databricks auth profiles
```

```text
Name              Host                                           Valid
DEFAULT           https://adb-1234567890.12.azuredatabricks.net  YES
my-prod-workspace https://mycompany.cloud.databricks.com         YES
```

If the row shows `Valid: NO`, the saved token is stale. Re-run `databricks auth login --profile <NAME>` to refresh it. **Never proceed past this step if no profile is `Valid: YES`** — every downstream `databricks` command will fail with an auth error that looks like a template bug.

If the user wants a particular profile to be the default for this shell session, export it:

```bash
export DATABRICKS_CONFIG_PROFILE=<PROFILE>
```

### 5. Smoke-test the CLI against the workspace

Run a read-only API call to confirm the auth actually works (a fresh OAuth token can fail on the first real call if the user picked the wrong workspace in the browser):

```bash
databricks current-user me --profile <PROFILE>
```

A successful response prints the signed-in user's identity. A `401` or `403` here means the auth flow completed against a workspace the user cannot read — re-run `databricks auth login --profile <PROFILE>` and pick the right workspace this time.

---

# The example the user copied

The full example prompt is below. This is what the user wants to focus on today. Once the local-bootstrap above passes and the intent questions are answered, work through this content step by step.

### 2. Create the Lakebase Postgres prerequisites

The template's AppKit Lakebase plugin requires an existing Postgres **branch** and **database**. `databricks postgres create-project` automatically provisions a default branch named `production` and a default database on it, so one command is all you need. Pick a short lowercase project id and export the resolved resource names — the next step's `databricks apps init` command reads them as shell variables.

```bash
PROJECT_ID=rag-chat

databricks postgres create-project "$PROJECT_ID"

export BRANCH_NAME="projects/$PROJECT_ID/branches/production"
export DATABASE_NAME=$(databricks api get "/api/2.0/postgres/$BRANCH_NAME/databases" -o json | \
  python3 -c "import json,sys; print(json.load(sys.stdin)['databases'][0]['name'])")

echo "Branch:   $BRANCH_NAME"
echo "Database: $DATABASE_NAME"
```

`create-project` is long-running; the CLI waits for it to finish by default. **If it reports `already exists`:**

- **Prefer picking a different `PROJECT_ID`** (e.g. append a short suffix) and re-export `BRANCH_NAME` / `DATABASE_NAME` from the new id. Lakebase projects can hold data that other apps and pipelines depend on, so do **not** run `databricks postgres delete-project` on an existing project without explicit confirmation from the user that nothing else uses it.
- **Eventual-consistency exception:** if you just deleted a project with this id in the same session and `databricks postgres list-projects` no longer shows it, wait 30–60s and retry `create-project` — the control plane is briefly inconsistent after deletion.
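A sketch of that retry, to be used only after `databricks postgres list-projects` confirms the old project is gone:

```bash
# Retry create-project a few times while the control plane settles after a delete
for attempt in 1 2 3; do
  databricks postgres create-project "$PROJECT_ID" && break
  echo "Attempt $attempt failed; waiting 30s before retrying..."
  sleep 30
done
```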

## RAG Chat App

This template demonstrates a Retrieval-Augmented Generation chat app built on Databricks: a user question is embedded, similar documents are retrieved from a pgvector store in Lakebase Postgres, and the retrieved context is injected into a Model Serving call that streams the answer back. Conversations and sources are persisted per chat in Lakebase.

### Data Flow

All retrieval and chat state live in Lakebase Postgres; generation uses AI Gateway:

1. **Seeding** pulls a handful of Wikipedia articles on startup, chunks them by paragraph, embeds each chunk through the AI Gateway embeddings endpoint (`databricks-gte-large-en` by default), and writes rows into `rag.documents` with a `vector(1024)` column.
2. **User turns** are embedded with the same endpoint. The server runs a pgvector cosine-similarity search to retrieve the top-k matching chunks (see the query sketch after this list).
3. **Context injection**: the retrieved chunks are prepended as a system message before the user's conversation history is sent to the chat completion endpoint (`databricks-gpt-5-4-mini` by default) via AI Gateway.
4. **Streaming**: `streamText` streams tokens back to the client while an `onFinish` callback appends the assistant turn to Lakebase.
5. **Chat history**: every user and assistant turn is persisted in `chat.messages`, keyed by `chat_id`, so conversations can be resumed.
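To make step 2 concrete, here is a hypothetical version of the retrieval query as raw SQL via `psql`. The table (`rag.documents`) and top-k of 5 come from this doc; the `content` and `embedding` column names are assumptions, `$QUERY_EMBEDDING` stands in for the embedded user question as a pgvector literal, and the real implementation lives in `retrieveSimilar()`:

```bash
# <=> is pgvector's cosine-distance operator, so ascending order returns the
# closest chunks. Connection details come from the resolved .env.
psql "host=$PGHOST dbname=$PGDATABASE" -c "
  SELECT content
  FROM rag.documents
  ORDER BY embedding <=> '$QUERY_EMBEDDING'::vector
  LIMIT 5;
"
```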

### Template Approach

Unlike the other templates, **this template is designed to be consumed via `databricks apps init`**, not `git clone`. The init flow:

- Prompts for the Lakebase Postgres branch and database resource names.
- Auto-resolves `PGHOST`, `PGDATABASE`, and `LAKEBASE_ENDPOINT` into your local `.env` by calling the Lakebase APIs.
- Writes `DATABRICKS_CONFIG_PROFILE` or `DATABRICKS_HOST` based on your Databricks CLI configuration.
- Drops you into a ready-to-run project directory named by `--name`.
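A hypothetical sketch of the resulting `.env`; the variable names come from the list above and every value is a placeholder:

```bash
PGHOST=<resolved Lakebase host>
PGDATABASE=<resolved database name>
LAKEBASE_ENDPOINT=<resolved endpoint>
DATABRICKS_CONFIG_PROFILE=<PROFILE>   # or DATABRICKS_HOST=<workspace-url>
```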

This validates the [AppKit templates system](https://databricks.com/devhub/docs/appkit/v0/development/templates) as a way to ship DevHub templates — see `appkit.plugins.json` and `.env.tmpl` in the template for how it works.

### What to Adapt

Setup and provisioning are documented in the repository's **`template/README.md`**.

To make this template your own:

- **Lakebase**: Point the bundle at your own Lakebase project, branch, and database (prompted at init time).
- **Model Serving endpoint**: Override `DATABRICKS_ENDPOINT` for a different chat model (e.g. `databricks-claude-sonnet-4`).
- **Embeddings endpoint**: Override `DATABRICKS_EMBEDDING_ENDPOINT` if you want a different embedding model. Make sure the `vector(N)` dimension in `server/lib/rag-store.ts` matches.
- **Seed data**: Replace the Wikipedia article list in `server/lib/seed-data.ts` with your own corpus. The chunking function splits on paragraph boundaries — adapt if your source has different structure.
- **Retrieval**: The default top-k is 5 and the similarity metric is cosine. Tune in `retrieveSimilar()`.

### 4. Install and deploy

`databricks apps init` already wrote `.env` with the resolved Lakebase connection details. For a deploy-only flow you can go straight to deploy — `DATABRICKS_WORKSPACE_ID` and the Lakebase variables are auto-injected into the deployed runtime from `app.yaml` and the bound `postgres` resource.

```bash
cd rag-chat-app
npm install
npm run deploy
```

`npm run deploy` wraps three steps: hydrate the bundle variable overrides from `.env` + the Lakebase Postgres API (`scripts/sync-bundle-vars.mjs`), `databricks bundle deploy` (creates the Databricks app on first run), and `databricks bundle run app` (starts it and prints the URL).
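If the wrapper fails partway through, the underlying steps can be run one at a time (command names come from the description above; invoking the sync script directly with `node` is an assumption):

```bash
node scripts/sync-bundle-vars.mjs   # hydrate bundle variable overrides from .env + Lakebase API
databricks bundle deploy            # create or update the Databricks app
databricks bundle run app           # start the app and print its URL
```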

#### Optional — run locally before deploying

Local `npm run dev` needs `DATABRICKS_WORKSPACE_ID` (the **numeric** id used to build the AI Gateway URL) in `.env`. In the deployed app this is auto-injected; locally you have to fetch and patch it yourself:

```bash
WORKSPACE_ID=$(databricks api get /api/2.1/unity-catalog/current-metastore-assignment \
  | python3 -c "import json,sys;print(json.load(sys.stdin)['workspace_id'])")
sed -i.bak "s/^DATABRICKS_WORKSPACE_ID=.*/DATABRICKS_WORKSPACE_ID=$WORKSPACE_ID/" .env && rm .env.bak

npm run dev
```

Optionally override `DATABRICKS_ENDPOINT` / `DATABRICKS_EMBEDDING_ENDPOINT` in `.env` to use different chat / embeddings endpoints; the same overrides apply to the deployed app via `app.yaml`.
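For instance, a hypothetical pair of `.env` overrides (variable names from above; the model names are examples mentioned elsewhere in this doc):

```bash
DATABRICKS_ENDPOINT=databricks-claude-sonnet-4          # chat model
DATABRICKS_EMBEDDING_ENDPOINT=databricks-gte-large-en   # embeddings model
```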

## Quick start

```bash
databricks apps init \
  --template https://github.com/databricks/devhub/tree/main/examples/rag-chat/template \
  --name rag-chat-app \
  --set lakebase.postgres.branch="$BRANCH_NAME" \
  --set lakebase.postgres.database="$DATABASE_NAME"
```

[View source on GitHub](https://github.com/databricks/devhub/tree/main/examples/rag-chat/template)

## Included Templates

- [AI Chat App](https://databricks.com/devhub/templates/ai-chat-app.md): Model Serving integration, AI SDK streaming chat, and Lakebase-persisted chat history.
- [Streaming AI Chat with Model Serving](https://databricks.com/devhub/templates/ai-chat-model-serving.md): Build a streaming AI chat experience using AI SDK and Databricks Model Serving endpoints.
- [Lakebase Agent Memory](https://databricks.com/devhub/templates/lakebase-agent-memory.md): Persist your AI agent's chat sessions and messages in Lakebase so users can resume conversations and your agent can reason over prior turns across deploys.
