---
sidebar_position: 4
---

# Lakebase plugin

Provides a PostgreSQL connection pool for Databricks Lakebase Autoscaling with automatic OAuth token refresh.

**Key features:**
- Standard `pg.Pool` compatible with any PostgreSQL library or ORM
- Automatic OAuth token refresh (1-hour tokens, 2-minute refresh buffer)
- Token caching to minimize API calls
- Built-in OpenTelemetry instrumentation (query duration, pool connections, token refresh)
- AppKit logger configured by default for query and connection events
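
The 1-hour/2-minute numbers above translate into a simple rule: a cached token is reused until it comes within two minutes of expiry, then refreshed. A minimal sketch of that rule (a hypothetical helper for illustration, not the plugin's internal code):

```ts
// Token lifetime and refresh buffer as documented above.
const TOKEN_TTL_MS = 60 * 60 * 1000; // OAuth tokens live 1 hour
const REFRESH_BUFFER_MS = 2 * 60 * 1000; // refresh 2 minutes before expiry

// Returns true once the current time falls inside the refresh buffer.
function needsRefresh(issuedAtMs: number, nowMs: number = Date.now()): boolean {
  const expiresAtMs = issuedAtMs + TOKEN_TTL_MS;
  return nowMs >= expiresAtMs - REFRESH_BUFFER_MS;
}
```

Reusing the cached token until this check trips is what keeps token-minting API calls to a minimum.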

## Getting started with the Lakebase plugin {#getting-started-with-the-lakebase}

The easiest way to get started with the Lakebase plugin is to use the Databricks CLI to create a new Databricks app with AppKit and the Lakebase plugin installed.

### Prerequisites

- [Node.js](https://nodejs.org) v22+ environment with `npm`
- Databricks CLI (v0.295.0 or higher): install and configure it according to the [official tutorial](https://docs.databricks.com/aws/en/dev-tools/cli/tutorial).
- A new Databricks app with AppKit installed. See [Bootstrap a new Databricks app](../index.md#quick-start-options) for more details.

### Steps

1. First, create a new Lakebase Postgres Autoscaling project by following the [Get started documentation](https://docs.databricks.com/aws/en/oltp/projects/get-started).
1. To add the Lakebase plugin to your project, run the `databricks apps init` command and interactively select the **Lakebase** plugin. The CLI will guide you through picking a Lakebase project, branch, and database.
    - When asked, select **Yes** to deploy the app to Databricks Apps right after its creation.

## Basic usage

```ts
import { createApp, lakebase, server } from "@databricks/appkit";

await createApp({
  plugins: [server(), lakebase()],
});
```

## Accessing the pool

After initialization, access Lakebase through the `AppKit.lakebase` object:

```ts
const AppKit = await createApp({
  plugins: [server(), lakebase()],
});

await AppKit.lakebase.query(`CREATE SCHEMA IF NOT EXISTS app`);

await AppKit.lakebase.query(`CREATE TABLE IF NOT EXISTS app.orders (
  id SERIAL PRIMARY KEY,
  user_id VARCHAR(255) NOT NULL,
  amount DECIMAL(10, 2) NOT NULL,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
)`);

const result = await AppKit.lakebase.query(
  "SELECT * FROM app.orders WHERE user_id = $1",
  [userId],
);

// Raw pg.Pool (for ORMs or advanced usage)
const pool = AppKit.lakebase.pool;

// ORM-ready config objects
const ormConfig = AppKit.lakebase.getOrmConfig();  // { host, port, database, ... }
const pgConfig = AppKit.lakebase.getPgConfig();    // pg.PoolConfig
```
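
Because `AppKit.lakebase.pool` is a standard `pg.Pool`, the usual node-postgres patterns apply; for example, multi-statement transactions need a single client checked out from the pool. A sketch under that assumption (`withTransaction` is a hypothetical helper, not part of the plugin API; the interfaces below only name the subset of `pg.Pool`/`pg.PoolClient` the helper touches):

```ts
// Minimal shapes we rely on - they match pg.PoolClient and pg.Pool,
// so AppKit.lakebase.pool can be passed in directly.
interface TxClient {
  query(text: string, values?: unknown[]): Promise<unknown>;
  release(): void;
}
interface TxPool {
  connect(): Promise<TxClient>;
}

// Run a callback on one client inside BEGIN/COMMIT,
// rolling back on error and always releasing the client.
async function withTransaction<T>(
  pool: TxPool,
  fn: (client: TxClient) => Promise<T>,
): Promise<T> {
  const client = await pool.connect();
  try {
    await client.query("BEGIN");
    const result = await fn(client);
    await client.query("COMMIT");
    return result;
  } catch (err) {
    await client.query("ROLLBACK");
    throw err;
  } finally {
    client.release();
  }
}
```

Usage would look like `await withTransaction(AppKit.lakebase.pool, async (client) => { await client.query("INSERT INTO app.orders (user_id, amount) VALUES ($1, $2)", [userId, 10]); })`.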

## Configuration

### Environment variables

The required environment variables are:

| Variable | Description |
|---|---|
| `LAKEBASE_ENDPOINT` | Endpoint resource path (e.g. `projects/.../branches/.../endpoints/...`) |
| `PGHOST` | Lakebase host (auto-injected in production by the `postgres` Databricks Apps resource) |
| `PGDATABASE` | Database name (auto-injected in production by the `postgres` Databricks Apps resource) |
| `PGSSLMODE` | TLS mode - set to `require` (auto-injected in production by the `postgres` Databricks Apps resource) |
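
When supplying these variables manually, a quick startup check can surface a missing value with a clear message instead of a failed first query. A minimal sketch (the helper name is ours, not part of AppKit, which performs its own validation):

```ts
// Return the names of any required Lakebase variables missing from `env`.
function missingLakebaseEnv(
  env: Record<string, string | undefined>,
): string[] {
  const required = ["LAKEBASE_ENDPOINT", "PGHOST", "PGDATABASE", "PGSSLMODE"];
  return required.filter((name) => !env[name]);
}
```

At startup you might call `missingLakebaseEnv(process.env)` and throw if the result is non-empty.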

When deployed to Databricks Apps with a `postgres` database resource configured, `PGHOST`, `PGDATABASE`, `PGSSLMODE`, `PGUSER`, `PGPORT`, and `PGAPPNAME` are automatically injected by the platform. Only `LAKEBASE_ENDPOINT` must be set explicitly:

```yaml
env:
  - name: LAKEBASE_ENDPOINT
    valueFrom: postgres
```

For local development, the `.env` file is automatically generated by `databricks apps init` with the correct values for your Lakebase project.
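
The generated file follows the variable shapes described above; illustratively, with placeholder values only (`databricks apps init` fills in the real ones for your project):

```bash
LAKEBASE_ENDPOINT=projects/<project-id>/branches/<branch-id>/endpoints/<endpoint-id>
PGHOST=<your-lakebase-host>
PGDATABASE=<your-database>
PGSSLMODE=require
```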

For the full configuration reference (SSL, pool size, timeouts, logging, ORM examples), see the [`@databricks/lakebase` README](https://github.com/databricks/appkit/blob/main/packages/lakebase/README.md).

### Pool configuration

Pass a `pool` object to override any defaults:

```ts
await createApp({
  plugins: [
    lakebase({
      pool: {
        max: 10,                       // Max pool connections (default: 10)
        connectionTimeoutMillis: 5000, // Connection timeout in ms (default: 10000)
        idleTimeoutMillis: 30000,      // Idle connection timeout in ms (default: 30000)
      },
    }),
  ],
});
```
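
Options you omit keep their documented defaults; conceptually the overrides are merged over the defaults, roughly like this (an illustrative sketch, not the plugin's actual implementation):

```ts
// Documented defaults for the pool options shown above.
const POOL_DEFAULTS = {
  max: 10,
  connectionTimeoutMillis: 10_000,
  idleTimeoutMillis: 30_000,
};

// Spread order matters: later keys win, so any override replaces its default.
function resolvePoolConfig(overrides: Partial<typeof POOL_DEFAULTS>) {
  return { ...POOL_DEFAULTS, ...overrides };
}
```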

## Database permissions

When you create the app with the Lakebase resource using the [Getting started](#getting-started-with-the-lakebase) guide, the Service Principal is automatically granted `CONNECT_AND_CREATE` permission on the `postgres` resource. This lets the Service Principal connect to the database and create new objects, but **not access any existing schemas or tables.**

### Local development

To develop locally against a deployed Lakebase database:

1. **Deploy the app first.** The Service Principal creates the database schema and tables on first deploy. Apps generated from `databricks apps init` handle this automatically - they check if tables exist on startup and skip creation if they do.

2. **Grant `databricks_superuser` via the Lakebase UI:**
   1. Open the Lakebase Autoscaling UI and navigate to your project's **Branch Overview** page.
   2. Click **Add role** (or **Edit role** if your OAuth role already exists).
   3. Select your Databricks identity as the principal and check the **`databricks_superuser`** system role.

3. **Run locally** - your Databricks user identity (email) is used for OAuth authentication. The `databricks_superuser` role gives full **DML access** (read/write data) but **not DDL** (creating schemas or tables) - that's why deploying first matters (see note below).

For other users, use the same **Add role** flow in the Lakebase UI to create an OAuth role with `databricks_superuser` for each user.

:::tip
[Postgres password authentication](https://docs.databricks.com/aws/en/oltp/projects/authentication#overview) is a simpler alternative that avoids OAuth role permission complexity. However, it requires you to set up a password for the user in the **Branch Overview** page in the Lakebase Autoscaling UI.
:::

:::info[Why deploy first?]
When the app is deployed, the Service Principal creates schemas and tables and becomes their owner. A `databricks_superuser` has full **DML access** (SELECT, INSERT, UPDATE, DELETE) to these objects, but **cannot run DDL** (CREATE SCHEMA, CREATE TABLE) on schemas owned by the Service Principal. Deploying first ensures all objects exist before local development begins.
:::

### Fine-grained permissions

For most use cases, `databricks_superuser` is sufficient. If you need schema-level grants instead, refer to the official documentation:

- [Manage database permissions](https://docs.databricks.com/aws/en/oltp/projects/manage-roles-permissions)
- [Postgres roles](https://docs.databricks.com/aws/en/oltp/projects/postgres-roles)

<details>
<summary>SQL script for fine-grained grants</summary>

Deploy and run the app at least once before executing these grants so the Service Principal initializes the database schema first.

Replace `subject` with the user email and `schema` with your schema name:

```sql
CREATE EXTENSION IF NOT EXISTS databricks_auth;

DO $$
DECLARE
  subject TEXT := 'your-subject';  -- User email like name@databricks.com
  schema TEXT := 'your_schema'; -- Replace 'your_schema' with your schema name
BEGIN
  -- Create OAuth role for the Databricks identity
  PERFORM databricks_create_role(subject, 'USER');

  -- Connection and schema access
  EXECUTE format('GRANT CONNECT ON DATABASE "databricks_postgres" TO %I', subject);
  EXECUTE format('GRANT ALL ON SCHEMA %s TO %I', schema, subject);

  -- Privileges on existing objects
  EXECUTE format('GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA %s TO %I', schema, subject);
  EXECUTE format('GRANT ALL PRIVILEGES ON ALL SEQUENCES IN SCHEMA %s TO %I', schema, subject);
  EXECUTE format('GRANT ALL PRIVILEGES ON ALL FUNCTIONS IN SCHEMA %s TO %I', schema, subject);
  EXECUTE format('GRANT ALL PRIVILEGES ON ALL PROCEDURES IN SCHEMA %s TO %I', schema, subject);

  -- Default privileges on future objects
  EXECUTE format('ALTER DEFAULT PRIVILEGES IN SCHEMA %s GRANT ALL ON TABLES TO %I', schema, subject);
  EXECUTE format('ALTER DEFAULT PRIVILEGES IN SCHEMA %s GRANT ALL ON SEQUENCES TO %I', schema, subject);
  EXECUTE format('ALTER DEFAULT PRIVILEGES IN SCHEMA %s GRANT ALL ON FUNCTIONS TO %I', schema, subject);
  EXECUTE format('ALTER DEFAULT PRIVILEGES IN SCHEMA %s GRANT ALL ON ROUTINES TO %I', schema, subject);
END $$;
```

</details>
