Merged
8 changes: 8 additions & 0 deletions docs/docs/api/appkit/Enumeration.ResourceType.md
@@ -44,6 +44,14 @@ JOB: "job";

***

### POSTGRES

```ts
POSTGRES: "postgres";
```

***

### SECRET

```ts
54 changes: 52 additions & 2 deletions docs/docs/api/appkit/Interface.ResourceFieldEntry.md
@@ -5,6 +5,16 @@ Single-value types use one key (e.g. id); multi-value types (database, secret) u

## Properties

### bundleIgnore?

```ts
optional bundleIgnore: boolean;
```

When true, this field is excluded from Databricks bundle configuration (databricks.yml) generation.

***

### description?

```ts
@@ -15,10 +25,50 @@ Human-readable description for this field

***

### env
### env?

```ts
env: string;
optional env: string;
```

Environment variable name for this field

***

### examples?

```ts
optional examples: string[];
```

Example values showing the expected format for this field

***

### localOnly?

```ts
optional localOnly: boolean;
```

When true, this field is only generated for local .env files. The Databricks Apps platform auto-injects it at deploy time.

***

### resolve?

```ts
optional resolve: string;
```

Named resolver prefixed by resource type (e.g., 'postgres:host'). The CLI resolves this value during the init prompt flow.

***

### value?

```ts
optional value: string;
```

Static value for this field. Used when no prompted or resolved value exists.
1 change: 1 addition & 0 deletions docs/docs/api/appkit/TypeAlias.ResourcePermission.md
@@ -11,6 +11,7 @@ type ResourcePermission =
| UcFunctionPermission
| UcConnectionPermission
| DatabasePermission
| PostgresPermission
| GenieSpacePermission
| ExperimentPermission
| AppPermission;
Binary file removed docs/docs/plugins/assets/lakebase-setup/step-1.png
Binary file removed docs/docs/plugins/assets/lakebase-setup/step-2.png
Binary file removed docs/docs/plugins/assets/lakebase-setup/step-4.png
Binary file removed docs/docs/plugins/assets/lakebase-setup/step-5.png
Binary file removed docs/docs/plugins/assets/lakebase-setup/step-6.png
228 changes: 119 additions & 109 deletions docs/docs/plugins/lakebase.md
@@ -4,98 +4,30 @@ sidebar_position: 4

# Lakebase plugin

:::info
Currently, the Lakebase plugin requires a one-time manual setup to connect your Databricks App to your Lakebase database. An automated setup process is planned for a future release.
:::

Provides a PostgreSQL connection pool for Databricks Lakebase Autoscaling with automatic OAuth token refresh.

**Key features:**
- Standard `pg.Pool` compatible with any PostgreSQL library or ORM
- Automatic OAuth token refresh (1-hour tokens, 2-minute refresh buffer)
- Token caching to minimize API calls
- Built-in OpenTelemetry instrumentation (query duration, pool connections, token refresh)
- AppKit logger configured by default for query and connection events
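The token-refresh behavior listed above can be sketched as a small cache helper. This is a simplified model, not the plugin's actual implementation; `fetchToken` is a stand-in for the real OAuth call, and only the one-hour lifetime and two-minute buffer mirror the defaults described here:

```typescript
// Simplified sketch of OAuth token caching with a refresh buffer.
type Token = { value: string; expiresAt: number };

const TOKEN_TTL_MS = 60 * 60 * 1000;     // tokens live for 1 hour
const REFRESH_BUFFER_MS = 2 * 60 * 1000; // refresh 2 minutes before expiry

function makeTokenCache(fetchToken: () => string) {
  let cached: Token | null = null;

  return function getToken(now: number): string {
    // Reuse the cached token unless it is within the refresh buffer.
    if (cached && now < cached.expiresAt - REFRESH_BUFFER_MS) {
      return cached.value;
    }
    cached = { value: fetchToken(), expiresAt: now + TOKEN_TTL_MS };
    return cached.value;
  };
}

// Usage: the fetcher runs only when the cached token is near expiry.
let calls = 0;
const getToken = makeTokenCache(() => `token-${++calls}`);
const t0 = Date.now();
console.log(getToken(t0));                  // token-1 (fetched)
console.log(getToken(t0 + 30 * 60 * 1000)); // token-1 (cached)
console.log(getToken(t0 + 59 * 60 * 1000)); // token-2 (inside buffer, refreshed)
```

Caching this way is what keeps API calls to a minimum: repeated queries within the token's safe window never trigger a new OAuth request.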

## Setting up Lakebase

Before using the plugin, you need to connect your Databricks App's service principal to your Lakebase database.

### 1. Find your app's service principal

Create a Databricks App from the UI (`Compute > Apps > Create App > Create a custom app`). Navigate to the **Environment** tab and note the `DATABRICKS_CLIENT_ID` value — this is the service principal that will connect to your Lakebase database.

![App environment tab](./assets/lakebase-setup/step-1.png)

### 2. Find your Project ID and Branch ID
## Getting started with the Lakebase

Create a new Lakebase Postgres Autoscaling project. Navigate to your Lakebase project's branch details and switch to the **Compute** tab. Note the **Project ID** and **Branch ID** from the URL.
The easiest way to get started with the Lakebase plugin is to use the Databricks CLI to create a new Databricks app with AppKit and the Lakebase plugin installed.

![Branch details](./assets/lakebase-setup/step-2.png)
### Prerequisites

### 3. Find your endpoint

Use the Databricks CLI to list endpoints for the branch. Note the `name` field from the output — this is your `LAKEBASE_ENDPOINT` value.

```bash
databricks postgres list-endpoints projects/{project-id}/branches/{branch-id}
```

Example output:

```json
[
{
"create_time": "2026-02-19T12:13:02Z",
"name": "projects/{project-id}/branches/{branch-id}/endpoints/primary"
}
]
```
- [Node.js](https://nodejs.org) v22+ environment with `npm`
- Databricks CLI (v0.287.0 or higher): install and configure it according to the [official tutorial](https://docs.databricks.com/aws/en/dev-tools/cli/tutorial).
- A new Databricks app with AppKit installed. See [Bootstrap a new Databricks app](../index.md#quick-start-options) for more details.

### 4. Get connection parameters
### Steps

Click the **Connect** button on your Lakebase branch and copy the `PGHOST` and `PGDATABASE` values for later.

![Connect dialog](./assets/lakebase-setup/step-4.png)

### 5. Grant access to the service principal

Navigate to the **SQL Editor** tab on your Lakebase branch. Run the following SQL against the `databricks_postgres` database, replacing the service principal ID in the `DECLARE` block with the `DATABRICKS_CLIENT_ID` value from step 1:

```sql
CREATE EXTENSION IF NOT EXISTS databricks_auth;

DO $$
DECLARE
sp TEXT := 'your-service-principal-id'; -- Replace with DATABRICKS_CLIENT_ID from Step 1
BEGIN
-- Create service principal role
PERFORM databricks_create_role(sp, 'SERVICE_PRINCIPAL');

-- Connection and schema access
EXECUTE format('GRANT CONNECT ON DATABASE "databricks_postgres" TO %I', sp);
EXECUTE format('GRANT ALL ON SCHEMA public TO %I', sp);

-- Privileges on existing objects
EXECUTE format('GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA public TO %I', sp);
EXECUTE format('GRANT ALL PRIVILEGES ON ALL SEQUENCES IN SCHEMA public TO %I', sp);
EXECUTE format('GRANT ALL PRIVILEGES ON ALL FUNCTIONS IN SCHEMA public TO %I', sp);
EXECUTE format('GRANT ALL PRIVILEGES ON ALL PROCEDURES IN SCHEMA public TO %I', sp);

-- Default privileges on future objects you create
EXECUTE format('ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT ALL ON TABLES TO %I', sp);
EXECUTE format('ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT ALL ON SEQUENCES TO %I', sp);
EXECUTE format('ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT ALL ON FUNCTIONS TO %I', sp);
EXECUTE format('ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT ALL ON ROUTINES TO %I', sp);
END $$;
```

![SQL Editor](./assets/lakebase-setup/step-5.png)

### 6. Verify the role

Navigate to the **Roles & Databases** tab and confirm the role is visible. You may need to fully refresh the page.

![Roles & Databases tab](./assets/lakebase-setup/step-6.png)
1. First, create a new Lakebase Postgres Autoscaling project by following the [Get started documentation](https://docs.databricks.com/aws/en/oltp/projects/get-started).
1. To add the Lakebase plugin to your project, run the `databricks apps init` command and interactively select the **Lakebase** plugin. The CLI will guide you through picking a Lakebase project, branch, and database.
- When asked, select **Yes** to deploy the app to Databricks Apps immediately after creation.

## Basic usage

@@ -107,33 +39,6 @@ await createApp({
});
```

## Environment variables

The required environment variables:

| Variable | Description |
|---|---|
| `PGHOST` | Lakebase host |
| `PGDATABASE` | Database name |
| `LAKEBASE_ENDPOINT` | Endpoint resource path (e.g. `projects/.../branches/.../endpoints/...`) |
| `PGSSLMODE` | TLS mode — set to `require` |

Ensure that those environment variables are set both for local development (`.env` file) and for deployment (`app.yaml` file):

```yaml
env:
- name: LAKEBASE_ENDPOINT
value: projects/{project-id}/branches/{branch-id}/endpoints/primary
- name: PGHOST
value: {your-lakebase-host}
- name: PGDATABASE
value: databricks_postgres
- name: PGSSLMODE
value: require
```

For the full configuration reference (SSL, pool size, timeouts, logging, ORM examples), see the [`@databricks/lakebase` README](https://github.com/databricks/appkit/blob/main/packages/lakebase/README.md).

## Accessing the pool

After initialization, access Lakebase through the `AppKit.lakebase` object:
@@ -143,9 +48,17 @@ const AppKit = await createApp({
plugins: [server(), lakebase()],
});

// Direct query (parameterized)
await AppKit.lakebase.query(`CREATE SCHEMA IF NOT EXISTS app`);

await AppKit.lakebase.query(`CREATE TABLE IF NOT EXISTS app.orders (
id SERIAL PRIMARY KEY,
user_id VARCHAR(255) NOT NULL,
amount DECIMAL(10, 2) NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
)`);

const result = await AppKit.lakebase.query(
"SELECT * FROM orders WHERE user_id = $1",
"SELECT * FROM app.orders WHERE user_id = $1",
[userId],
);

@@ -157,7 +70,32 @@ const ormConfig = AppKit.lakebase.getOrmConfig(); // { host, port, database, ..
const pgConfig = AppKit.lakebase.getPgConfig(); // pg.PoolConfig
```

## Configuration options
## Configuration

### Environment variables

The required environment variables are:

| Variable | Description |
|---|---|
| `LAKEBASE_ENDPOINT` | Endpoint resource path (e.g. `projects/.../branches/.../endpoints/...`) |
| `PGHOST` | Lakebase host (auto-injected in production by the `postgres` Databricks Apps resource) |
| `PGDATABASE` | Database name (auto-injected in production by the `postgres` Databricks Apps resource) |
| `PGSSLMODE` | TLS mode - set to `require` (auto-injected in production by the `postgres` Databricks Apps resource) |

When deployed to Databricks Apps with a `postgres` database resource configured, `PGHOST`, `PGDATABASE`, `PGSSLMODE`, `PGUSER`, `PGPORT`, and `PGAPPNAME` are automatically injected by the platform. Only `LAKEBASE_ENDPOINT` must be set explicitly:

```yaml
env:
- name: LAKEBASE_ENDPOINT
valueFrom: postgres
```

For local development, the `.env` file is automatically generated by `databricks apps init` with the correct values for your Lakebase project.
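As an illustrative sketch (not part of the plugin, which may perform its own validation), an app can fail fast when one of the variables from the table above is missing, before the pool ever tries to connect:

```typescript
// Check the required variables from the table above at startup.
// The concrete values below are placeholders, not real endpoints.
const REQUIRED = ["LAKEBASE_ENDPOINT", "PGHOST", "PGDATABASE", "PGSSLMODE"];

function missingVars(env: Record<string, string | undefined>): string[] {
  return REQUIRED.filter((name) => !env[name]);
}

// Example: a local .env missing PGSSLMODE is reported by name.
const missing = missingVars({
  LAKEBASE_ENDPOINT: "projects/p/branches/b/endpoints/primary",
  PGHOST: "your-lakebase-host.example.com",
  PGDATABASE: "databricks_postgres",
});
console.log(missing); // ["PGSSLMODE"]
```

In a real app you would pass `process.env` to `missingVars` and throw if the result is non-empty.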

For the full configuration reference (SSL, pool size, timeouts, logging, ORM examples), see the [`@databricks/lakebase` README](https://github.com/databricks/appkit/blob/main/packages/lakebase/README.md).

### Pool configuration

Pass a `pool` object to override any defaults:

@@ -174,3 +112,75 @@ await createApp({
],
});
```
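As a sketch of the override semantics, assuming user-supplied `pool` keys shallow-merge over the plugin's defaults (the option names come from `pg.PoolConfig`; the default values below are illustrative, not the plugin's actual defaults):

```typescript
// Sketch: user-supplied pool options override defaults key by key.
// Option names (max, idleTimeoutMillis, connectionTimeoutMillis) are from
// pg.PoolConfig; the default values here are illustrative only.
const defaultPool = {
  max: 10,                       // upper bound on concurrent connections
  idleTimeoutMillis: 30_000,     // close idle clients after 30s
  connectionTimeoutMillis: 5_000 // fail connection attempts after 5s
};
const userPool = { max: 20 }; // as passed via lakebase({ pool: { max: 20 } })

const effectivePool = { ...defaultPool, ...userPool };
console.log(effectivePool.max);               // 20 (user override wins)
console.log(effectivePool.idleTimeoutMillis); // 30000 (default kept)
```

Keys you do not set keep their defaults, so overriding `max` does not disturb the timeout settings.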

## Database Permissions

When you create the app with the Lakebase resource using the [Getting started](#getting-started-with-the-lakebase) guide, the Service Principal is automatically granted the `CONNECT_AND_CREATE` permission on the `postgres` resource. This lets the Service Principal connect to the database and create new objects, but does **not** grant access to any existing schemas or tables.

### Local development

To develop locally against a deployed Lakebase database:

1. **Deploy the app first.** The Service Principal creates the database schema and tables on first deploy. Apps generated from `databricks apps init` handle this automatically - they check if tables exist on startup and skip creation if they do.

2. **Grant `databricks_superuser` via the Lakebase UI:**
1. Open the Lakebase Autoscaling UI and navigate to your project's **Branch Overview** page.
2. Click **Add role** (or **Edit role** if your OAuth role already exists).
3. Select your Databricks identity as the principal and check the **`databricks_superuser`** system role.

3. **Run locally** - your Databricks user identity (email) is used for OAuth authentication. The `databricks_superuser` role gives full **DML access** (read/write data) but **not DDL** (creating schemas or tables) - that's why deploying first matters (see note below).

To onboard additional users, repeat the same **Add role** flow in the Lakebase UI, creating an OAuth role with `databricks_superuser` for each of them.

:::tip
[Postgres password authentication](https://docs.databricks.com/aws/en/oltp/projects/authentication#overview) is a simpler alternative that avoids OAuth role permission complexity. However, it requires you to set up a password for the user in the **Branch Overview** page in the Lakebase Autoscaling UI.
:::

:::info[Why deploy first?]
When the app is deployed, the Service Principal creates schemas and tables and becomes their owner. A `databricks_superuser` has full **DML access** (SELECT, INSERT, UPDATE, DELETE) to these objects, but **cannot run DDL** (CREATE SCHEMA, CREATE TABLE) on schemas owned by the Service Principal. Deploying first ensures all objects exist before local development begins.
:::

### Fine-grained permissions

For most use cases, `databricks_superuser` is sufficient. If you need schema-level grants instead, refer to the official documentation:

- [Manage database permissions](https://docs.databricks.com/aws/en/oltp/projects/manage-roles-permissions)
- [Postgres roles](https://docs.databricks.com/aws/en/oltp/projects/postgres-roles)

<details>
<summary>SQL script for fine-grained grants</summary>

Deploy and run the app at least once before executing these grants so the Service Principal initializes the database schema first.

Replace `subject` with the user email and `schema` with your schema name:

```sql
CREATE EXTENSION IF NOT EXISTS databricks_auth;

DO $$
DECLARE
subject TEXT := 'your-subject'; -- User email like name@databricks.com
schema TEXT := 'your_schema'; -- Replace 'your_schema' with your schema name
BEGIN
-- Create OAuth role for the Databricks identity
PERFORM databricks_create_role(subject, 'USER');

-- Connection and schema access
EXECUTE format('GRANT CONNECT ON DATABASE "databricks_postgres" TO %I', subject);
EXECUTE format('GRANT ALL ON SCHEMA %s TO %I', schema, subject);

-- Privileges on existing objects
EXECUTE format('GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA %s TO %I', schema, subject);
EXECUTE format('GRANT ALL PRIVILEGES ON ALL SEQUENCES IN SCHEMA %s TO %I', schema, subject);
EXECUTE format('GRANT ALL PRIVILEGES ON ALL FUNCTIONS IN SCHEMA %s TO %I', schema, subject);
EXECUTE format('GRANT ALL PRIVILEGES ON ALL PROCEDURES IN SCHEMA %s TO %I', schema, subject);

-- Default privileges on future objects
EXECUTE format('ALTER DEFAULT PRIVILEGES IN SCHEMA %s GRANT ALL ON TABLES TO %I', schema, subject);
EXECUTE format('ALTER DEFAULT PRIVILEGES IN SCHEMA %s GRANT ALL ON SEQUENCES TO %I', schema, subject);
EXECUTE format('ALTER DEFAULT PRIVILEGES IN SCHEMA %s GRANT ALL ON FUNCTIONS TO %I', schema, subject);
EXECUTE format('ALTER DEFAULT PRIVILEGES IN SCHEMA %s GRANT ALL ON ROUTINES TO %I', schema, subject);
END $$;
```

</details>