SchemaBrain — Setup
The recommended path from “I have a Postgres database” to “an AI agent can answer questions about it” is the activation wizard (schemabrain init). It runs the source check, indexer, entity suggestion, and host wiring in one command.
Two alternative install paths are documented separately:
- Docker — published image with the embedding model baked in (no first-run download).
- Manual flow — explicit
indexinvocation, logs config, troubleshooter, MCP Inspector, and an SQL-validation ladder.
1. Install + run the wizard
- Option 2 (default — press Enter) — spins up a local demo Postgres container (
sb-demo-pgon port 5433) with the bundled SaaS fixture. Idempotent on re-runs. Requires Docker. - Option 1 — prompts for a
postgresql+psycopg://user:pass@host:5432/dbnameconnection string (password-masked paste).
ANTHROPIC_API_KEY required.
On your own database, exporting ANTHROPIC_API_KEY unlocks the entity + metric suggestion stages; without it the wizard soft-skips them gracefully — you can run schemabrain entities suggest --apply later once you have a key.
After URL resolution, the wizard shows a host-selection menu listing all 4 supported MCP hosts (Claude Desktop, Claude Code, Cursor, Windsurf) plus a Manual paste option. Detected installs get a ✓ chip; the default cursor points at the priority winner (Claude Desktop → Claude Code → Cursor → Windsurf). Press Enter to take the detected default, or type a number to pick something else. Pass --host X explicitly to bypass the prompt entirely; under --yes / non-TTY the wizard collapses silently to the priority-winner detection with a one-line stderr note. (For ChatGPT see the Codex CLI page — there’s no first-party --host chatgpt flag yet.)
schemabrain dashboard (needs the [ui] extra — pip install 'schemabrain[ui]'). Restarting the host is the second step, so the agent picks up the new tools.
Skip the prompts with a .env file
If you already have credentials in hand, drop them into a .env file in your working directory before running init. Create or edit .env with your editor of choice — the wizard reads these two keys:
.env from cwd before every subcommand. Shell exports always win — .env only fills gaps. If you paste your ANTHROPIC_API_KEY during the wizard instead, it offers to save the key back to .env for next time (opt-in, with a gitignore safety check).
Working in the source repo? Run
cp .env.example .env once to seed the file with placeholder rows + comments, then fill in the values. The CLI also reads SCHEMABRAIN_LOG_LEVEL for log verbosity (DEBUG / INFO / WARNING / ERROR). Do not use cat > .env after copying — it would wipe the template.Install from source (alternative)
2. What the wizard does
Seven stages, ~45 seconds end-to-end on a warm cache:- Source check — validates the URL is reachable + read-only on Postgres. Auto-detects a dbt manifest from
$DBT_PROJECT_DIR/target/manifest.jsonor by walking up from cwd for adbt_project.yml. When found, stages 3 + 4 route through the dbt importer instead of the LLM. Force a manifest with--from-dbt PATH. - Index schema — DDL introspection into
./schemabrain.db. Cost-free by default;--enrichopts in to LLM column descriptions (2.00 for a 50-table schema). - Curate entities — Claude Sonnet 4.6 proposes domain entities (or the dbt manifest is the source of truth when detected). Soft-skips if
ANTHROPIC_API_KEYis absent. Cap spend with--entities-max-cost-usd N. Opt out with--no-entities. - Curate metrics — Claude Sonnet 4.6 proposes aggregations anchored on the curated entities (or dbt metrics are imported). Cap spend with
--metrics-max-cost-usd N. Opt out with--no-metrics. - Curate joins — mines FK constraints +
pg_stat_statementsquery log to surface canonical joins. Deterministic — no LLM call, no cost cap. Opt out with--no-joins. - Wire host — writes
schemabraininto your MCP host’s config. Use--host claude-code/--host cursor/--host windsurf, or--host manual(alias--print-only) for any other host — paste-ready snippet to stdout. - Next step — prints what to ask the agent first.
--skip-llm-confirm; the full superset --yes skips both the LLM pause and the host-overwrite prompt. The pause auto-suppresses in non-TTY environments (CI, pytest).
Re-runs are idempotent: every stage auto-skips when its work is already done. Use --skip-index to opt out of stage 2, --no-entities / --no-metrics / --no-joins to opt out of stages 3 / 4 / 5 individually.
3. Open the dashboard
http://localhost:7878 showing PII categorization, refusal audit, and the audit chain status. See the Dashboard overview.
4. Wire your host
Per-host setup pages (~60-second wiring each):| Host | Platform notes |
|---|---|
| Claude Desktop | macOS + Windows |
| Claude Code | CLI / all platforms |
| Cursor | Cursor IDE |
| Windsurf | Windsurf / Cascade |
| ChatGPT | Codex CLI (stdio) |
What’s next
- First 5 queries — exercises each load-bearing firewall property in ~10 minutes.
schemabrain eval— score retrieval quality against the bundled SaaS golden set (or your own).- Re-run
schemabrain indexwhenever your schema changes — it’s idempotent and cache-aware. - For schemas with cryptic column names (
acct_dim_v3,pmt_fct_h), pass--enable-sonnet— onschemabrain indexorschemabrain init— to route those columns to Claude Sonnet 4.6 for better decoding (~5x cost per affected column). Off by default (Haiku-only).