Manual flow
The activation wizard is the recommended path for most users. This page is for operators who want explicit control over each step — or who are scripting individual phases.1. Index your schema
Legacy path: the olderThe index step:schemabrain index "postgresql+psycopg://..."form still works for backwards compatibility, but emits a deprecation warning when the URL contains a password. New scripts should use--url-env.
- Reflects every user-visible table.
- Generates one LLM-written column description per column (Claude Haiku 4.5; ~$0.0003/column at typical schema density) — only when run with enrichment enabled.
- Embeds each description locally with
BAAI/bge-small-en-v1.5(~67 MB ONNX, ~10 ms/column warm). - Persists everything to
./schemabrain.db.
index against an unchanged schema is a no-op — 0 LLM calls, 0 embedder calls, ~0.1 s. Schema-changed tables are re-enriched + re-embedded selectively.
For cost-free dry runs (no LLM, no embeddings), pass --no-enrich.
2. (Optional) Mine observed queries
get_example_queries returns SQL agents (and humans) have actually run against your tables. To populate it, run mine-queries once (or on a schedule):
pg_stat_statements extension on the source database:
pg_stat_statements view (superusers always have it; non-super roles need pg_read_all_stats). If the extension or grant is missing, mine-queries exits cleanly with an actionable message — no row is written, the store stays intact, and get_example_queries keeps returning status: empty with a recovery hint pointed at this section.
3. Wire your own agent (Anthropic SDK)
The demo script atexamples/anthropic_demo.py spawns schemabrain serve over stdio, drives Claude Haiku via the Anthropic SDK’s standard tool-use loop, and prints the conversation transcript. This is the same path Claude Desktop takes internally (stdio MCP + tool-use loop), just without the chat UI.
The example script ships in the source repo but not in the PyPI wheel. If you installed via pip, grab the script first:
uv run):
is_error=False) carrying status:"error" and a structured recovery block — so the agent reads it and pivots instead of failing. PII blocks behave the same way: ask “count our users, broken down by email address” and get_metric(user_count, group_by=["user.email"]) comes back status:"refused", kind:"pii_blocked" — grouping by a PII column projects raw values into the result rows (row-level disclosure), so the firewall refuses regardless of policy and the agent reports the block instead of leaking the list. Phrase it as an aggregation, not “list the emails” — the latter reads as a raw-row export (which SchemaBrain has no tool for), so a careful agent self-declines before the firewall is ever exercised.
The script is bounded by --max-turns (default 8) and aborts cleanly if the agent doesn’t converge. Cost on Haiku 4.5 is typically $0.005–0.02 per run.
Use it to:
- Verify your install before debugging Claude Desktop config.
- Smoke-test SchemaBrain in CI.
- Crib the agent loop into your own application.
4. Logs
SchemaBrain has a single, deliberately simple logging system: one stream, stderr. No log files, no rotation, no JSON output. The default level isWARNING, so healthy runs are essentially silent. Raise the level when you need to debug; lower it when the noise gets in the way.
How you raise the level depends on how SchemaBrain is running.
When you’re running schemabrain in a terminal
Pass -v flags. Counted — more vs, more output:
When Claude Desktop is launching schemabrain serve for you
You don’t have a terminal. Claude Desktop spawns the server in the background, so -v is not an option — there’s no command line you control.
Instead, you set an environment variable in Claude Desktop’s config file (not SchemaBrain’s — SchemaBrain has no config file). On macOS that file lives at:
| Field | What it does |
|---|---|
command | The schemabrain binary to launch. Find your path with which schemabrain. |
args | Command-line arguments — same as if you typed them in a terminal. |
env | Environment variables for the spawned process. SCHEMABRAIN_LOG_LEVEL is the only logging-relevant one. |
SCHEMABRAIN_LOG_LEVEL accepts DEBUG, INFO, WARNING, ERROR, CRITICAL (case-insensitive). Unrecognized values fall back to WARNING and emit a one-line warning to stderr so you know there was a typo.
Where the lines actually appear: Claude Desktop captures the stderr of every MCP server it spawns. On macOS, read it with:
serve — you’ll see the traceback there even though no terminal was ever open.
Precedence
If you happen to set both a-v flag and SCHEMABRAIN_LOG_LEVEL, the flag wins. The env var is a fallback for the case where you can’t pass flags (i.e. Claude Desktop). In a terminal, prefer the flag.
What’s deliberately not there
| Not implemented | Why |
|---|---|
| Log files | One stream is simpler. Use shell redirection if you need persistence. |
| JSON / structured logging | Only one call site exists today (the MCP boundary catch). Will revisit if more land. |
| Per-module level overrides | SchemaBrain has one namespace. Third-party loggers (mcp, anyio, httpx, httpcore, fastembed) are pinned at WARNING regardless of our level so -vv doesn’t drown you in SDK chatter. |
| Log rotation | Not our concern. Claude Desktop manages rotation of mcp-server-schemabrain.log on its side; for terminal runs, your shell redirect manages the file. |
5. Troubleshooting
| Symptom | Cause | Fix |
|---|---|---|
error: ANTHROPIC_API_KEY not set | Key not exported | export ANTHROPIC_API_KEY=sk-ant-... |
ModuleNotFoundError: No module named 'psycopg2' | Used postgresql:// scheme | Switch to postgresql+psycopg:// |
| Claude Desktop doesn’t see the tools | Config path or syntax error | Tail ~/Library/Logs/Claude/mcp*.log |
error: could not open store at './schemabrain.db' | Relative path in Claude Desktop config | Use absolute paths |
Agent loop hits max-turns cap | Question too broad or store under-indexed | Re-run index first; ask a more specific question |
Tool returns isError=True with “is not in the store” | Store source ID doesn’t match --source URL | --source must match the URL passed to index exactly |
Inspecting tool shapes with the official MCP Inspector
The Model Context Protocol team publishes an interactive inspector that connects to any MCP server over stdio. It’s the cleanest way to see the JSON schemas SchemaBrain exposes — including the per-arg descriptions — without needing Claude Desktop, Cursor, or the Anthropic SDK.- Verify the server boots cleanly against your store + connection.
- See exactly what shape an agent will see when it calls
find_relevant_tables,describe_table,describe_column,get_example_queries, orsuggest_joins. - Trigger each tool with hand-crafted args and read the structured
ToolResponseenvelope directly.
DATABASE_URL must be exported in the same shell that runs the npx command, since the inspector spawns schemabrain serve as a subprocess and inherits your environment.
6. Validating SQL Claude generates
SchemaBrain gives Claude rich context, but it doesn’t run the SQL. The agent produces queries that should be correct — but you should still verify before trusting the output, especially on real data. A four-step ladder, cheapest to most thorough:Step 1 — does it execute?
Pipe the SQL throughpsql (or docker exec -i sb-pg psql ... if you don’t have host psql). Any syntax error, wrong column, or wrong qualified name fails here in milliseconds:
Step 2 — is the query plan sane?
Catches accidental cartesian products, missing indexes, and surprise sequential scans on large tables:Step 3 — does it produce the expected numbers on a known dataset?
This is the killer check. Construct a tiny dataset where you can hand-compute the right answer, then run Claude’s query and compare. Especially valuable when the agent flagged a caveat — a real test makes the caveat empirical. Worked example with the bundled e-commerce fixture: Claude warned that products in N categories would have their spend counted N times. Seed three orders, then compare the per-customer total spend computed two ways — through Claude’s per-category query vs an independent reference that ignores categories:Step 4 — sanity-check against production sample
For NULL handling, deprecated rows, and weird data shapes that only appear in production.LIMIT 100 against prod, eyeball the output. No automated rule catches every category of issue this stage does.
What’s next
- First 5 queries — exercises each load-bearing firewall property in ~10 minutes.
schemabrain eval— score retrieval quality against the bundled SaaS golden set.- Re-run
schemabrain indexwhenever your schema changes — it’s idempotent and cache-aware.