Observability
The observability layer is a thin substrate that exposes everything SchemaBrain does in response to an MCP request, so a human can see, in real time, what an agent is touching, what got refused, and what got returned. It has three pieces:

- An internal event bus. A `JsonlEventBus` that any tool handler can `emit(Event(...))` into.
- A user-facing tail. `schemabrain tail` reads the bus and pretty-prints each event so the operator can watch alongside the agent.
- An optional OTel exporter. The `schemabrain[otel]` extra ships each tool call as one `gen_ai.execute_tool` span to any OTLP/HTTP-speaking backend (Langfuse, Phoenix, OpenLIT, otel-tui, Datadog). Off by default; activated by setting `OTEL_EXPORTER_OTLP_ENDPOINT`.

What gets logged
Every tool call emits exactly one event on completion (success or failure). Server lifecycle (start, stop, schema-version mismatch) emits a separate event kind.
Tool-call event shape
| Field | Meaning |
|---|---|
| `timestamp` | ISO 8601 UTC with microsecond precision and trailing Z. |
| `server_session_id` | UUID generated when the serve process started. Use this to group events across a single serve run. |
| `kind` | "tool_call" for one of the 12 MCP tools; "server_event" for lifecycle markers. |
| `tool_name` | The MCP tool name (e.g. describe_table, get_metric). |
| `args_summary` | The keyword arguments the agent passed, after redaction (see below). |
| `status` | Mirrors the Charter response envelope: success / empty / partial / degraded / error / refused. |
| `error_kind` | When status is error or refused, the structured error kind (e.g. unknown_name, pii_blocked). |
| `duration_ms` | Wall-clock latency of the tool call. |
| `result_summary` | A small per-tool dict (counts, fingerprints) extracted from the response data. |
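
To make the shape concrete, here is a minimal sketch of building one tool-call event and serialising it as a single JSONL line. Only the field names come from the table; the helper name make_tool_call_event and the example values are hypothetical and not SchemaBrain's actual API.

```python
import json
import uuid
from datetime import datetime, timezone


def make_tool_call_event(session_id, tool_name, args_summary, status,
                         duration_ms, result_summary, error_kind=None):
    """Build a dict with the tool-call fields from the table (hypothetical helper)."""
    now = datetime.now(timezone.utc)
    return {
        # ISO 8601 UTC, microsecond precision, trailing Z.
        "timestamp": now.strftime("%Y-%m-%dT%H:%M:%S.%fZ"),
        "server_session_id": session_id,
        "kind": "tool_call",
        "tool_name": tool_name,
        "args_summary": args_summary,
        "status": status,
        "error_kind": error_kind,
        "duration_ms": duration_ms,
        "result_summary": result_summary,
    }


event = make_tool_call_event(
    session_id=str(uuid.uuid4()),
    tool_name="describe_table",
    args_summary={"table": "public.orders"},  # illustrative value
    status="success",
    duration_ms=12.4,
    result_summary={"columns": 9},  # illustrative value
)
line = json.dumps(event, sort_keys=True)  # one event per line in the JSONL file
print(line)
```

Grouping by server_session_id after parsing each line with json.loads is then enough to reconstruct a single serve run.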
Server-event shape
event_subtype is one of:
- `server_start`: emitted before the stdio transport accepts the first request.
- `server_stop`: emitted in a `finally` block, so `KeyboardInterrupt` still produces a stop event.
- `schema_version_mismatch`: reserved for a future check.
- `events_path_init`: reserved for a future check.
Redaction
Tool arguments pass through an `EventRedactor` BEFORE the event line
hits disk. Four rules apply per-value (keys are never modified):

- Connection URLs: any string matching `^(postgresql|postgres|mysql|sqlite)(\+\w+)?://` becomes `<redacted-connection-url>`.
- Long strings: anything larger than 2 KiB becomes `<truncated:N bytes>`.
- `get_metric` filter values: every value inside a `filters` dict becomes `<value>` (filter values are user PII by default: email, customer id, etc.).
- Email-shaped strings: anything matching `^[^\s@]+@[^\s@]+\.[^\s@]+$` becomes `<email>`.
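
The four rules can be sketched in a few lines of Python. This is an illustration under assumptions, not the real EventRedactor: the function names are hypothetical, the order in which the string rules are checked is a guess, and the sketch masks any top-level filters dict rather than checking that the tool is get_metric.

```python
import re

CONNECTION_URL = re.compile(r"^(postgresql|postgres|mysql|sqlite)(\+\w+)?://")
EMAIL = re.compile(r"^[^\s@]+@[^\s@]+\.[^\s@]+$")
MAX_BYTES = 2 * 1024  # 2 KiB


def redact_value(value):
    """Apply the per-value string rules; non-strings pass through unchanged."""
    if not isinstance(value, str):
        return value
    if CONNECTION_URL.match(value):
        return "<redacted-connection-url>"
    size = len(value.encode("utf-8"))
    if size > MAX_BYTES:
        return f"<truncated:{size} bytes>"
    if EMAIL.match(value):
        return "<email>"
    return value


def redact_args(args):
    """Redact a dict of tool arguments; keys are never modified."""
    redacted = {}
    for key, value in args.items():
        if key == "filters" and isinstance(value, dict):
            # Every filter value is masked, whatever its type.
            redacted[key] = {k: "<value>" for k in value}
        else:
            redacted[key] = redact_value(value)
    return redacted


print(redact_args({"filters": {"email": "a@b.co"}, "limit": 5}))
```

Because only values are rewritten, the redacted args_summary still shows which arguments and filter columns the agent used.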
File layout
The default path is `~/.schemabrain/events.jsonl`. Override with
`--events-path PATH` (on both `serve` and `tail`) or the
`SCHEMABRAIN_EVENTS_PATH` environment variable. Flag wins over env,
env wins over default.
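
The precedence rule is small enough to sketch directly. The function name resolve_events_path and its parameters are hypothetical; only the flag, the environment variable name, and the default path come from the text.

```python
import os
from pathlib import Path

DEFAULT_EVENTS_PATH = Path("~/.schemabrain/events.jsonl")


def resolve_events_path(flag_value=None, environ=None):
    """Flag wins over env, env wins over default."""
    environ = os.environ if environ is None else environ
    if flag_value:
        return Path(flag_value).expanduser()
    env_value = environ.get("SCHEMABRAIN_EVENTS_PATH")
    if env_value:
        return Path(env_value).expanduser()
    return DEFAULT_EVENTS_PATH.expanduser()


print(resolve_events_path(None, {}))
```

Passing the environment as a parameter keeps the function easy to test without mutating os.environ.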
The directory is created mode 0700, the file mode 0600 — same
posture as the host config from schemabrain init.
The file rotates at 10 MiB. On overflow:

- The active file is renamed to `<path>.1`.
- A fresh active file starts on the next emit.
- Only one rotation is kept; older `.1` files are dropped.

schemabrain tail follows the active file and detects rotation via
inode change, re-opening the new file when it appears.
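
A rough sketch of single-slot rotation and inode-based detection follows. It assumes a POSIX filesystem and checks the size at the start of each append, which may differ from when the real bus rotates; the function names are hypothetical.

```python
import os
import tempfile

ROTATE_BYTES = 10 * 1024 * 1024  # 10 MiB, per the text above


def append_with_rotation(path, line, limit=ROTATE_BYTES):
    """Append one line, rotating first if the active file has reached the limit."""
    if os.path.exists(path) and os.path.getsize(path) >= limit:
        # os.replace overwrites any older <path>.1, so only one rotation is kept.
        os.replace(path, path + ".1")
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(line + "\n")


def rotated_since(path, last_inode):
    """Return True when the active file's inode differs from the one a tail holds."""
    try:
        return os.stat(path).st_ino != last_inode
    except FileNotFoundError:
        # Renamed away and not yet recreated; the tail should retry later.
        return False


# Demo with a tiny limit so rotation triggers on the second append.
demo_dir = tempfile.mkdtemp()
active = os.path.join(demo_dir, "events.jsonl")
append_with_rotation(active, "a" * 10, limit=8)
inode_before = os.stat(active).st_ino
append_with_rotation(active, "b", limit=8)
print(rotated_since(active, inode_before))
```

A tail loop would call rotated_since whenever it hits end-of-file and reopen the path when it returns True.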
Failure semantics
The bus is lossy by design. If `emit()` fails (disk full, permission
revoked, anything), the failure is caught, logged once per error-kind
to stderr, and the event is dropped. The agent’s tool call still
returns normally; we never fail a request because the log layer
failed.
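
The drop-and-log-once behaviour can be sketched as a thin wrapper. This is an illustration only: the class name LossyEmitter is hypothetical, and it assumes that "error-kind" means the exception class name.

```python
import io
import sys


class LossyEmitter:
    """Wraps a write callable so failures are logged once per kind and dropped."""

    def __init__(self, write, stderr=None):
        self._write = write
        self._stderr = stderr if stderr is not None else sys.stderr
        self._seen_kinds = set()

    def emit(self, line):
        """Return True if the line was written, False if it was dropped."""
        try:
            self._write(line)
            return True
        except Exception as exc:  # never propagate to the tool handler
            kind = type(exc).__name__  # assumed definition of error-kind
            if kind not in self._seen_kinds:
                self._seen_kinds.add(kind)
                print(f"event bus: event dropped ({kind}: {exc})", file=self._stderr)
            return False


def failing_write(_line):
    raise OSError("disk full")


log = io.StringIO()
emitter = LossyEmitter(failing_write, stderr=log)
results = [emitter.emit("first"), emitter.emit("second")]
print(results, log.getvalue().strip())
```

The caller ignores the return value in normal operation; it exists so tests can observe that an event was dropped.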
A durable-store consumer is on the roadmap. That path will guarantee
durability for higher-trust events while the JSONL tail remains lossy
by design.
CLI cheat sheet
Audit log (alpha)
Alongside the lossy JSONL bus, every MCP tool call writes one row to the `mcp_audit` table inside the local SQLite store. The table is
append-only by three independent mechanisms: SQL triggers, a
write-only writer connection, and a per-row sha256 chain hash that
makes coherent tampering detectable against any external archive that
captured a prior hash.
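
To show why a chain hash makes tampering detectable, here is a minimal sketch. The row fields, the all-zero genesis value, and the JSON canonicalisation are assumptions made for illustration; the real column set and hash input are defined in ADR 0001 and the store schema, not here.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed seed for the first row


def chain_hash(prev_hash, row):
    """Hash the previous chain hash together with a canonical form of the row."""
    payload = json.dumps(row, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def build_chain(rows):
    """Return the chain hash that would be stored alongside each row."""
    hashes, prev = [], GENESIS
    for row in rows:
        prev = chain_hash(prev, row)
        hashes.append(prev)
    return hashes


def first_tampered_index(rows, hashes):
    """Return the index of the first row whose stored hash fails to verify, else None."""
    prev = GENESIS
    for i, (row, stored) in enumerate(zip(rows, hashes)):
        if chain_hash(prev, row) != stored:
            return i
        prev = stored
    return None


rows = [
    {"id": 1, "tool_name": "get_metric", "status": "success"},
    {"id": 2, "tool_name": "describe_table", "status": "success"},
    {"id": 3, "tool_name": "get_metric", "status": "refused"},
]
hashes = build_chain(rows)
print(first_tampered_index(rows, hashes))
```

An attacker who rewrites a row and recomputes every later hash produces a self-consistent chain, which is why the comparison against an externally archived prior hash matters.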
The shape, the privacy guarantee, and the regulatory backing for the
PII taxonomy are documented in
ADR 0001. Per that ADR, the
fingerprint primitive carries no row content, no column values, and
no identifying schema info — only structural metadata.
Inspecting the table
Durability
The audit writer uses WAL + `synchronous=NORMAL`, the same posture
as the rest of the SQLite store. A crash within milliseconds of a
write can lose the last few rows; the chain hash still keeps the rest
of the table tamper-evident. Stricter fsync-on-write durability lives
on the roadmap.
Single-process constraint
Run one `schemabrain serve` instance per store file. The audit
writer holds an in-memory _last_chain_hash that is recovered from
the table tail on startup. Two serve processes against the same
store would each compute the next id independently and race the
INSERT — the second loses with a UNIQUE constraint failure that
surfaces as a stderr BUG line and silently drops the audit row.
If you need horizontal scale, separate the source databases (one
store per source) until v3 hosted brings a multi-writer audit plane.
Opting out
serve falls back to no-audit with a stderr warning —
the server is more useful without audit than not at all.
PII classification (alpha)
schemabrain index runs a heuristic classifier over every column it
profiles, tagging columns with categories from the closed
12-category taxonomy in
ADR 0001. The taxonomy is
regulator-derived (GDPR Arts. 4 + 9, CCPA/CPRA, HIPAA Safe Harbor,
PCI DSS, ISO 27018) so categories map cleanly onto compliance
boundaries instead of conflating them.
Tags live in a column_pii_tags SQLite table keyed by
(source_connection_id, qualified_table, column_name). get_metric
bulk-reads tags for every column a plan touches (measure + time
bucket + group_by + filters), propagates by MAX-sensitivity +
UNION-categories (ADR §4), and writes the propagated set into the
audit row’s pii_categories column. Two get_metric calls touching
different category sets therefore produce distinct fingerprint
digests — the field that pre-PR-#36 was a v1 constant.
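
The propagation rule (MAX-sensitivity plus UNION-categories) can be sketched as a simple fold. The integer sensitivity scale and the tuple representation of a tag are assumptions for illustration; ADR 0001 defines the real levels.

```python
def propagate_tags(column_tags):
    """Combine per-column tags into one plan-level tag.

    column_tags: iterable of (sensitivity, categories) pairs, one per column
    the plan touches. Sensitivity is assumed to be an integer rank.
    """
    max_sensitivity = 0
    categories = set()
    for sensitivity, cats in column_tags:
        max_sensitivity = max(max_sensitivity, sensitivity)  # MAX-sensitivity
        categories |= set(cats)                              # UNION-categories
    return max_sensitivity, sorted(categories)


plan_columns = [
    (1, {"government_id"}),  # e.g. a group_by column
    (3, {"credential"}),     # e.g. a filter column
    (0, set()),              # untagged measure column
]
print(propagate_tags(plan_columns))
```

Sorting the category set gives a stable value to write into pii_categories, so identical plans yield identical audit content.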
Enforcing a refusal policy
--pii-block applies the always-on catastrophic-leak floor
(credential, payment_card, government_id) — enforcement is on by
default; pass --pii-block <csv> to widen it. Tags flow through to
mcp_audit.pii_categories regardless. Unknown category names abort
startup with a clear error listing the 12 valid values — typos in
the operator config never silently fall through to “no PII
protection”.
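
A sketch of the fail-fast validation described above, assuming the flag value arrives as a comma-separated string. The function name parse_pii_block is hypothetical, and the full 12-category list is deliberately passed in rather than reproduced, since only the three floor categories are named in this document.

```python
FLOOR = frozenset({"credential", "payment_card", "government_id"})


def parse_pii_block(csv_value, valid_categories):
    """Return the enforced category set: the always-on floor plus any widening."""
    requested = {name.strip() for name in csv_value.split(",") if name.strip()}
    unknown = sorted(requested - set(valid_categories))
    if unknown:
        # Abort startup instead of silently ignoring a typo.
        raise ValueError(
            f"unknown PII categories: {', '.join(unknown)}; "
            f"valid values: {', '.join(sorted(valid_categories))}"
        )
    return FLOOR | requested


# "example_category" is a placeholder, not a real taxonomy entry.
valid = FLOOR | {"example_category"}
print(sorted(parse_pii_block("example_category", valid)))
```

Raising at parse time means a misspelled category stops serve before any tool call is handled.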
A blocked call returns a Charter status="refused" envelope with
error.kind="pii_blocked". The audit row records:
- `status='refused'`
- `refusal_reason='pii_blocked'`
- `pii_categories` = the attempted set (so the audit shows what was touched, not just that something was blocked)
- `cost_class='refused'`

--pii-block with --no-audit emits a one-shot stderr warning at
startup: enforcement still happens (the agent sees the refusal
envelope), but the refused row never lands in mcp_audit.
Opting out of classification
get_metric audit rows then record pii_categories='' and any
--pii-block policy on serve has nothing to act on. Use only
when local tag inference itself is unwanted (privacy-paranoid
environments).
Integrating with existing observability stacks
Two paths, picked based on what you already run.
Path 1: tail the JSONL into a log shipper
Path 2: emit OTel spans
Install the `schemabrain[otel]` extra and set `OTEL_EXPORTER_OTLP_ENDPOINT`. Each tool call then produces one `gen_ai.execute_tool` span carrying:

- `gen_ai.system` = `"schemabrain"`
- `gen_ai.tool.name` = one of the 12 MCP tool names
- `gen_ai.tool.call.id` = audit fingerprint hex when present
- `schemabrain.session.id` = the serve process UUID
- `schemabrain.status` = Charter envelope status
- `schemabrain.error_kind` = error kind on failure
- `schemabrain.duration_ms` = wall-clock latency
- `schemabrain.result.*` = numeric / string fields from the tool’s result summary (matches, columns, paths, rows, etc.)


| Charter status | OTel status |
|---|---|
| success, empty, partial, degraded | OK |
| error, refused | ERROR (description = error_kind) |
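
The mapping in the table can be expressed as a small pure function. This sketch returns plain strings so it runs without the OTel SDK; with the SDK installed, the two outcomes would correspond to StatusCode.OK and StatusCode.ERROR from opentelemetry.trace. The function name otel_status_for is hypothetical.

```python
OK_STATUSES = frozenset({"success", "empty", "partial", "degraded"})
ERROR_STATUSES = frozenset({"error", "refused"})


def otel_status_for(charter_status, error_kind=None):
    """Map a Charter envelope status to an (otel_status, description) pair."""
    if charter_status in OK_STATUSES:
        return ("OK", None)
    if charter_status in ERROR_STATUSES:
        # The error kind becomes the span status description.
        return ("ERROR", error_kind)
    raise ValueError(f"unknown Charter status: {charter_status!r}")


print(otel_status_for("refused", "pii_blocked"))
```

Note that empty, partial, and degraded map to OK, so alerting on span status alone will not surface degraded responses; filter on the schemabrain.status attribute for those.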
Standard OTel SDK environment variables are honoured
(`OTEL_EXPORTER_OTLP_HEADERS`, `OTEL_EXPORTER_OTLP_TIMEOUT`,
`OTEL_TRACES_SAMPLER`, etc.). When the extra is installed but the
endpoint env var is unset, span emission is silently off, with zero
overhead. When the env var is set but the extra is not installed, a
one-shot stderr warning fires so the misconfiguration is visible.
The implementation never fails a tool call because of an OTel error:
exporter network failures, missing endpoints, or attribute-setting
crashes all degrade to a single stderr line and the agent gets the
tool response unchanged.
Backend-specific recipes
Langfuse (self-hosted or cloud):
Limits
- Spans are NOT linked into any parent trace your agent harness is emitting. `serve` is a separate process; OTel context propagation across stdio MCP transport isn’t standardised yet. Each tool call produces an orphan span: fine for backend dashboards, not for end-to-end agent traces.
- Tool arguments are NOT attached to spans, even after redaction. The events file (`~/.schemabrain/events.jsonl`) carries redacted args under tighter trust boundaries; span exporters can land in dashboards you don’t fully control.
- The `gen_ai.*` semantic conventions are still experimental as of the OTel spec. One attribute-rename migration is expected before GA; we’ll ship the new names in a minor release when the conventions stabilise.