PII policy
The PII firewall has two halves: a block set (which categories the firewall refuses at query time) and per-column overrides (operator assertions that the heuristic classifier got a specific column wrong). Both live in one YAML file in your project: ./schemabrain/pii_policy.yaml.
This page walks through the layout, the CLI, and the common patterns.
Where files live
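A SchemaBrain project on disk is a single directory (./schemabrain in the examples on this page) that holds the entity, metric, and join YAML files alongside pii_policy.yaml.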
schemabrain init --emit-yaml-dir ./schemabrain writes the full
directory layout, including a starter pii_policy.yaml seeded with
the catastrophic-leak defaults. schemabrain apply ./schemabrain
reads every YAML in the directory (entities, metrics, joins, AND
pii_policy.yaml) back into the store.
The policy file
A minimal pii_policy.yaml uses the fields described below.
block
A list of PIICategory values. When a column with
any of these categories appears in a get_metric plan, the firewall
returns a pii_blocked refusal instead of executing the query. The
describe_* family always blocks the catastrophic-leak floor
(credential, payment_card, government_id) regardless of block
— that’s the minimum-decency line, not a policy setting.
block: [] (an explicit empty list) reduces the operator policy to
the always-on catastrophic-leak floor (credential, payment_card,
government_id) — it does not disable enforcement. get_metric and
every describe_* gate union that floor into the effective block
regardless of block, so a tagged credential / payment-card /
government-id column is still refused; you cannot drop below the floor
via YAML. An empty block simply means “add nothing beyond the floor”.
PII tags still flow into the audit row either way. Use it once you’ve
classified your data and confirmed analytics-only access is appropriate
for every non-floor category.
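The floor behaviour described above amounts to a set union. The sketch below is illustrative Python under that reading, not SchemaBrain's implementation; the names CATASTROPHIC_FLOOR, effective_block, and is_refused are hypothetical.

```python
# Illustrative only: models the documented union of the always-on floor
# and the operator's `block` list. All names here are hypothetical.
CATASTROPHIC_FLOOR = frozenset({"credential", "payment_card", "government_id"})


def effective_block(operator_block):
    """Return the categories refused at query time for a given `block` list."""
    return CATASTROPHIC_FLOOR | frozenset(operator_block)


def is_refused(column_categories, operator_block):
    """True when any category tagged on the column is in the effective block."""
    return bool(frozenset(column_categories) & effective_block(operator_block))
```

With an empty list, effective_block([]) is exactly the floor, which is why block: [] cannot expose a credential, payment-card, or government-id column.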
column_overrides
A mapping from schema.table.column to a sensitivity +
categories pair. Each override replaces the heuristic classifier’s
output for that one column.
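As a rough model of "replaces", the sketch below resolves a column's tag from two layers and never merges them. It is an assumption-laden illustration: the dict layout, the table name public.payments, the sensitivity strings "high" and "low", and the "heuristic" origin label are all hypothetical; only the schema.table.column key shape and the origin='operator' label come from this page.

```python
# Illustrative only: an operator override replaces the heuristic tag wholesale.
# The dict layout, table name, and sensitivity strings are assumptions.
heuristic_tags = {
    "public.payments.card_number_last4": {
        "sensitivity": "high",
        "categories": ["payment_card"],
    },
}
operator_overrides = {
    "public.payments.card_number_last4": {
        "sensitivity": "low",
        "categories": [],
    },
}


def resolve_tag(column, heuristic, overrides):
    """Return (tag, origin); an override wins and is not merged with the heuristic."""
    if column in overrides:
        return overrides[column], "operator"
    if column in heuristic:
        return heuristic[column], "heuristic"
    return None, None
```

Because the override is not merged with the heuristic output, the payment_card category disappears from the resolved tag rather than being kept alongside the operator's assertion.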
The most common pattern is downgrading an over-tagged column:
the heuristic flags card_number_last4 as payment_card from the
column name, but PCI-DSS explicitly allows storing the last four
digits. Without an override, every analytics query that touches
this column would be refused.
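The reverse pattern is escalation: a column the heuristic does not
recognise is treated as public by default. If you know
public.app_state.session_token carries credential data, assert it with
an override.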
description
Optional free-text annotation. Carried through git diffs so reviewers
see why a policy change landed without having to read the PR body.
Recommended for any non-obvious downgrade — future-you and your
security reviewer will both appreciate the breadcrumb.
The CLI
schemabrain policy show
Print the current policy: active block set, per-column tag listing
with provenance.
A * marker flags operator-asserted overrides. The verdict column
distinguishes three outcomes: allowed (neither your policy nor the
floor blocks it), floor-blocked (not in your active policy, but blocked
by the always-on catastrophic floor, which is enforced at every read
gate, describe_* and get_metric, and so cannot be disabled), and
blocked (the category is in your active policy block).
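The three verdicts can be read as a small decision function. The sketch below is a hypothetical model of that rule, not the CLI's code; in particular, checking the policy block before the floor for a column that carries several categories is an assumption.

```python
# Illustrative only: one way to derive the three verdicts described above.
# FLOOR and verdict are hypothetical names.
FLOOR = frozenset({"credential", "payment_card", "government_id"})


def verdict(column_categories, policy_block):
    """Classify a column as 'blocked', 'floor-blocked', or 'allowed'."""
    categories = frozenset(column_categories)
    if categories & frozenset(policy_block):
        return "blocked"  # a category is in the operator's active block
    if categories & FLOOR:
        return "floor-blocked"  # not in the policy, but caught by the floor
    return "allowed"
```

For example, a credential column under an empty policy is floor-blocked, and the same column becomes blocked once credential is listed in block.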
schemabrain policy apply <yaml>
Load a pii_policy.yaml into the store. Writes the
column_overrides as origin='operator' rows; the block: field
stays in the YAML (read by serve at startup).
schemabrain policy tag override
Apply one column override without editing YAML. Useful for quick
experiments; for permanent state, prefer the YAML round-trip so the
override is git-tracked.
schemabrain policy tag clear
Remove a specific operator override. The heuristic row underneath is
untouched — next schemabrain index will re-classify the column.
schemabrain policy tag list
Provenance-filterable listing of every PII tag for the source.
The serve integration
schemabrain serve reads ./schemabrain/pii_policy.yaml at startup
when --pii-block is omitted. If --pii-block is passed, it always wins
over the block: field in the YAML. --policy-path PATH lets you point
at a different file (e.g. a per-environment policy in CI).
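The precedence between the two flags and the default file can be sketched as a small resolver. This is an illustration of the rule as stated here, not the serve startup code: resolve_block_source is a hypothetical name, the flag value is treated as an opaque string because its format is not specified on this page, and ./policies/ci.yaml in the usage note is a made-up path.

```python
# Illustrative only: where the active block comes from at serve startup.
# resolve_block_source is a hypothetical helper; flag values stay opaque.
DEFAULT_POLICY_PATH = "./schemabrain/pii_policy.yaml"


def resolve_block_source(pii_block=None, policy_path=None):
    """Return ('flag', value) when --pii-block is given, else ('yaml', path)."""
    if pii_block is not None:
        return ("flag", pii_block)  # --pii-block always wins
    return ("yaml", policy_path or DEFAULT_POLICY_PATH)
```

Calling resolve_block_source(policy_path="./policies/ci.yaml") yields the CI file, while passing pii_block as well makes the YAML path irrelevant for the block set.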
Common patterns
Downgrading a false positive
The heuristic flags any column whose name contains card,
account_number, etc. as payment_card. If the column genuinely
isn’t payment data (e.g. gift_card_design_id), downgrade it with a
column_overrides entry.
Locking down a custom credential column
If your schema carries credential data in a column the heuristic doesn’t recognise (e.g. meta.tenant_api_token on a generic table),
escalate it with a column_overrides entry that asserts the credential category.
Dev databases with synthetic data
There is no switch that turns enforcement off entirely: the catastrophic-leak floor (credential, payment_card,
government_id) is always-on by contract, even with block: []. An
empty block only means “add no operator-policy categories beyond the
floor” — a floor-tagged column is still refused, and per-column tags
still flow into audit rows so you can see what was touched.
If a specific column is genuinely synthetic and you want it queryable,
reclassify just that column with a column_overrides
entry rather than expecting an empty block to expose it.
Per-environment policies
Keep one YAML file per environment and select it with --policy-path.
PII categories
The 12 categories the heuristic + override layer can express:
| Category | Examples |
|---|---|
| contact | email, phone, address |
| financial | account balance, transaction amount |
| payment_card | full PAN, CVV, expiry |
| health | diagnosis, medication, lab result |
| genetic | DNA sequence, genetic test result |
| biometric | fingerprint hash, face embedding |
| behavioral | clickstream, page visit |
| online_identifier | IP, user agent, device ID |
| credential | password hash, session token, API key |
| government_id | SSN, tax ID, passport number |
| location | GPS coordinates, IP-derived geo |
| demographic_protected | race, religion, sexual orientation |
The catastrophic-leak floor is {credential, payment_card, government_id}.
These are always refused at every read gate (describe_* and get_metric)
regardless of block:. The operator can downgrade individual columns via
column_overrides but can’t turn off the floor.
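For readers who script against the policy file, the category list can be modelled as an enum so that a typo in block fails loudly. The sketch below is an assumption: the page names PIICategory but does not show its definition, and parse_block is a hypothetical helper.

```python
# Illustrative only: the 12 categories as an enum, plus a strict parser.
# This mirrors the table above; it is not SchemaBrain's own PIICategory.
from enum import Enum


class PIICategory(str, Enum):
    CONTACT = "contact"
    FINANCIAL = "financial"
    PAYMENT_CARD = "payment_card"
    HEALTH = "health"
    GENETIC = "genetic"
    BIOMETRIC = "biometric"
    BEHAVIORAL = "behavioral"
    ONLINE_IDENTIFIER = "online_identifier"
    CREDENTIAL = "credential"
    GOVERNMENT_ID = "government_id"
    LOCATION = "location"
    DEMOGRAPHIC_PROTECTED = "demographic_protected"


CATASTROPHIC_FLOOR = frozenset(
    {PIICategory.CREDENTIAL, PIICategory.PAYMENT_CARD, PIICategory.GOVERNMENT_ID}
)


def parse_block(values):
    """Convert raw strings from a `block` list; raises ValueError on unknown names."""
    return frozenset(PIICategory(value) for value in values)
```

A misspelling such as "payment-card" raises ValueError here instead of silently producing a block entry that matches nothing.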
Workflow recap
- Run schemabrain init --emit-yaml-dir ./schemabrain. The wizard indexes your database and emits the starter pii_policy.yaml alongside the entity/metric/join directories.
- Edit ./schemabrain/pii_policy.yaml in your editor: adjust block, add column_overrides.
- Run schemabrain policy apply ./schemabrain/pii_policy.yaml to persist overrides to the store. Or use schemabrain apply ./schemabrain to apply everything in one pass.
- Restart schemabrain serve so that it picks up the new block from the YAML.
- Commit the YAML to git. Security review happens in PR.
Related
- Threat model: how the block set + catastrophic floor compose with the get_metric and describe_* enforcement paths.
- Setup: full project initialisation walkthrough.