How Canary works

Canary watches every message and file operation in your Claude Code session and counts the PII it sees. This page explains how the detection pipeline works.

The detection pipeline

When you send a message in Claude Code, or Claude reads or writes a file as part of an agentic task, Canary processes the content through two stages:

Regex stage — fast, deterministic pattern matchers with checksum validators run inline. They catch structured PII (credit cards, SSNs, IBANs, AWS keys, crypto addresses, etc.) with high precision because the validators reject anything that doesn’t satisfy the appropriate checksum.
Semantic stage — Claude itself scans the same content for unstructured PII (names, addresses, medical records, legal documents, trade secrets, etc.). This runs asynchronously so your workflow isn’t blocked.

Each detection is redacted, categorised, timestamped, and appended to a local findings file.

message → regex stage  →  validated matches  →  redact  →  ~/.sonomos/leaks.jsonl
                          ↓
                       semantic stage (async)
                          ↓
                       claude self-scan  →  redact  →  ~/.sonomos/leaks.jsonl

Why both stages?

Neither stage on its own is sufficient:

Regex without checksums would produce far too many false positives. Canary’s regexes are paired with validators (Luhn for credit cards, MOD-97 for IBANs, EIP-55 for Ethereum addresses, MOD-11 for VINs, Base58Check for Bitcoin), so what gets recorded is high-confidence.
Regex alone can’t see a patient’s name in a stack trace, the gist of a medical record, or the structure of a legal contract. Those need semantic understanding — which Claude already has, so Canary uses it.

The combination is the trick: deterministic precision on structured PII, plus model-grade understanding on everything else.
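
To make the regex-plus-validator pairing concrete, here is a minimal Python sketch of one such pair: a candidate pattern for card numbers followed by a Luhn check. The pattern and the function names are illustrative assumptions, not Canary's actual rules.

```python
import re

# Hypothetical candidate pattern: 13 to 19 digits, optionally separated
# by single spaces or hyphens. Canary's real patterns are not shown here.
CARD_CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if len(digits) < 2:
        return False
    total = 0
    # Walk right to left, doubling every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def find_card_numbers(text: str) -> list[str]:
    """Keep only the regex candidates that also pass the checksum."""
    return [
        match.group(0)
        for match in CARD_CANDIDATE.finditer(text)
        if luhn_valid(match.group(0))
    ]
```

A 16-digit order number matches the pattern but is dropped by the checksum, which is the false-positive reduction described above.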

Asynchronous semantic scanning

The semantic stage runs in the background. Practically, that means:

Your prompt is not delayed waiting for the model to self-scan.
Detection results appear in ~/.sonomos/leaks.jsonl shortly after each message, not necessarily at the exact moment you send.
The counter in your status line updates within a few seconds of a message containing PII being processed.

If you’re stress-testing detection, wait a few seconds between sending the prompt and running /canary:leaked stats so the asynchronous semantic results have time to land.
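
The timing behaviour can be illustrated with a small asyncio sketch. It assumes a stand-in semantic scanner (a short sleep plus a keyword check) in place of the model self-scan, and every name in it is hypothetical. It only shows why regex findings are visible immediately while semantic findings arrive slightly later.

```python
import asyncio
import re

# Hypothetical pattern used only to give the inline stage something to match.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def regex_stage(text: str) -> list[dict]:
    """Inline stage: deterministic matching, returns immediately."""
    return [{"category": "email", "stage": "regex"} for _ in EMAIL.finditer(text)]


async def semantic_stage(text: str, findings: list[dict]) -> None:
    """Stand-in for the model self-scan; the sleep simulates its latency."""
    await asyncio.sleep(0.05)
    if "patient" in text.lower():
        findings.append({"category": "medical_record", "stage": "semantic"})


def handle_message(text: str, findings: list[dict], background: set) -> None:
    """Record regex findings now and schedule the semantic scan for later."""
    findings.extend(regex_stage(text))
    task = asyncio.create_task(semantic_stage(text, findings))
    background.add(task)  # keep a reference so the task is not garbage collected
    task.add_done_callback(background.discard)


async def demo() -> tuple[int, int]:
    findings: list[dict] = []
    background: set = set()
    handle_message("Patient jane@example.com reported chest pain", findings, background)
    seen_immediately = len(findings)   # regex results only
    await asyncio.gather(*background)  # wait for the background scan to finish
    return seen_immediately, len(findings)
```

Running `asyncio.run(demo())` returns the finding count before and after the background scan completes, which mirrors the advice to pause before checking stats.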

What Canary stores

Every detection is recorded as a redacted, structured event. The on-disk format is JSON Lines at ~/.sonomos/leaks.jsonl, and each event contains:

The category of the match (e.g. aws_access_key, medical_record, email).
A redacted value: only the first two and last two characters of the original are kept, and the middle is replaced with a bullet mask. For example, AKIAIOSFODNN7EXAMPLE becomes AK••••LE.
A timestamp.
A source hint (which message or file the match came from), with no original content retained.

The redacted form is enough to recognise repeated leaks of the same identifier without ever storing the identifier itself.
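
As a rough sketch of the event described above, the following Python builds one redacted JSON Lines record. The field names (`category`, `redacted`, `timestamp`, `source`), the fixed four-bullet mask, and the handling of very short values are assumptions made for illustration; the actual schema may differ.

```python
import json
from datetime import datetime, timezone

MASK = "••••"  # assumed fixed-width mask, chosen to match the example above


def redact(value: str) -> str:
    """Keep the first two and last two characters, mask everything between."""
    if len(value) <= 4:
        # Assumption: values too short to redact safely are masked entirely.
        return MASK
    return f"{value[:2]}{MASK}{value[-2:]}"


def make_event(category: str, value: str, source: str) -> str:
    """Return one JSON Lines record; the original value is never included."""
    event = {
        "category": category,
        "redacted": redact(value),
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "source": source,
    }
    # json.dumps escapes quotes and control characters in every field.
    return json.dumps(event, ensure_ascii=False)
```

For instance, `redact("AKIAIOSFODNN7EXAMPLE")` yields `AK••••LE`, so two leaks of the same key produce the same redacted form without the key itself being written to disk.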

Local-only, by design

Canary is built on the same local-first principle as the rest of Sonomos:

No network requests. Canary never phones home. No telemetry, no analytics, no accounts.
Owner-only file permissions. ~/.sonomos/ is created with mode 0700; files within are 0600.
Injection-safe JSON. Output is constructed via jq rather than string concatenation, so values cannot break out of the JSON.
Path-traversal hardened. File paths are validated before any read or write, so a maliciously-crafted finding cannot escape the storage directory.

If you uninstall the plugin, the data stays on your machine until you delete ~/.sonomos/.
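
The storage guarantees above can be approximated in a few lines of Python. This is a sketch under stated assumptions rather than the plugin's code: the plugin builds its JSON with jq, whereas this example uses `json.dumps` as the equivalent serializer-based approach, and the function names are hypothetical. It assumes a POSIX system and Python 3.9 or later.

```python
import json
import os
from pathlib import Path


def open_store(root: Path) -> Path:
    """Create the storage directory with owner-only access and return it resolved."""
    root.mkdir(mode=0o700, parents=True, exist_ok=True)
    os.chmod(root, 0o700)  # enforce the mode even if the umask or an old dir differs
    return root.resolve()


def safe_path(root: Path, name: str) -> Path:
    """Resolve `name` inside `root`, rejecting anything that escapes it."""
    base = root.resolve()
    candidate = (base / name).resolve()
    if not candidate.is_relative_to(base):
        raise ValueError(f"path escapes storage directory: {name!r}")
    return candidate


def append_event(root: Path, name: str, event: dict) -> None:
    """Append one JSON line to a file that only the owner can read or write."""
    path = safe_path(root, name)
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o600)
    with os.fdopen(fd, "a", encoding="utf-8") as handle:
        # Serializing with json.dumps means field values cannot break the JSON.
        handle.write(json.dumps(event, ensure_ascii=False) + "\n")
```

A name such as `../escape.jsonl` resolves outside the directory and raises `ValueError` before any file is opened.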

“The number only goes up”

The Canary status line counter is intentionally monotonic. It doesn’t decrement when you delete history or when findings age out, because the whole point is to give you an honest cumulative picture of how much PII you’ve put in front of Claude.

You can reset the findings history at any time with /canary:leaked reset, after which the displayed counter starts fresh. That explicit reset is the only way the number goes down; in normal use, the goal is for it to be a slightly uncomfortable, always-visible reminder.

Performance

The regex stage is effectively instantaneous (low microseconds per pattern, single-digit milliseconds for a long message).
The semantic stage uses Claude’s existing context, so it doesn’t add new model calls beyond what your session is already doing.
Disk writes are append-only to a JSONL file — there’s no database, no fsync churn.

You should not notice any slowdown in normal Claude Code sessions.

What Canary does not do

To be precise about the scope:

It doesn’t block or rewrite prompts. Cloak-style masking is a job for the Sonomos browser extension (today, on the web) and Sonomos Desktop (coming, system-wide). Canary is observability.
It doesn’t audit Claude’s responses. Canary records what you sent. Anthropic’s own retention and safety policies govern what Claude does with that content.
It doesn’t ship a SIEM integration. Findings are local. You can pipe ~/.sonomos/leaks.jsonl into whatever you’d like — see Commands & dashboard for the JSON-export CLI.
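
Because the findings file is plain JSON Lines, downstream processing can be very small. The sketch below tallies findings per category; it assumes each record is a JSON object with a `category` field, which is a guess at the schema rather than a documented contract.

```python
import json
from collections import Counter
from typing import Iterable


def count_by_category(lines: Iterable[str]) -> Counter:
    """Tally findings per category from an iterable of JSON Lines records."""
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip a line that is still being written or is malformed
        if isinstance(event, dict):
            counts[event.get("category", "unknown")] += 1
    return counts


# Example usage against the real file (path taken from this page):
#   from pathlib import Path
#   with open(Path.home() / ".sonomos" / "leaks.jsonl", encoding="utf-8") as f:
#       print(count_by_category(f).most_common())
```

The same loop is a reasonable starting point for forwarding records to another tool, since each line can be handled independently.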

Next steps

Commands & dashboard — the slash commands and CLI tools you’ll use day-to-day.
Privacy & security — a closer look at the threat model and what guarantees Canary makes.