Dagger — Detection Engine
Dagger is Sonomos’s PII detection engine. It combines deterministic pattern analysis with on-device AI to identify sensitive data across 62+ categories, and it runs entirely locally in your browser.
How detection works
When text appears on a page — whether typed, pasted, or loaded — Dagger processes it through multiple detection layers:
Pattern analysis
Deterministic matchers target structured PII formats. They are fast and precise, and they include built-in validation (such as checksum verification for financial identifiers). Examples include:
- Social Security numbers
- Credit card numbers
- Email addresses
- Phone numbers (US and international)
- IP addresses
- Dates of birth
- Medical record numbers
- Driver’s license numbers
- Passport numbers
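
As an illustration of pattern matching combined with checksum validation, the sketch below finds candidate credit card numbers with a regular expression and keeps only those that pass the Luhn checksum. It is a minimal example, not Dagger's actual implementation: the pattern `CARD_CANDIDATE` and the function names `luhn_valid` and `find_card_numbers` are hypothetical, and Python is used only for brevity.

```python
import re

# Hypothetical candidate pattern: 13 to 19 digits, optionally separated
# by single spaces or hyphens. Real detectors are likely stricter.
CARD_CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if len(digits) < 13:
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def find_card_numbers(text: str) -> list:
    """Return candidates in `text` that also pass the checksum."""
    return [m.group() for m in CARD_CANDIDATE.finditer(text)
            if luhn_valid(m.group())]
```

The checksum step is what keeps an arbitrary 16-digit order number from being reported as a card number: it matches the pattern but fails validation.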
AI-powered recognition
On-device small language models identify unstructured PII that pattern analysis alone can’t catch:
- Person names in free text
- Organization names
- Location references
- Context-dependent identifiers
All AI models run locally on your device — no data is sent to external servers for analysis.
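
One way to picture the two layers working together is a pipeline that runs each layer over the same text and merges the results. The sketch below is an assumption about how such a pipeline could be structured, not a description of Dagger's internals: the `Match` type, the `pattern_layer` and `stub_model_layer` functions, and the merge rule (earliest match wins, longer match wins on ties) are all hypothetical. The model layer is replaced by a fixed name lookup so the example stays self-contained.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Match:
    start: int      # inclusive offset into the text
    end: int        # exclusive offset into the text
    category: str
    layer: str      # which layer produced the match


EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
KNOWN_NAMES = ("Ada Lovelace",)  # stand-in for what a model might recognize


def pattern_layer(text: str) -> List[Match]:
    """Deterministic layer: regular-expression matches only."""
    return [Match(m.start(), m.end(), "email", "pattern")
            for m in EMAIL.finditer(text)]


def stub_model_layer(text: str) -> List[Match]:
    """Placeholder for an on-device model; uses a fixed lookup instead."""
    found = []
    for name in KNOWN_NAMES:
        pos = text.find(name)
        while pos != -1:
            found.append(Match(pos, pos + len(name), "person_name", "model"))
            pos = text.find(name, pos + 1)
    return found


def merge(matches: List[Match]) -> List[Match]:
    """Drop overlaps: earliest start wins, longer span wins on ties."""
    kept: List[Match] = []
    ordered = sorted(matches, key=lambda m: (m.start, -(m.end - m.start)))
    for m in ordered:
        if not kept or m.start >= kept[-1].end:
            kept.append(m)
    return kept


def detect(text: str, layers: List[Callable[[str], List[Match]]]) -> List[Match]:
    """Run every layer over the text and merge the combined results."""
    found: List[Match] = []
    for layer in layers:
        found.extend(layer(text))
    return merge(found)
```

Calling `detect(text, [pattern_layer, stub_model_layer])` returns a single ordered list of non-overlapping matches, regardless of which layer produced each one.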
Detection categories
Dagger currently includes 62 detectors with comprehensive test coverage. Each detector assigns its matches to one of three severity tiers:
| Severity | Color | Examples |
|---|---|---|
| High | 🔴 Red | SSN, credit card, passport, medical record |
| Medium | 🟡 Amber | Full name, date of birth, address |
| Low | 🟢 Green | Email, phone number, IP address |
Severity drives the risk widget color and determines whether Cloak auto-masks or prompts the user.
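
The table above can be expressed as a simple lookup. The sketch below is illustrative only: the category keys, the `widget_color` and `cloak_action` functions, and two behaviors in particular are assumptions that the text does not confirm, namely that the widget shows the color of the highest severity present, and that High severity auto-masks while lower tiers prompt.

```python
from enum import IntEnum
from typing import Iterable, Optional


class Severity(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Category keys are hypothetical; tiers follow the examples in the table.
CATEGORY_SEVERITY = {
    "ssn": Severity.HIGH,
    "credit_card": Severity.HIGH,
    "passport": Severity.HIGH,
    "medical_record": Severity.HIGH,
    "full_name": Severity.MEDIUM,
    "date_of_birth": Severity.MEDIUM,
    "address": Severity.MEDIUM,
    "email": Severity.LOW,
    "phone_number": Severity.LOW,
    "ip_address": Severity.LOW,
}

WIDGET_COLOR = {
    Severity.HIGH: "red",
    Severity.MEDIUM: "amber",
    Severity.LOW: "green",
}


def widget_color(categories: Iterable[str]) -> Optional[str]:
    """Color for the highest severity among known categories, or None.

    ASSUMPTION: the widget reflects the most severe match on the page.
    """
    severities = [CATEGORY_SEVERITY[c] for c in categories
                  if c in CATEGORY_SEVERITY]
    if not severities:
        return None
    return WIDGET_COLOR[max(severities)]


def cloak_action(severity: Severity) -> str:
    """ASSUMPTION: High auto-masks; Medium and Low prompt the user."""
    return "auto_mask" if severity is Severity.HIGH else "prompt"
```

For example, a page containing both an email address and an SSN would yield `widget_color(["email", "ssn"]) == "red"` under these assumptions.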
Image and document detection
For PII embedded in images (screenshots, scanned documents), Dagger uses optical character recognition to extract text before running it through the same detection pipeline. PDF content is also parsed and analyzed automatically.
Known limitations
- ZIP code detection: Certain address formats still produce edge cases involving street numbers and ZIP codes. An improved version is in development.
- Short-token false positives: Very short common words may occasionally be flagged. These are excluded from Cloak masking automatically.
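
The short-token exclusion could work along the lines of the filter below. This is a sketch under stated assumptions, not Dagger's actual rule: the word list `COMMON_SHORT_WORDS`, the length threshold `MAX_SHORT_LENGTH`, and the function `should_mask` are all hypothetical.

```python
# Hypothetical stoplist of short words that can resemble names or identifiers.
COMMON_SHORT_WORDS = frozenset(
    {"a", "an", "as", "at", "be", "in", "is", "it", "may", "of", "on", "to", "will"}
)
MAX_SHORT_LENGTH = 4  # assumed threshold for what counts as "very short"


def should_mask(matched_text: str) -> bool:
    """Return False for very short common words, True for everything else."""
    token = matched_text.strip().lower()
    if len(token) <= MAX_SHORT_LENGTH and token in COMMON_SHORT_WORDS:
        return False
    return True
```

With this filter, a flagged "May" or "Will" on its own is left unmasked, while a longer match such as "Maya Patel" is still masked.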