Dagger — Detection Engine

Dagger is Sonomos’s PII detection engine. It combines deterministic pattern matching with on-device AI to identify sensitive data across 62+ categories, and all processing runs locally in your browser.

How detection works

When text appears on a page — whether typed, pasted, or loaded — Dagger processes it through multiple detection layers:

Pattern analysis

Deterministic matchers built for structured PII formats. These are fast, precise, and include built-in validation (such as checksum verification for financial identifiers). Examples include:

  • Social Security numbers
  • Credit card numbers
  • Email addresses
  • Phone numbers (US and international)
  • IP addresses
  • Dates of birth
  • Medical record numbers
  • Driver’s license numbers
  • Passport numbers
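
As an illustration of what checksum verification can look like for a financial identifier, here is a minimal sketch of a credit card matcher that pairs a regular expression with the Luhn checksum. This is not Dagger's actual implementation: the pattern `CARD_PATTERN` and the function names `luhn_valid` and `find_card_numbers` are hypothetical and chosen only for this example.

```python
import re

# Hypothetical pattern (an assumption, not Dagger's real matcher):
# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    # Walk from the rightmost digit and double every second digit.
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9  # same as summing the two digits of the product
        total += d
    return total % 10 == 0


def find_card_numbers(text: str) -> list[str]:
    """Return digit strings that match the pattern and pass the checksum."""
    hits = []
    for match in CARD_PATTERN.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits
```

The checksum step is what keeps an arbitrary 16-digit string, such as an order number, from being reported as a card number: the pattern alone proposes candidates, and validation rejects those that cannot be real.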

AI-powered recognition

On-device small language models identify unstructured PII that pattern analysis alone can’t catch:

  • Person names in free text
  • Organization names
  • Location references
  • Context-dependent identifiers

All AI models run locally on your device — no data is sent to external servers for analysis.
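
One way to picture how results from the two layers might be combined is sketched below. The `Span` type, the `merge_layers` function, and the rule that a pattern hit wins over an overlapping model hit are all assumptions made for illustration; the actual merging logic in Dagger is not described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """A detected region of text (hypothetical structure)."""
    start: int      # inclusive character offset
    end: int        # exclusive character offset
    category: str   # e.g. "ssn", "person_name"
    source: str     # "pattern" or "model"


def merge_layers(pattern_hits: list[Span], model_hits: list[Span]) -> list[Span]:
    """Combine hits from both layers, sorted by position.

    Assumption: when a model hit overlaps a pattern hit, the pattern hit
    is kept, because deterministic matchers include validation.
    """
    merged = list(pattern_hits)
    for hit in model_hits:
        overlaps = any(hit.start < p.end and p.start < hit.end for p in pattern_hits)
        if not overlaps:
            merged.append(hit)
    return sorted(merged, key=lambda s: s.start)
```

For example, a model hit that falls inside a validated SSN span would be dropped, while a person name found elsewhere in the text would be kept alongside it.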

Detection categories

Dagger currently includes 62 detectors with comprehensive test coverage. Each detector classifies matches into severity tiers:

Severity | Color    | Examples
High     | 🔴 Red   | SSN, credit card, passport, medical record
Medium   | 🟡 Amber | Full name, date of birth, address
Low      | 🟢 Green | Email, phone number, IP address

Severity drives the risk widget color and determines whether Cloak auto-masks or prompts the user.

Image and document detection

For PII embedded in images (screenshots, scanned documents), Dagger uses optical character recognition to extract text before running it through the same detection pipeline. PDF content is also parsed and analyzed automatically.

Known limitations

  • ZIP code detection: Detection can be unreliable for street numbers and ZIP codes in certain address formats. An improved version is in development.
  • Short-token false positives: Very short common words may occasionally be flagged. These are excluded from Cloak masking automatically.