
feat(redact): configurable structured-ID PII categories with checksum validators #3872

Merged: louis030195 merged 17 commits into main from claude/flamboyant-blackburn-9dbbfc on Jun 8, 2026

louis030195 commented Jun 5, 2026

Collaborator

What

Opt-in, per-category PII redaction for structured national / financial IDs and cloud/developer credentials, configurable through the existing pii_redaction_labels setting (and the enterprise MDM piiRedactionLabels lock) without widening the 13-class SpanLabel taxonomy or retraining the model. Covers 186 categories (structured IDs + cloud/dev credentials).

PII categories

Proposed Settings surface. The engine capability ships in this PR; the React toggle UI is a fast follow.

Measured (not asserted)

Two committed harnesses (examples/pii_eval.rs, examples/pii_load.rs), release build:

| Metric | Result |
| --- | --- |
| Recall (generatable instances, 144 of 186 categories) | 99.8% (57,461 / 57,600) |
| Hard negatives (adversarial: wrong-checksum, order IDs, hashes, UUIDs, substring traps) | 0 violations |
| Real-text FP (scan of 8.4 MB of crates/) | only test-fixture vectors + one genuine hardcoded Sentry DSN (a true positive); zero false matches on real code |
| Fuzz (200k adversarial inputs + huge/unicode/control edges) | 0 panics, span-integrity + determinism invariants hold |
| Sustained load | per-row cost in the low microseconds, peak RSS flat (no per-row leak) |

~94 carry a real checksum (Luhn, ISO 7064 MOD-11,10 / MOD-11-2 / MOD-97-10, mod-97, Verhoeff, EAN-13, Base58Check, bech32/bech32m, EIP-55 Keccak, country-specific weighted mod-11/10/23/26/37), each unit-tested against a publicly-traced vector. The rest are format/context-only where no public checksum exists (gated hard on context keywords), or prefix-identified credentials where the vendor prefix is self-validating.
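As an illustration of the simplest checksum in that list, here is a minimal Luhn sketch. The function name `luhn_ok` is hypothetical and not taken from the PR; the real validators also gate on brand prefix, length, or context, which this sketch leaves out.

```rust
/// Returns true when `digits` consists only of ASCII digits and passes
/// the Luhn check. Hypothetical helper, not the PR's implementation.
fn luhn_ok(digits: &str) -> bool {
    if digits.len() < 2 || !digits.bytes().all(|b| b.is_ascii_digit()) {
        return false;
    }
    let mut sum = 0u32;
    // Walk right to left; every second digit (counting from the right,
    // starting after the check digit) is doubled.
    for (i, b) in digits.bytes().rev().enumerate() {
        let mut d = u32::from(b - b'0');
        if i % 2 == 1 {
            d *= 2;
            if d > 9 {
                d -= 9; // same as summing the two digits of the product
            }
        }
        sum += d;
    }
    sum % 10 == 0
}
```

A Luhn pass alone is weak evidence (one random digit run in ten passes), which is why the description pairs weak checksums with context keywords or brand prefixes.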

The harness repeatedly earned its keep. It caught the IBAN prose-eating bug, the credit-card hash false-positive, the sin-in-business substring trap, an IPv6 :: catastrophe (35k FPs from code path separators), a wrong Taiwan letter table, a 21-char "LEI" vector, a non-Luhn-valid IHI vector, an "invalid" Portugal NIF that's actually valid, and an ambiguous FIGI Luhn parity (shipped format-only rather than ship an unverified checksum).

How

  • RedactedSpan gains an optional subtype (serde-skipped when None; wire format unchanged). TextRedactionPolicy gains a subtype allow-list + allows(label, subtype). Label-only configs behave exactly as before; secret is always redacted.
  • A RegexSet gate makes PII-free text return after one DFA pass over all the no-context patterns; context-gated detectors run only when their keyword is present (Aho-Corasick prefilter). Overlap suppression is secret-first, so a credential never loses to an overlapping non-secret span.
  • Weak-checksum / format-only detectors require a context keyword within a 48-byte window (whole-word matched, both directions).
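The policy check in the first bullet can be pictured with the sketch below. It is an assumption-laden illustration: the field names, the reduced three-variant `SpanLabel`, the `policy` constructor, and the exact precedence inside `allows` are all hypothetical; only the names `TextRedactionPolicy`, `SpanLabel`, and `allows(label, subtype)` come from the description.

```rust
use std::collections::HashSet;

// Reduced stand-in for the 13-class taxonomy (assumption).
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
enum SpanLabel {
    Secret,
    Id,
    Email,
}

// Hypothetical shape: a label allow-list plus a subtype allow-list.
struct TextRedactionPolicy {
    labels: HashSet<SpanLabel>,
    subtypes: HashSet<String>,
}

impl TextRedactionPolicy {
    /// True when a span with this label/subtype should be redacted.
    fn allows(&self, label: SpanLabel, subtype: Option<&str>) -> bool {
        // Secrets are always redacted, whatever the allow-lists say.
        if label == SpanLabel::Secret {
            return true;
        }
        // Label-only configs keep working: a listed label is enough.
        if self.labels.contains(&label) {
            return true;
        }
        // Otherwise the span needs an explicitly enabled subtype.
        subtype.map_or(false, |s| self.subtypes.contains(s))
    }
}

/// Small constructor used only to keep the usage below short.
fn policy(labels: &[SpanLabel], subtypes: &[&str]) -> TextRedactionPolicy {
    TextRedactionPolicy {
        labels: labels.iter().copied().collect(),
        subtypes: subtypes.iter().map(|s| s.to_string()).collect(),
    }
}
```

With `policy(&[SpanLabel::Email], &["iban"])`, an `Id` span tagged `iban` is redacted while an `Id` span tagged `us_ssn` is left alone, which is the per-category opt-in the description is after.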

Coverage (186)

US (SSN/EIN/ITIN/NPI/DEA/ABA/passport/MBI/HICN/EDIPI/USCIS/TSA/driver-licenses), cards/IBAN/ISIN/CUSIP/SEDOL/LEI/FIGI/MIC/VIN/IMEI/IMEISV/ICCID/EID/IMSI/MEID/MAC/IPv6/Bitcoin (legacy + bech32/Taproot)/Ethereum (EIP-55)/Litecoin/XRP, healthcare (NHS, ABHA, IHI, NDC, CPT, HCPCS, ICD-9/10), EU VAT (15 states) + national IDs across ~30 European countries, and national IDs across Asia, the Americas, the Middle East, and Africa (~50 countries).

Plus 32 cloud/developer credential formats (Google API key, GitHub fine-grained PAT, GitLab/npm/Linear/Postman/Databricks/Notion/Pulumi/Fly.io/Docker Hub/Slack app/PyPI tokens, SendGrid, Slack/Sentry webhooks + DSN, DigitalOcean, Doppler, Shopify, Stripe webhook secret, Square OAuth, age secret key, Atlassian, HashiCorp Vault, Figma, New Relic, Razorpay, PlanetScale, Supabase, PostHog, Tailscale, Flutterwave), on top of the prefixless secrets already detected (OpenAI/Stripe/GitHub/Slack/AWS/Google-OAuth/HuggingFace/JWT). All are labelled secret, so they are always redacted regardless of the per-category allow-list.

Honest limits

  • ~94 checksummed, ~92 format/context- or prefix-only. The format-only ones (some driver licenses, several countries with unpublished checksums) lean entirely on context gating; they measured 0 FPs on real source but are inherently weaker than the checksummed set.
  • No real-capture (ScreenLeak) validation yet. Recall is on validator-certified synthetic instances, FP on the repo's own source. This is the single highest-value next step.
  • The brute-force harness covers 144 of the 186 categories (those with a generatable instance). The other ~42 are format/context-only national IDs verified by per-country unit-test vectors, not the recall sweep.
  • Windows + battery not measured (developed on Mac); method documented in pii_load.rs.
  • Deferred: spaced/grouped IBAN, compressed IPv6, more crypto (Monero/Bitcoin-Cash), Mexico CURP checksum, FIGI checksum. Each documented in-code with the reason.
  • The older screenpipe-core::pii_removal regex layer still needs consolidating.

🤖 Generated with Claude Code

Louis Beaumont and others added 17 commits June 5, 2026 14:55
… validators

Lets a customer opt a single national/financial ID class in (e.g. iban,
india_aadhaar) via the existing `pii_redaction_labels` flag and the
enterprise MDM `piiRedactionLabels` lock, without widening the 13-class
SpanLabel taxonomy or retraining the model.

How:
- RedactedSpan gains an optional `subtype` (serde-skipped when None, so wire
  format is unchanged); TextRedactionPolicy gains a subtype allow-list and
  `allows(label, subtype)`. Label-only configs behave exactly as before.
- New deterministic detectors in the regex adapter, each gated on a real
  checksum (Luhn, IBAN mod-97, Spain mod-23, Brazil mod-11, Aadhaar Verhoeff)
  and, for the weak-checksum numeric ones, a required context keyword so a
  bare digit run next to no label is not flagged (issue #2340 lesson):
  us_ssn, credit_card, iban, spain_dni, brazil_cpf, india_aadhaar,
  canada_sin, imei.
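To make the "gated on a real checksum" idea concrete, here is a minimal IBAN mod-97 sketch. The name `iban_ok` is hypothetical, only the contiguous (unspaced) form is handled, and no per-country length table is applied, so this is looser than a production validator.

```rust
/// IBAN mod-97 check on a contiguous, uppercase IBAN string.
/// Hypothetical helper; no country-length table is consulted.
fn iban_ok(iban: &str) -> bool {
    let b = iban.as_bytes();
    if b.len() < 15 || b.len() > 34 {
        return false;
    }
    if !b.iter().all(|c| c.is_ascii_digit() || c.is_ascii_uppercase()) {
        return false;
    }
    // Two country letters followed by two check digits.
    let head_ok = b[0].is_ascii_uppercase()
        && b[1].is_ascii_uppercase()
        && b[2].is_ascii_digit()
        && b[3].is_ascii_digit();
    if !head_ok {
        return false;
    }
    // Move the first four characters to the end, map A..Z to 10..35,
    // and reduce the resulting decimal number modulo 97 incrementally.
    let mut rem = 0u32;
    for &c in b[4..].iter().chain(b[..4].iter()) {
        if c.is_ascii_digit() {
            rem = (rem * 10 + u32::from(c - b'0')) % 97;
        } else {
            rem = (rem * 100 + u32::from(c - b'A') + 10) % 97;
        }
    }
    rem == 1
}
```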

This is the architecture for closing the Google-DLP category gap: ~80% of it
is regex + checksum that folds under the coarse `id`/`secret` labels with a
sub-type tag, no ML. The semantic classes (names, orgs, GDPR Art.9) stay the
model's job.

Verified: 89 crate tests pass (checksum unit vectors + FP/context cases),
clippy clean, screenpipe-engine compiles.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…t bench

- Add a RegexSet "any match?" gate so PII-free text (the common case)
  returns after one DFA pass instead of ~22 per-pattern find_iter scans.
  Measured 1.3x on an identical PII-free corpus (2927 -> 2169 ns/call in
  release).
- Fix the per-row allocation introduced earlier: context lookups now
  lowercase only the <=48-byte preceding window, and only on a
  context-gated match, instead of copying every input string. Restores
  the module's documented allocation-light hot path.
- Add a throughput bench (pii-free + mixed + gate-vs-no-gate A/B) as a
  regression guard. Release: pii-free ~2.2us/call (~91 MB/s), mixed
  ~2.9us/call. Runs in the background reconciliation worker, off the
  capture hot path.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…; fix 3 bugs the eval found

Expands the structured-ID coverage from 8 to 23 subtypes, and adds a
measurement harness (examples/pii_eval.rs) that generates validator-
certified instances, runs a hard-negative suite, and scans real source
for false positives.

New checksum validators (each unit-tested against the public traced
vector): credit_card (brand IIN + length + Luhn), ISIN, CUSIP, SEDOL,
VIN (NHTSA transliteration mod-11), Spain NIE, US NPI (Luhn+80840),
US DEA, Netherlands BSN (elfproef), US ABA routing, Australia TFN.
Plus format/context-only: UK NINO, India PAN, SWIFT/BIC, US EIN,
MAC address.

The eval immediately caught three bugs unit tests had missed:
- IBAN regex greedily ate trailing prose ("DE89... on file") and failed
  validation → recall was 63.8%. Now contiguous-only (grouped form
  deferred; needs a country-length table). Recall → 100%.
- credit_card flagged a 19-digit hash that passed Luhn by chance. Now
  gated on brand IIN prefix + brand length, not Luhn alone.
- context matching used naive substring ("sin" matched inside
  "business") and only looked backwards. Now whole-word + bidirectional.

Measured (release): recall 99.9% (9188/9200) across 23 categories;
0 violations on 19 hard-negative lines; real-text scan over 8.4 MB of
crates/ shows only test-fixture PII, no genuine false positives.
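The third fix above (whole-word, bidirectional context matching) could look roughly like this. It is a sketch under assumptions: `has_context` and `contains_word` are hypothetical names, keywords are single lowercase words, and the 48-byte window is taken from the PR description rather than from the code.

```rust
const WINDOW: usize = 48; // bytes on each side, per the PR description

/// True when `word` appears as a whole alphanumeric token in `hay`.
fn contains_word(hay: &str, word: &str) -> bool {
    hay.split(|c: char| !c.is_alphanumeric()).any(|tok| tok == word)
}

/// Looks for `keyword` within WINDOW bytes before `start` or after `end`.
/// `start` and `end` must be char boundaries of `text` (a match span).
fn has_context(text: &str, start: usize, end: usize, keyword: &str) -> bool {
    // Clamp the window to the string, then nudge onto char boundaries
    // so slicing cannot panic on multi-byte text.
    let mut lo = start.saturating_sub(WINDOW);
    while !text.is_char_boundary(lo) {
        lo += 1;
    }
    let mut hi = (end + WINDOW).min(text.len());
    while !text.is_char_boundary(hi) {
        hi -= 1;
    }
    // Only the two small windows are lowercased, never the whole input.
    let before = text[lo..start].to_lowercase();
    let after = text[end..hi].to_lowercase();
    contains_word(&before, keyword) || contains_word(&after, keyword)
}
```

With this shape, `"our business id 046 454 286"` no longer counts as SIN context, while `"046 454 286 is my SIN"` does because the keyword follows the match.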

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Fuzz (fuzz_invariants_hold): 200k xorshift-seeded adversarial inputs
(unicode, control chars, ID-like garbage, multi-KB strings) plus explicit
edge cases through redact_one and all 14 validators. Asserts no panics,
spans within bounds + on char boundaries + sorted + non-overlapping +
text-accurate, and determinism. Reproducible via the fixed seed.
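A scaled-down sketch of those invariants follows. Everything here is hypothetical scaffolding: `Span`, `digit_runs` (a toy detector standing in for redact_one), and `fuzz` are illustration-only names, and the character pool is far smaller than a real adversarial corpus.

```rust
#[derive(Debug, PartialEq)]
struct Span {
    start: usize,
    end: usize,
    text: String,
}

/// Span integrity: non-empty, in bounds, on char boundaries, sorted,
/// non-overlapping, and text-accurate.
fn spans_hold(input: &str, spans: &[Span]) -> bool {
    let mut prev_end = 0usize;
    for s in spans {
        if s.start >= s.end || s.end > input.len() {
            return false;
        }
        if !input.is_char_boundary(s.start) || !input.is_char_boundary(s.end) {
            return false;
        }
        if s.start < prev_end || &input[s.start..s.end] != s.text.as_str() {
            return false;
        }
        prev_end = s.end;
    }
    true
}

/// xorshift64 step; the seed must be non-zero.
fn xorshift64(state: &mut u64) -> u64 {
    let mut x = *state;
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    *state = x;
    x
}

/// Toy detector: runs of three or more ASCII digits.
fn digit_runs(input: &str) -> Vec<Span> {
    let bytes = input.as_bytes();
    let (mut out, mut i) = (Vec::new(), 0);
    while i < bytes.len() {
        let start = i;
        while i < bytes.len() && bytes[i].is_ascii_digit() {
            i += 1;
        }
        if i - start >= 3 {
            out.push(Span { start, end: i, text: input[start..i].to_string() });
        }
        if i == start {
            i += 1; // not a digit: skip one byte
        }
    }
    out
}

/// Fixed-seed fuzz loop: checks integrity and determinism per input.
fn fuzz(iterations: u32, seed: u64) -> bool {
    const POOL: [char; 8] = ['0', '7', '9', 'a', ' ', 'é', '日', '\u{0}'];
    let mut state = seed;
    for _ in 0..iterations {
        let len = (xorshift64(&mut state) % 64) as usize;
        let input: String = (0..len)
            .map(|_| POOL[(xorshift64(&mut state) % POOL.len() as u64) as usize])
            .collect();
        let first = digit_runs(&input);
        if !spans_hold(&input, &first) || first != digit_runs(&input) {
            return false;
        }
    }
    true
}
```

Because the generator is seeded, any failing input can be regenerated exactly, which is what makes a fuzz failure actionable.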

Load profiler (examples/pii_load.rs): processes a target volume through
redact_one for steady-state numbers under `/usr/bin/time -l`.

Measured on Mac (release, 2 GB / 37M rows):
  - 206k rows/s (11.1 MB/s; rows are short so rows/s is the worker metric)
  - 146s user + 2.9s sys, single-threaded
  - peak RSS 19.3 MB, FLAT across 37M calls (no per-row leak)

This is the background reconciliation worker's per-row cost, off the
capture hot path. Windows + battery numbers must be gathered on target
hardware (powermetrics / Windows ETW) — not measurable from here.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Adds Germany Steuer-ID (ISO 7064 MOD 11,10), China resident ID (ISO 7064
MOD 11-2), Poland PESEL, Sweden personnummer, South Africa ID, Turkey
TCKN, Finland HETU (mod-31), France NIR (mod-97 + Corsica), Belgium
national number (mod-97), Norway fødselsnummer (two mod-11), Italy Codice
Fiscale (mod-26), Australia Medicare, UK UTR, South Korea RRN, and Mexico
CURP (format-only).
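For readers unfamiliar with ISO 7064 MOD 11,10, the scheme named above for the Germany Steuer-ID, here is a minimal sketch of the check-digit arithmetic. `mod11_10_ok` is a hypothetical name; the additional Steuer-ID rules about repeated digits are deliberately not modelled.

```rust
/// ISO 7064 MOD 11,10 check over an all-digit string whose last digit
/// is the check digit. Hypothetical helper; digit-frequency rules that
/// some national IDs add on top are not enforced here.
fn mod11_10_ok(s: &str) -> bool {
    let b = s.as_bytes();
    if b.len() < 2 || !b.iter().all(|c| c.is_ascii_digit()) {
        return false;
    }
    let (body, check) = b.split_at(b.len() - 1);
    let mut product = 10u32;
    for &c in body {
        // Add the digit modulo 10, treating 0 as 10 ...
        let mut sum = (u32::from(c - b'0') + product) % 10;
        if sum == 0 {
            sum = 10;
        }
        // ... then double modulo 11.
        product = (sum * 2) % 11;
    }
    // A computed check value of 10 is written as 0.
    let expected = (11 - product) % 10;
    u32::from(check[0] - b'0') == expected
}
```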

8 of 10 checksum algorithms were anchored against the research's public
traced vectors on the first run (Germany, Poland, Sweden, South Africa,
Turkey, Finland, Belgium, Italy CF, all pass); China is anchored by
construction (no trustworthy public vector). Two findings worth noting:
- Mexico CURP's published check-digit algorithm couldn't be verified
  against a trustworthy vector (the Ñ-alphabet weighting is ambiguous), so
  it ships as format/context-only rather than an unverified checksum.
- The eval caught south_africa_id at 0% recall: my context keyword
  "south afric" is a truncated word that the (correct) whole-word context
  matcher rejects. Fixed to proper whole words.

Measured (release): recall 99.6% across 38 categories; 0 hard-negative
violations; real-text scan shows only test-fixture PII. The few sub-100%
are labeling overlaps where the PII is still redacted (13-digit SA IDs
starting 4 are valid Visa shapes -> credit_card; Amex-shaped IMEIs).
Fuzz extended to 28 validators, 200k iters, 0 panics.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…> 44)

Adds IPv6 (std-parser validated, full 8-group form only — `::` matches the
ubiquitous code path separator so the compressed form is deferred, caught
by the eval at 35k false positives), ICCID (Luhn), Bitcoin legacy
Base58Check (sha2 double-hash, genesis address anchors the test), plus
format/context-only IMSI, US passport, ICD-10.
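The "full 8-group form only" rule can be sketched with the standard library parser. `is_full_ipv6` is a hypothetical name, and how the PR combines the regex candidate with the parser is an assumption.

```rust
use std::net::Ipv6Addr;

/// Accepts only the uncompressed 8-group form. The `::` shorthand is
/// rejected outright because it also appears in code paths such as
/// `std::fmt::Debug`.
fn is_full_ipv6(candidate: &str) -> bool {
    !candidate.contains("::")
        && candidate.split(':').count() == 8
        && candidate.parse::<Ipv6Addr>().is_ok()
}
```

Letting `std::net::Ipv6Addr` do the validation keeps hex-range and group-length checks out of the regex.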

Eval: 99.7% recall across 44 categories, 0 hard-negative violations,
real-text scan clean. Fuzz extended to 31 validators. Perf regression
guard made build-aware (debug runs ~10x slower than the release target).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Adds Germany/France/Italy/Belgium/Austria/Poland/Denmark/Greece/Croatia/
Portugal/Finland/Luxembourg/Sweden VAT, Ireland PPS, Switzerland AHV,
Austria SVNR, Romania CNP, Bulgaria EGN, Greece AMKA, Iceland kennitala,
Estonia isikukood, ex-Yugoslav JMBG, Russia INN, Czech/Slovak rodné číslo,
Denmark CPR. Shared ISO-7064 MOD-11,10 / mod-97 / weighted-mod-11 helpers.

22 of these checksum algorithms are anchored against machine-verified
public vectors (the research agent traced each against a reference impl);
all pass. One catch: the agent claimed "123456789" is an invalid Portugal
NIF, but it actually passes mod-11 — corrected the test's negative.

Measured (release): recall 99.8% across 69 categories; 0 hard-negative
violations; real-text scan shows only test-fixture vectors, no genuine
false positives on real code.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Adds Singapore NRIC/FIN, Hong Kong HKID, Taiwan ID, Japan My Number,
Thailand national ID, New Zealand IRD, Brazil CNPJ, Chile RUT, Argentina
CUIT, Colombia NIT, Uruguay CI, Israel Teudat Zehut, UAE Emirates ID,
Saudi/Iqama ID (all checksum-validated), plus format/context-only
Indonesia NIK, Malaysia MyKad, Philippines PhilSys, Egypt national ID,
Nigeria NIN.

8 checksum algorithms anchored against the research's public vectors
(Singapore, HKID, Taiwan, NZ IRD, Brazil CNPJ, Chile RUT, Uruguay CI,
Israel — all pass on the first run after fixing the irregular Taiwan
letter table). The 6 without a trustworthy vector (Japan, Thailand,
Argentina, Colombia, UAE, Saudi) are proven non-degenerate by
construction.

Measured (release): recall 99.7% across 88 categories; 0 hard-negative
violations; real-text scan clean. Perf regression guard is now
release-only (debug is unoptimized and not representative).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Final batches: NHS number (mod-11), LEI (ISO 7064 MOD 97-10), Australia
IHI (Luhn/800360), eSIM EID (Luhn), India ABHA (Verhoeff), US Medicare
MBI (positional mask), plus EU VAT (HU/SI/EE/MT/SK/LV) and national IDs
for Lithuania, Latvia, Kazakhstan, Iran, Ukraine, Kuwait, Ecuador,
Dominican Republic — all checksum-validated against machine-traced public
vectors. Rounded out with ~40 format/context-only detectors (US driver
licenses, NDC/CPT/HCPCS/ICD-9, MMSI, MEID, FIGI, MIC, and national IDs
across Pakistan, Vietnam, Morocco, Qatar, Ghana, Venezuela, Peru, Bahrain,
Panama, Georgia, Armenia, Azerbaijan, Bangladesh, Bolivia, Paraguay,
Costa Rica, Lebanon, Belarus, Tanzania, Sri Lanka, India voter, etc.).

The eval kept catching bad agent-supplied vectors: a 21-char "LEI", a
non-Luhn-valid IHI, an "invalid" Portugal NIF that's actually valid, the
FIGI Luhn parity (shipped format-only instead).

Measured (release): 150 categories, recall 99.7% (43057/43200), 0
hard-negative violations, real-text scan over 8.4 MB shows only
test-fixture vectors — the ~60 format/context-only detectors produced
zero false positives on real code (context gating holds). ~90 carry a
real checksum; the rest are format/context-only where no public checksum
exists.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The release perf test caught a real regression: at 150 patterns, compiling
them all into one RegexSet and re-running find_iter for every context-gated
\d{N} the set flagged cost ~1.7 ms/row (0.1 MB/s) — a ~900x regression.

Fix, two parts:
- Only NO-context patterns go in the RegexSet (a small, fast DFA run on
  every row). Context-gated detectors (the ~110 national-ID / VAT ones)
  are excluded from it.
- A single Aho-Corasick automaton over every distinct context keyword
  finds, in one pass, which labels are present on the line; a context
  detector's find_iter runs only when its keyword is there. So an obscure
  detector costs nothing on a row that doesn't mention it.
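The keyword gate in the second bullet can be approximated as below. This is a std-only stand-in, not the Aho-Corasick automaton the commit describes: it scans once per keyword instead of once per line, and `ContextDetector`, its fields, and the sample keywords are hypothetical.

```rust
/// Hypothetical description of one context-gated detector.
struct ContextDetector {
    subtype: &'static str,
    keywords: &'static [&'static str],
}

/// Returns the subtypes whose (lowercase) keywords occur in `line`.
/// Only these detectors would go on to run their regex and validator.
fn detectors_to_run(line: &str, detectors: &[ContextDetector]) -> Vec<&'static str> {
    let lowered = line.to_lowercase();
    detectors
        .iter()
        // Coarse substring prefilter; the precise whole-word window
        // check is assumed to happen later, per candidate match.
        .filter(|d| d.keywords.iter().any(|k| lowered.contains(*k)))
        .map(|d| d.subtype)
        .collect()
}
```

A false hit here (for example "sin" inside "business") only costs one wasted detector run; correctness still rests on the later whole-word check.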

Measured (release): PII-free (the dominant real workload) 30 us/row, all-PII
"mixed" worst case 60 us/row — back from 1.7 ms. 21x gate speedup. Recall
unchanged at 99.7%, fuzz still 0 panics, all 97 release tests pass. The
perf guard is reset to reflect the 150-detector cost (PII-free < 60us,
mixed < 130us) so it still catches a real regression.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Adds two on-device structured-ID detectors with real checksum
validation, both deferred earlier only for the keccak dependency:

- ethereum_address: EIP-55 mixed-case checksum verified against
  Keccak-256 (sha3 crate). All-lower / all-upper addresses pass as
  checksum-absent; mixed-case must match the hash-derived case pattern
  exactly (a hex letter is uppercase iff its hash nibble is >= 8), so
  a single wrong-case letter is rejected. Spec vectors from EIP-55
  pass (incl. the all-caps and all-lower edge cases).
- litecoin_address: Base58Check (sha256d) via the shared
  base58check_ok helper extracted from btc_address; L/M prefix.

btc_address refactored onto the same shared helper. sha3 promoted to
a direct dep (was already transitive via the workspace).

Eval: ethereum_address 400/400, litecoin_address 400/400,
btc_address 400/400 recall; overall 43871/44000 = 99.7% across 152
categories; 0 hard-negative violations; clippy clean; lib tests
98/98 incl. crypto_vectors (EIP-55 + LTC machine-traced vectors).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Adds btc_bech32_address, the dominant modern Bitcoin address form
(native SegWit bc1q... and Taproot bc1p...), with a real BCH checksum.

- Hand-rolled bech32_polymod (no new dep); selects the checksum
  constant by witness version (v0 = bech32, v1..=16 = bech32m per
  BIP-350). hrp restricted to bc/tb. Mixed case rejected per BIP-173,
  uniform uppercase (QR form) accepted.
- Verified against the canonical BIP-173 P2WPKH vector, its uppercase
  variant, and the BIP-350 Taproot vector; rejects a flipped checksum
  char, mixed case, and a wrong hrp.
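A compact sketch of the checksum logic described above, following BIP-173 and BIP-350. `btc_bech32_ok` and `polymod` are hypothetical names rather than the PR's; witness-program length checks are omitted, so this validates the checksum and case rules only.

```rust
const CHARSET: &[u8] = b"qpzry9x8gf2tvdw0s3jn54khce6mua7l";
const BECH32_CONST: u32 = 1;
const BECH32M_CONST: u32 = 0x2bc8_30a3;

/// BCH checksum over 5-bit values, as specified in BIP-173.
fn polymod(values: &[u8]) -> u32 {
    const GEN: [u32; 5] =
        [0x3b6a_57b2, 0x2650_8e6d, 0x1ea1_19fa, 0x3d42_33dd, 0x2a14_62b3];
    let mut chk = 1u32;
    for &v in values {
        let top = chk >> 25;
        chk = ((chk & 0x01ff_ffff) << 5) ^ u32::from(v);
        for (i, g) in GEN.iter().enumerate() {
            if (top >> i) & 1 == 1 {
                chk ^= *g;
            }
        }
    }
    chk
}

/// Checksum-level validation of a bc/tb SegWit address.
fn btc_bech32_ok(addr: &str) -> bool {
    // BIP-173: mixed case is invalid, uniform upper or lower is fine.
    let has_lower = addr.bytes().any(|b| b.is_ascii_lowercase());
    let has_upper = addr.bytes().any(|b| b.is_ascii_uppercase());
    if (has_lower && has_upper) || addr.len() > 90 {
        return false;
    }
    let s = addr.to_ascii_lowercase();
    let sep = match s.rfind('1') {
        Some(p) => p,
        None => return false,
    };
    let (hrp, data) = (&s[..sep], &s[sep + 1..]);
    if (hrp != "bc" && hrp != "tb") || data.len() < 7 {
        return false;
    }
    // Expanded human-readable part, then the 5-bit data values.
    let mut values: Vec<u8> = hrp.bytes().map(|b| b >> 5).collect();
    values.push(0);
    values.extend(hrp.bytes().map(|b| b & 31));
    let data_start = values.len();
    for c in data.bytes() {
        match CHARSET.iter().position(|&x| x == c) {
            Some(p) => values.push(p as u8),
            None => return false,
        }
    }
    // Witness v0 uses the bech32 constant, v1..=16 use bech32m.
    let version = values[data_start];
    let expected = if version == 0 { BECH32_CONST } else { BECH32M_CONST };
    version <= 16 && polymod(&values) == expected
}
```

Selecting the constant from the witness version is what stops a v1 (Taproot) address carrying a legacy bech32 checksum from validating.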

Also folds the three crypto validators added this branch
(btc_bech32/litecoin/eth) into the 200k-iteration fuzz set.

Eval: btc_bech32_address 400/400 recall; overall 44270/44400 = 99.7%
across 153 categories; 0 hard-negative violations (incl. a corrupted
bc1 and a wrong-EIP55-case line); clippy clean; lib tests 98/98.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Adds distinctive-prefix credential detectors. For a screen-capture
tool whose users are developers, an API key sitting on screen is a
far more frequent leak than an obscure national ID, and these are
zero-FP by construction (the vendor prefix is self-identifying):

  google_api_key (AIza), gitlab_pat (glpat-), npm_token (npm_),
  sendgrid_api_key (SG.x.y), slack_webhook_url, digitalocean_token
  (do[opr]_v1_), doppler_token (dp.<kind>.), linear_api_key
  (lin_api_), postman_api_key (PMAK-...-...), shopify_token (shp*_),
  stripe_webhook_secret (whsec_), square_oauth_token (sq0(atp|csp)-),
  databricks_token (dapi), age_secret_key (AGE-SECRET-KEY-1).

These complement the prefixless secrets already in raw[]
(OpenAI/Stripe/GitHub/Slack/AWS/Google-OAuth/HF/JWT). All are
SpanLabel::Secret, so they are always redacted regardless of the
per-category allow-list; the subtype only adds telemetry visibility.

Eval: each new category 400/400 recall; overall 49870/50000 = 99.7%
across 167 categories; 0 hard-negative violations; the only FP-scan
hits are assert! fixtures in the legacy screenpipe-core pii_removal
tests (true positives on example secrets, not real prose). New
cloud_credentials_caught unit test asserts each prefix matches a
synthetic instance and rejects near-misses. clippy clean, lib 99/99.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… 167 to 181

Second credential batch, several in screenpipe's own stack:
  github_fine_grained_pat (github_pat_), sentry_dsn, atlassian_api_token
  (ATATT3), hashicorp_vault_token (hv[sb].), figma_pat (fig[dur]_),
  new_relic_api_key (NRAK-), razorpay_key (rzp_(live|test)_),
  planetscale_token (pscale_*_), supabase_token (sbp_), tailscale_authkey
  (tskey-), flutterwave_secret (FLWSECK), fly_io_token (fo1_),
  notion_token (ntn_), pulumi_token (pul-).

Overlap-suppression fix (caught by the harness): a Sentry DSN's key
looks like an email local part, so the email detector matched the same
region. Suppression was strictly first-by-PATTERNS-index, so the
non-secret Email span won and the Secret sentry_dsn span was dropped.
Under the default secrets-only policy the Email span is then filtered
out, so the DSN key would leak. Fix: order candidates secret-first
before overlap suppression, so a credential never loses an overlap to a
non-secret. Within each tier the original priority holds; the
connection-string test still resolves to Url. New regression test
secret_wins_overlap_over_email locks it.
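The ordering fix can be sketched as a two-level sort followed by greedy suppression. The types below are hypothetical simplifications: `Candidate`, its `priority` field (standing in for the PATTERNS index), and the three-variant `SpanLabel` are assumptions made for illustration.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum SpanLabel {
    Secret,
    Email,
    Url,
}

#[derive(Clone, Debug, PartialEq, Eq)]
struct Candidate {
    start: usize,
    end: usize,
    label: SpanLabel,
    priority: usize, // lower wins within a tier (pattern index)
}

/// Keeps the highest-ranked candidate in every overlapping group,
/// ranking all secrets ahead of all non-secrets.
fn suppress_overlaps(mut cands: Vec<Candidate>) -> Vec<Candidate> {
    // `false < true`, so secrets sort first; the stable sort keeps the
    // original priority order within each tier.
    cands.sort_by_key(|c| (c.label != SpanLabel::Secret, c.priority));
    let mut kept: Vec<Candidate> = Vec::new();
    for c in cands {
        let overlaps = kept.iter().any(|k| c.start < k.end && k.start < c.end);
        if !overlaps {
            kept.push(c);
        }
    }
    // Return spans in document order.
    kept.sort_by_key(|c| c.start);
    kept
}
```

In the Sentry DSN case, the Email candidate has the better pattern index but the DSN span is a secret, so the DSN span survives and the Email span is dropped.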

Eval: each new category 400/400 recall; overall 55472/55600 = 99.8%
across 181 categories (139 in the brute-force sweep); 0 hard-negative
violations. The one real-text hit is a genuine hardcoded Sentry DSN in
screenpipe-engine.rs (a true positive, not an FP). clippy clean, lib
101/101.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- xrp_address: Ripple classic address, Base58Check over the Ripple
  dictionary (refactored base58check to take the alphabet; Bitcoin and
  Ripple share the routine). Verified against the documented
  rHb9CJAWyB4rj91VRWn96DkukG4bwdtyTh and ACCOUNT_ZERO vectors; rejects a
  flipped tail and a Bitcoin address. The r-prefix shape is gated by the
  checksum, so the 8.4 MB source scan produced zero false matches.
- posthog_project_key (phc_), docker_hub_pat (dckr_pat_),
  slack_app_token (xapp-), pypi_token (pypi-) — distinctive-prefix
  credentials, SpanLabel::Secret.

Eval: each new category 400/400 recall; overall 57461/57600 = 99.8%
across 186 categories (144 in the brute-force sweep); 0 hard-negative
violations; 0 real-text false matches. clippy clean (all-targets),
lib 101/101, xrp_address added to the fuzz set.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Re-exports the rfdetr_v11 checkpoint at fp16 (rfdetr_v12.onnx) and points
the ONNX image redactor at it. The model is ~2.1x smaller (54 vs 114 MB)
and ~1.8x faster on CPU, with zero-leak within ~0.6pp of v11 fp32 on the
corpus eval (near-lossless: fp16 inference of this fp32-trained detector
preserved almost all detections).

Re-export from the checkpoint (torch fp16 export + fp32 I/O wrapper) was
the only path that works. Post-hoc onnxruntime conversion of v11 fails:
dynamic int8 craters zero-leak 97.4% to 66.3%, static int8 costs 4.2pp
and runs slower, and fp16 won't even load (the DINOv2 backbone's internal
Cast nodes break the converter). Each was verified on
screenpipe-pii-bench-image before trusting it. Recipe saved at
training/export_fp16.py in that repo.

rfdetr_v12.onnx published to HF (screenpipe/pii-image-redactor); SHA
pinned and verified after download. v11 fp32 stays on HF as rollback.
The MLX path is unchanged (separate model file).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
louis030195 merged commit 5aa7deb into main on Jun 8, 2026
22 checks passed
louis030195 pushed a commit that referenced this pull request Jun 8, 2026
Ships the configurable structured-ID PII detectors (crypto/cloud/credentials/
national-ID, 150->186 categories) + the rfdetr image model fp16 (114MB->54MB)
from #3872.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>