
fix(apr-qa): strengthen Golden Output gate with statistical gibberish detection#1463

Merged
noahgift merged 1 commit into main from fix/qa-golden-output-gibberish-detection
May 4, 2026

Conversation


@noahgift noahgift commented May 4, 2026

Contributor

Summary

  • apr qa Golden Output gate had a hardcoded 4-string garbage list (["\u{FFFD}", "[UNK]", "akunji", "olumbia"]) that missed every new gibberish class
  • Added detect_gibberish with 3 statistical signals, any one of which rejects the output: non-ASCII ratio > 60% over 16+ chars, a 4+ byte fragment repeated 3+ times consecutively, and U+FFFD density > 1 per 32 chars
  • Captured Qwen2-0.5B observed gibberish ("udaÅĤo", "ëĸ»", "zwiÄħzku") as drift-prevention test cases

Five Whys (in commit msg ddb6a1a)

  1. Why did defective Qwen2-0.5B-Instruct inference ship undetected? apr qa Golden Output PASSED.
  2. Why did Golden Output PASS gibberish? verify_output garbage check used 4 hardcoded strings.
  3. Why hardcoded? Patterns came from one-off past incidents (LAYOUT-002 "olumbia") and were never generalized.
  4. Why never generalized? The gate computed no statistical sanity signal (non-ASCII ratio, repeated-fragment detection, U+FFFD density).
  5. Why dangerous? New gibberish classes sail through, so the gate lies: it passes models that produce unintelligible output.
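
To make steps 2 and 5 concrete, here is a minimal sketch of a substring-only check. The pattern list is the one quoted above; the function name `contains_known_garbage` is hypothetical and is not taken from the `apr-cli` source.

```rust
/// The four patterns quoted in the PR description.
const GARBAGE_PATTERNS: [&str; 4] = ["\u{FFFD}", "[UNK]", "akunji", "olumbia"];

/// Hypothetical stand-in for the old check: output is flagged only when it
/// contains one of the listed patterns verbatim.
fn contains_known_garbage(output: &str) -> bool {
    GARBAGE_PATTERNS.iter().any(|p| output.contains(*p))
}
```

A string built from the captured Qwen2-0.5B fragments, such as `"udaÅĤo ëĸ» zwiÄħzku"`, contains none of the four patterns, so a check of this shape reports it as clean.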

Honest scope

  • ✅ Strengthens the GATE so future regressions of this class FAIL
  • ✅ 14/14 unit tests pass, including 3 new ones using real captured gibberish
  • ⚠️ Does NOT yet flip the existing 0.5B apr qa run to FAIL: the gate's internal 512-token generation produces different output (apparently ASCII-heavy enough to stay under the thresholds) than the manual 16-token test that captured the gibberish
  • ⚠️ Underlying Qwen2-0.5B short-prompt inference defect still exists; downstream bisection PR follows now that the gate can no longer hide it
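
The non-ASCII signal is a ratio over the whole output, so it depends on how much ASCII text surrounds any non-ASCII fragments. The sketch below only illustrates that arithmetic. It does not reproduce the gate's actual 512-token output, which this PR does not show, and the helper name `non_ascii_ratio` is an assumption.

```rust
/// Fraction of characters that are non-ASCII, in the range 0.0..=1.0.
/// Hypothetical helper; the real gate may count bytes or skip whitespace.
fn non_ascii_ratio(text: &str) -> f64 {
    let total = text.chars().count();
    if total == 0 {
        return 0.0;
    }
    let non_ascii = text.chars().filter(|c| !c.is_ascii()).count();
    non_ascii as f64 / total as f64
}
```

For a made-up input, 21 non-ASCII characters on their own give a ratio of 1.0, while the same 21 characters followed by 170 ASCII characters give about 0.11, well under a 60% threshold.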

Test plan

  • `cargo test -p apr-cli --lib --features inference verify_output` — 14/14 pass
  • `cargo build --release -p apr-cli --features cuda --bin apr` — clean
  • Pre-commit quality gates green (per local hook)
  • CI `ci / gate` and `workspace-test` (pending)

🤖 Generated with Claude Code

@noahgift noahgift enabled auto-merge (squash) May 4, 2026 09:19
… detection

Root cause (Five Whys):
1. Defective Qwen2-0.5B-Instruct inference (CJK/Polish/diacritic byte fragments
   like "udaÅĤo", "ëĸ»", "zwiÄħzku") shipped because `apr qa` PASSED.
2. Why did Golden Output PASS gibberish? `verify_output` garbage-pattern check
   was hardcoded to 4 strings: ["\u{FFFD}", "[UNK]", "akunji", "olumbia"].
3. Why hardcoded? Patterns came from specific past incidents (LAYOUT-002 "olumbia"
   garbage) and were never generalized.
4. Why never generalized? No statistical sanity signal — non-ASCII ratio,
   repeated-fragment detection, U+FFFD density.
5. Why dangerous? Any new gibberish class sails through. The QA gate became a
   liar — passing models that produce unintelligible non-ASCII output.

Fix: add `detect_gibberish` with three statistical signals, ANY trip rejects:
- Non-ASCII ratio > 60% over 16+ chars (English/code outputs are ASCII-heavy)
- 4+ byte fragment repeated 3+ times consecutively (BPE/loop pathology)
- U+FFFD density > 1 per 32 chars (UTF-8 decode failures)
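
The three signals can be sketched as follows. The thresholds come from the list above; everything else is an assumption made for illustration and is not the merged implementation: the `Option<&'static str>` return type, counting in `char`s including whitespace, the 32-byte cap on fragment length, and the helper name `has_repeated_fragment`.

```rust
const MIN_CHARS_FOR_RATIO: usize = 16; // from the commit message
const MIN_FRAGMENT_BYTES: usize = 4; // from the commit message
const MAX_FRAGMENT_BYTES: usize = 32; // assumed cap to bound the search
const MIN_REPEATS: usize = 3; // from the commit message

/// Returns a reason if any one signal trips, or None if the text looks sane.
pub fn detect_gibberish(text: &str) -> Option<&'static str> {
    let total = text.chars().count();
    if total == 0 {
        return None;
    }

    // Signal 1: non-ASCII ratio > 60%, applied only to outputs of 16+ chars.
    let non_ascii = text.chars().filter(|c| !c.is_ascii()).count();
    if total >= MIN_CHARS_FOR_RATIO && non_ascii * 100 > total * 60 {
        return Some("non-ASCII ratio above 60%");
    }

    // Signal 2: more than one U+FFFD per 32 chars.
    let fffd = text.chars().filter(|&c| c == '\u{FFFD}').count();
    if fffd * 32 > total {
        return Some("U+FFFD density above 1 per 32 chars");
    }

    // Signal 3: a 4+ byte fragment repeated 3+ times back to back.
    if has_repeated_fragment(text.as_bytes()) {
        return Some("fragment repeated 3+ times consecutively");
    }
    None
}

/// Naive scan: for each fragment length and start offset, check whether the
/// following MIN_REPEATS - 1 windows are byte-identical to the first.
fn has_repeated_fragment(bytes: &[u8]) -> bool {
    for len in MIN_FRAGMENT_BYTES..=MAX_FRAGMENT_BYTES {
        let span = len * MIN_REPEATS;
        if span > bytes.len() {
            break;
        }
        for start in 0..=bytes.len() - span {
            let frag = &bytes[start..start + len];
            let repeated = (1..MIN_REPEATS)
                .all(|k| &bytes[start + k * len..start + (k + 1) * len] == frag);
            if repeated {
                return true;
            }
        }
    }
    false
}
```

With this sketch, `"udaÅĤo udaÅĤo udaÅĤo udaÅĤo end"` is rejected by the repeated-fragment signal (the 9-byte fragment `"udaÅĤo "` occurs back to back), while a sentence like `"The capital of France is Paris."` passes. A scan this naive also trips on long runs of identical ASCII, such as 12 consecutive spaces or a `--------` rule, so a real gate would likely need to handle those cases.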

Tests added (all passing):
- verify_output_rejects_qwen2_05b_observed_gibberish — captured real string
- verify_output_rejects_repeated_fragment — "udaÅĤo udaÅĤo udaÅĤo udaÅĤo end"
- verify_output_accepts_normal_english — regression guard

Caveat: this PR strengthens the GATE, not the underlying inference defect.
Qwen2-0.5B short-prompt inference still produces gibberish; downstream PR
will bisect that defect now that the gate can no longer hide it.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@noahgift noahgift force-pushed the fix/qa-golden-output-gibberish-detection branch from ddb6a1a to b8f5eb7 Compare May 4, 2026 09:51
@noahgift noahgift merged commit 1e2b116 into main May 4, 2026
10 checks passed
@noahgift noahgift deleted the fix/qa-golden-output-gibberish-detection branch May 4, 2026 10:09