feat(aprender-core): apr-cli-qa-v1 QA-001 PARTIAL_ALGORITHM_LEVEL by noahgift · Pull Request #1172 · paiml/aprender

noahgift · 2026-04-30T14:44:30Z

Summary

Algorithm-level PARTIAL discharge for FALSIFY-QA-001 (every registered apr command exits 0 on --help) per contracts/apr-cli-qa-v1.yaml.
New module crates/aprender-core/src/format/qa_001.rs exporting verdict_from_help_subcommand_smoke(commands_tested, commands_passing_help_smoke) -> Qa001Verdict.
Pinned constant: AC_QA_001_EXPECTED_COMMAND_COUNT = 58 (canonical apr CLI surface per spec §26.8 / CLAUDE.md).
Pass iff: tested == 58 AND passing == tested AND passing <= tested.
13 unit tests across 7 mutation-survey sections.
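
The pass rule above can be sketched as follows. This is a minimal illustration and not the merged `qa_001.rs`: the function and constant names and the `u64` counters come from this PR's text, while the `Pass`/`Fail` variant names, the derives, and the order of the checks are assumptions.

```rust
/// Pinned size of the canonical apr CLI surface (value taken from the PR text).
pub const AC_QA_001_EXPECTED_COMMAND_COUNT: u64 = 58;

/// Verdict type. The `Pass`/`Fail` variant names are assumed for illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Qa001Verdict {
    Pass,
    Fail,
}

/// Sketch of the algorithm-level verdict over the two counters.
pub fn verdict_from_help_subcommand_smoke(
    commands_tested: u64,
    commands_passing_help_smoke: u64,
) -> Qa001Verdict {
    // Counter sanity: more passes than tests is a partition violation.
    if commands_passing_help_smoke > commands_tested {
        return Qa001Verdict::Fail;
    }
    // Registry-size drift: a phantom (59) or dropped (57) subcommand.
    if commands_tested != AC_QA_001_EXPECTED_COMMAND_COUNT {
        return Qa001Verdict::Fail;
    }
    // Broken dispatch: at least one command's --help did not exit 0.
    if commands_passing_help_smoke != commands_tested {
        return Qa001Verdict::Fail;
    }
    Qa001Verdict::Pass
}
```

Under these assumptions only the pair (58, 58) yields `Pass`. The third condition in the rule is implied by the second, so the sketch checks counter sanity first to keep the three failure reasons distinct.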

Why two-counter shape

A single passing == 58 check would silently pass "60 commands registered, 58 happen to pass --help, 2 broken" — registry-size drift class. Tracking commands_tested separately catches phantom (59) AND dropped (57) subcommands AND broken-dispatch (passing < tested) — three distinct regression classes.
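
To make the drift case concrete, the sketch below compares a single-counter gate with the two-counter rule. Both helpers (`single_counter_passes`, `two_counter_passes`) and the constant name are hypothetical and exist only for this comparison. They are not part of the PR.

```rust
/// Stands in for the pinned constant described in the PR.
const EXPECTED_COMMAND_COUNT: u64 = 58;

/// Hypothetical single-counter gate: only looks at how many commands passed.
fn single_counter_passes(passing: u64) -> bool {
    passing == EXPECTED_COMMAND_COUNT
}

/// Two-counter gate: the registry size and the pass count must both match.
fn two_counter_passes(tested: u64, passing: u64) -> bool {
    tested == EXPECTED_COMMAND_COUNT && passing == tested
}
```

With 60 registered commands of which 58 pass `--help`, `single_counter_passes(58)` returns `true`, while `two_counter_passes(60, 58)` returns `false`.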

Why pin 58

Per spec §26.8 / CLAUDE.md: 58 subcommands (57 original + mcp, added in PR #864 on 2026-04-17). This pin adds a fourth enforcement layer to the existing 3-surface-drift trio (registered_commands test, apr-cli-commands-v1.yaml, CLAUDE.md).

Five-Whys (commit-message body has full chain)

Bind now → otherwise a phantom subcommand or broken dispatch ships invisibly.
(u64, u64) pair → algorithm-level pin.
Pin 58 → catches add/drop drift.
Two counters → distinguishes registry-size from dispatch-broken.
13 tests → 7 sections incl. composite property test.

Test plan

cargo test -p aprender-core --lib qa_001 → 13 passed
PMAT pre-commit gates pass

🤖 Generated with Claude Code

Algorithm-level PARTIAL discharge for FALSIFY-QA-001 (every registered apr command exits 0 on --help) per `contracts/apr-cli-qa-v1.yaml`.

## What this binds

`verdict_from_help_subcommand_smoke(commands_tested, commands_passing_help_smoke)` returns Pass iff:

1. `commands_tested == 58` (canonical apr CLI surface)
2. `commands_passing_help_smoke == commands_tested` (every command's --help exited 0)
3. `commands_passing_help_smoke <= commands_tested` (counter sanity)

Pinned constant: `AC_QA_001_EXPECTED_COMMAND_COUNT = 58`.

## Why pin 58 commands explicitly

Per spec §26.8 / CLAUDE.md: the canonical apr CLI surface is 58 subcommands (57 original + `mcp` added in PR #864 on 2026-04-17). A regression that adds or drops a subcommand without bumping the contract trips the gate three different ways:

- `tested == 59`: phantom subcommand added (caught by `fail_phantom_subcommand_added_57_to_59`)
- `tested == 57`: subcommand dropped (caught by `fail_subcommand_dropped`)
- `passing < tested`: dispatch broken (caught by `fail_one_command_broken`)

## Why two-counter shape (tested + passing)

A single `passing == 58` check would conflate two distinct regression classes:

- "Registry has 58 commands and all dispatch" (true Pass)
- "Registry has 60 commands, 58 happen to pass --help, but 2 are broken" (silent Pass under single-counter)

Tracking `commands_tested` separately catches the second class. Section 4's `fail_phantom_subcommand_added_57_to_59` test exercises exactly this regression shape: 59 commands all pass --help, but the count drift is the actual defect.

## Five-Whys

1. Why bind QA-001 now? `apr --help` smoke is the most user-facing CLI gate; without a verdict pin, a phantom subcommand or broken dispatch ships invisibly until users hit "command not found".
2. Why a (u64, u64) pair? Algorithm-level pin; the actual per-subcommand `--help` invocation harness is FULL_DISCHARGE.
3. Why pin 58? Catches add/drop drift; matches CLAUDE.md canonical claim.
4. Why two counters (not one)? Distinguishes "registry size drift" from "dispatch broken"; both must Fail but for different reasons.
5. Why 13 tests across 7 sections? Provenance pin (×1), pass band (×1: canonical 58/58), broken-dispatch fail (×3), count-drift fail (×3: phantom, dropped, zero), partition violations (×2), passing-sweep at 58 (×8), tested-sweep when perfect (×7), composite property test (×7 cells).

## Cross-reference

Note the "tested = 58" pin should be kept in sync with:

- `crates/apr-cli/tests/cli_commands.rs::registered_commands()`
- `contracts/apr-cli-commands-v1.yaml`
- CLAUDE.md "58 subcommands" claim

Per `feedback_cli_subcommand_three_surface_drift` memory, all three surfaces must update together when adding a new command. QA-001's verdict pin adds a fourth enforcement layer.

## Scope

PARTIAL_ALGORITHM_LEVEL only. Wiring this into the actual `registered_commands` test harness that produces the `(commands_tested, commands_passing_help_smoke)` pair is FULL_DISCHARGE work for the qa-pipeline implementation PR.

## Tests

13 unit tests, all green.
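
The sweep and composite property tests mentioned above could take roughly the following shape. This is an illustrative sketch and not the PR's test code: `passes`, `passing_cells`, and the `EXPECTED` constant are hypothetical stand-ins, and the grid bounds are chosen arbitrarily.

```rust
/// Stands in for AC_QA_001_EXPECTED_COMMAND_COUNT.
const EXPECTED: u64 = 58;

/// Boolean form of the verdict rule described in this PR.
fn passes(tested: u64, passing: u64) -> bool {
    tested == EXPECTED && passing == tested && passing <= tested
}

/// Sweeps every (tested, passing) cell with both counters in 0..=max
/// and collects the cells for which the rule passes.
fn passing_cells(max: u64) -> Vec<(u64, u64)> {
    let mut cells = Vec::new();
    for tested in 0..=max {
        for passing in 0..=max {
            if passes(tested, passing) {
                cells.push((tested, passing));
            }
        }
    }
    cells
}
```

Sweeping a grid that contains the pinned count should leave (58, 58) as the only passing cell, and a grid that stops short of 58 should contain none.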

noahgift enabled auto-merge (squash) April 30, 2026 14:44

noahgift merged commit 9694484 into main Apr 30, 2026
11 checks passed

noahgift deleted the feat/qa-001-partial-discharge branch April 30, 2026 15:10

noahgift merged 1 commit into main from feat/qa-001-partial-discharge
