Skip to content

test(ship-two-001): pin 45 AC_* constants to spec values (drift tripwire)#1046

Merged
noahgift merged 1 commit into
mainfrom
tests/ship-two-001-const-pinning
Apr 24, 2026
Merged

test(ship-two-001): pin 45 AC_* constants to spec values (drift tripwire)#1046
noahgift merged 1 commit into
mainfrom
tests/ship-two-001-const-pinning

Conversation

@noahgift

Copy link
Copy Markdown
Contributor

Summary

Adds one integration test file that imports every AC_* constant from the SHIP-TWO-001 verdict modules and asserts its value matches the spec. Catches silent const drift at cargo test time — before it reaches production.

Why

The 42 PARTIAL_ALGORITHM_LEVEL verdict functions shipped in #1044 all parameterize against pub const AC_* thresholds (AC_SHIP2_003_MAX_VAL_CROSS_ENTROPY_LOSS, AC_GATE_SHIP_011_MIN_PMAT_TDG_SCORE, etc.). A contributor editing one of these from 2.22.5 without touching the spec:

  • All verdict fns still compile ✓
  • All existing unit tests still pass ✓
  • Ship criterion silently loosened ✗

This file is the tripwire.

Coverage

Section Consts Tests
§4.2 MODEL-1 AC-SHIP1-001..010 17 17
§7.1 MODEL-1 SHIP-023/024 stability 3 3
§5.2 MODEL-2 AC-SHIP2-003..010 7 7
§6 Compound Ship Gates 001..012 10 10
§14 Task #132 GPUTRAIN 003..007 8 8
Total pinned 45 44 value + 1 tripwire

Verification

$ cargo test -p aprender-train --test ship_two_001_const_pinning
...
test result: ok. 44 passed; 0 failed; 0 ignored

How drift is caught

Example: someone pushes a commit changing AC_SHIP2_003_MAX_VAL_CROSS_ENTROPY_LOSS from 2.2 to 2.5 without updating docs/specifications/aprender-train/ship-two-models-spec.md:

$ cargo test ship2_003_val_ce_floor_is_2_2
thread 'ship2_003_val_ce_floor_is_2_2' panicked at
  'assertion `left == right` failed
   left: 2.5
  right: 2.2'
FAILED

The test name encodes the expected value — grep discovers the drift, the spec section, and the fix path in one step.

Relationship to other PRs

🤖 Generated with Claude Code

The 42 PARTIAL_ALGORITHM_LEVEL verdict functions shipped in #1044 all
parameterize against `pub const AC_*` thresholds. A contributor editing
`AC_SHIP2_003_MAX_VAL_CROSS_ENTROPY_LOSS` from `2.2` → `2.5` without
touching the spec would silently loosen the gate — every verdict fn
still compiles, every existing test still passes, but the ship criterion
has drifted.

This test file imports every `AC_*` const across the workspace (4 crates,
45 consts) and asserts its value matches the spec at §4.2 / §5.2 / §6 /
§7.1 / §14. 44 independent assertions + 1 tripwire comment-sync counter.

If the spec moves, these assertions must move with it. If someone edits
a const without updating the spec, this test fails at `cargo test` time,
not at next production regression.

Coverage:
- MODEL-1 §4.2 AC-SHIP1-001..010 + §7.1 SHIP-023/024  →  20 consts
- MODEL-2 §5.2 AC-SHIP2-003..010                      →   7 consts
- §6 Compound Ship Gates 001..012                      →  10 consts
- §14 Task #132 GPUTRAIN 003..007                      →   8 consts
- Total pinned                                         →  45 consts
- Total `#[test]` fns                                  →  45 (44 value + 1 tripwire)

`cargo test -p aprender-train --test ship_two_001_const_pinning`
→ 44 passed; 0 failed.

Independent of #1044 (which already merged to main). No cascade deps.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@noahgift noahgift enabled auto-merge (squash) April 24, 2026 14:35
@noahgift noahgift merged commit 3849ad0 into main Apr 24, 2026
11 checks passed
@noahgift noahgift deleted the tests/ship-two-001-const-pinning branch April 24, 2026 15:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant