Skip to content

feat(falsify-ship-016): PARTIAL discharge — apr qa 8-of-8 aggregate-AND verdict fn#1008

Closed
noahgift wants to merge 1 commit into
mainfrom
feat/falsify-ship-016-partial-discharge
Closed

feat(falsify-ship-016): PARTIAL discharge — apr qa 8-of-8 aggregate-AND verdict fn#1008
noahgift wants to merge 1 commit into
mainfrom
feat/falsify-ship-016-partial-discharge

Conversation

@noahgift

Copy link
Copy Markdown
Contributor

Summary

Wires GATE-ARCH-370M-008 (AC-SHIP2-006, FALSIFY-SHIP-016) to a pure
verdict_from_qa_gates(&[bool]) -> Ship016Verdict aggregate-AND fn in
crates/aprender-train/src/models/llama_370m.rs, proven today by:

  • Exhaustive 2^8 = 256-combination sweep — exactly one input (all-true) yields Pass; the other 255 yield Fail
  • Single-gate-flip falsifiability — flipping any one gate true→false flips aggregate to Fail (8 cases)
  • Monotonicity — flipping false→true never regresses Pass→Fail (pairwise sweep)
  • 3 contract-drift guards — slice length 0/7/9/16 → Fail conservatively even when all-true
  • Provenance pinAC_SHIP2_006_REQUIRED_QA_GATE_COUNT const == 8 matches spec §7

discharge_status: PARTIAL_ALGORITHM_LEVEL. Full discharge blocks on real 370M .apr + 8-gate apr qa harness invocation (fixture-swap only, no harness rewrite).

Pattern note

SHIP-016 is the first aggregate-AND shape — SHIP-017/018/020 were single-threshold shapes. The proof pattern now covers two distinct decision-rule shapes (threshold + aggregate-AND), confirming decision-rule/compute-harness separation is a reusable pattern, not a one-off.

5th PARTIAL — "exhausted" verdict now falsified 4×

Chain: SHIP-019 → SHIP-017 → SHIP-020 → SHIP-018 → SHIP-016. Each was found after a prior "search exhausted" verdict. Lesson: re-run counter-example survey before declaring search space exhausted.

MODEL-2 ship-gate coverage

Status Count Gates
ACTIVE 3/12 001, 011, 012
PARTIAL 7/12 002 (SHIP-012), 005 (SHIP-015), 006 (SHIP-016, new), 007 (SHIP-017), 008 (SHIP-018), 009 (SHIP-019), 010 (SHIP-020)
Truly compute-blocked 2/12 003 (CE ≤ 2.2 val loss), 004 (≤21-day wall-clock)

10/12 touched (83.3%) — up from 9/12 (75.0%) after SHIP-018.

Files changed

  • contracts/model-families/llama-370m-sovereign-v1.yaml v1.5.0 → v1.6.0 (GATE-ARCH-370M-008 block added; stays ACTIVE)
  • crates/aprender-train/src/models/llama_370m.rs — const + enum + pure fn + 2 tests
  • docs/specifications/aprender-train/ship-two-models-spec.md v2.23.0 → v2.24.0 (amendment block)
  • crates/aprender-train/src/train/device.rs — pre-existing fmt fixes bundled per Toyota Way

Test plan

  • cargo test -p aprender-train --lib models::llama_370m — 12/12 PASS including 2 new SHIP-016 tests
  • cargo fmt -p aprender-train --check — clean
  • cargo clippy -p aprender-train --lib -- -D warnings — clean
  • pv validate contracts/model-families/llama-370m-sovereign-v1.yaml — 0 errors, 0 warnings
  • CI ci / gate + workspace-test green

Refs #152

🤖 Generated with Claude Code

…ND verdict fn

Wires GATE-ARCH-370M-008 (AC-SHIP2-006) to a pure
verdict_from_qa_gates(&[bool]) -> Ship016Verdict aggregate-AND fn in
aprender-train/src/models/llama_370m.rs, proven today by exhaustive
2^8 = 256-combination sweep + single-gate-flip falsifiability +
monotonicity + 3 contract-drift guards (slice length 0/7/9/16 → Fail
even when all-true). Discharge marker: PARTIAL_ALGORITHM_LEVEL.

Pattern note: SHIP-016 is the first aggregate-AND shape —
SHIP-017/018/020 were single-threshold shapes. The proof pattern now
covers two distinct decision-rule shapes, confirming
decision-rule/compute-harness separation is a reusable pattern, not a
one-off.

**5th PARTIAL after "exhausted" verdict falsified 4× already**
(SHIP-019 → SHIP-017 → SHIP-020 → SHIP-018 → SHIP-016).

**MODEL-2 ship-gate coverage: 3/12 ACTIVE + 7/12 PARTIAL = 10/12
touched (83.3%).** Remaining 2 truly compute-blocked (003 CE ≤ 2.2,
004 ≤21-day wall-clock) have no fixture-swap trick.

Changes:
- contracts/model-families/llama-370m-sovereign-v1.yaml v1.5.0 → v1.6.0
  (GATE-ARCH-370M-008 block added; stays ACTIVE)
- crates/aprender-train/src/models/llama_370m.rs:
  + AC_SHIP2_006_REQUIRED_QA_GATE_COUNT = 8 const
  + Ship016Verdict enum
  + verdict_from_qa_gates(&[bool]) pure fn with aggregate-AND
  + falsify_ship_016_apr_qa_aggregate_and_logic test (2^8 sweep +
    single-gate-flip + monotonicity + 3 contract-drift guards)
  + falsify_ship_016_gate_arch_370m_008_has_partial_discharge_marker
    test (YAML binding: binds_to AC-SHIP2-006, falsification_id
    FALSIFY-SHIP-016, discharge_status PARTIAL_ALGORITHM_LEVEL)
- docs/specifications/aprender-train/ship-two-models-spec.md v2.23.0
  → v2.24.0 (amendment block documenting 5th PARTIAL, first
  aggregate-AND shape)
- crates/aprender-train/src/train/device.rs: pre-existing fmt fixes
  bundled per Toyota Way "all defects are your defects"

Full discharge blocks on: real 370M .apr from AC-SHIP2-003/004
compute-dispatch + 8-gate apr qa harness invocation with exit 0 →
feed the 8 gate-result booleans into verdict_from_qa_gates and
require Ship016Verdict::Pass. Fixture-swap only — no harness rewrite.

Refs #152

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
noahgift added a commit that referenced this pull request Apr 22, 2026
13 workspace Cargo.toml edits converting pinned crates.io
versions of renacer, entrenar, entrenar-lora, batuta, and
trueno-graph to `{ workspace = true }` aliases pointing at
the in-tree crates (aprender-profile, aprender-train,
aprender-orchestrate, aprender-graph).

Root cause: after APR-MONO Phase 2/3 consolidation, several
crates still pinned old crates.io sibling versions. Each
carried a transitive dep on an older `aprender` (0.14.1,
0.27.8), producing a diamond that broke
`cargo clippy -p aprender` with an ambiguous-package error.

Post-fix metadata:
  aprender: 1 package (workspace 0.31.2)
  renacer / entrenar / trueno-graph: 0 crates.io packages

Unblocks main-andon (required CI checks) and PRs #1008
(FALSIFY-SHIP-016 PARTIAL) and #1009 (FALSIFY-SHIP-009 PARTIAL).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
noahgift added a commit that referenced this pull request Apr 22, 2026
* fix(apr-mono): eliminate sibling-pin diamond dep (muda)

13 workspace Cargo.toml edits converting pinned crates.io
versions of renacer, entrenar, entrenar-lora, batuta, and
trueno-graph to `{ workspace = true }` aliases pointing at
the in-tree crates (aprender-profile, aprender-train,
aprender-orchestrate, aprender-graph).

Root cause: after APR-MONO Phase 2/3 consolidation, several
crates still pinned old crates.io sibling versions. Each
carried a transitive dep on an older `aprender` (0.14.1,
0.27.8), producing a diamond that broke
`cargo clippy -p aprender` with an ambiguous-package error.

Post-fix metadata:
  aprender: 1 package (workspace 0.31.2)
  renacer / entrenar / trueno-graph: 0 crates.io packages

Unblocks main-andon (required CI checks) and PRs #1008
(FALSIFY-SHIP-016 PARTIAL) and #1009 (FALSIFY-SHIP-009 PARTIAL).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(ci-lint): unblock workspace clippy exposed by sibling-alias activation

The sibling-alias MUDA fix in 3f38207 activates crates that were never
clippy-gated under main's red CI. This commit adds the minimum lint-unblock
changes needed for `cargo clippy --all-targets -- -D warnings` to pass.

Code fixes:
- aprender-compute GPU path (wgsl_forward, cached_matmul, dispatch, parallel):
  underscore-prefix unused vars, `.unwrap()` → `.expect()` with descriptive
  messages, collapsible_if, doc_lazy_continuation, allow map_entry, drop two
  bogus `contract_post_*!()` macro calls (type-mismatched, never compiled).
- aprender-graph/shortest_path.rs: add `#[must_use]` + doc on dijkstra_path,
  `.map_or(false, |&d| cost > d)` → `.is_some_and(|&d| cost > d)`.
- aprender-graph/pattern.rs: `mapping.iter()` → `mapping`.

Crate-level lint allows (pedantic lints carried over from pre-monorepo crates
— to be fixed per-lint in dedicated follow-up PRs):
- aprender-profile/src/lib.rs: 9 pedantic lints (renacer legacy).
- apr-cli/tests/*.rs (13 files): `disallowed_methods` etc for integration
  tests where `unwrap()` is idiomatic.
- apr-cli/examples/tool_calling_demo.rs: `disallowed_methods` for serde_json
  `json!` macro expansion.

Dead code (MUDA-aligned cleanup):
- Delete apr-cli/examples/probar_tui_testing.rs + federation_tui_demo.rs —
  both reference the removed `ratatui` crate and deleted `federation::tui`
  module.
- Drop matching `[[example]]` entries from apr-cli/Cargo.toml.

Verification:
  cargo clippy --all-targets -- -D warnings  # clean in 36s

Ported from closed PR #1010 (commit 4338cbf on fix/apr-mono-muda). Keeps
#1011's Cargo.toml changes intact; no conflicts with the MUDA fix commit.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
@noahgift

Copy link
Copy Markdown
Contributor Author

Superseded by #1035 — rebased onto MODEL-1 stack + SHIP-017 + SHIP-020 + SHIP-018 at v2.37.0.

@noahgift noahgift closed this Apr 23, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant