feat(ship-discharges): 5 live-dispatch scripts for MODEL-1 PARTIAL discharges#1555
Merged
…ns on 5 MODEL-1 PARTIAL discharges

After M-FFN-GGUF-5 fix MERGED on aprender main 2026-05-07 (PR #1550 squash e856eb9), the §27 layer-3 ffn_swigl APR-vs-GGUF divergence is closed: live H1 CONFIRMED at layer-3 ratio 1.245× (was 18.23× pre-methodology-fix). 5 MODEL-1 PARTIAL discharges become live-dispatch-ready: SHIP-002, SHIP-005, SHIP-006, SHIP-007, SHIP-008.

This PR adds evidence-pin annotations to each of the 3 contracts that hold those discharges, citing PR #1550 as upstream §22 blocker resolution. Pure additive YAML: no behavioral or test changes.

Contracts touched (3 contracts × 5 ACs):
- contracts/qwen2-e2e-verification-v1.yaml v1.10.0 → v1.11.0 (FALSIFY-QW2E-SHIP-002, FALSIFY-QW2E-SHIP-005, FALSIFY-QW2E-SHIP-007)
- contracts/apr-model-qa-v1.yaml v1.2.0 → v1.3.0 (FALSIFY-QA-SHIP-006)
- contracts/chat-template-v1.yaml v1.1.0 → v1.2.0 (GATE-CHAT-SHIP-008)

Each contract's full_discharge_blocks_on clause now includes:

"Upstream blocker SHIP-007 §22 RESOLVED 2026-05-07 (aprender PR #1550 squash e856eb9; M-FFN-GGUF-5 fix); live discharge is now dispatch-ready — no further upstream blockers."

This is bookkeeping work that captures the cascade outcome in the contract surface so the next operator-dispatched LIVE-run session has the citation ready. Each individual discharge still requires its own LIVE run on RTX 4090 per the canonical command in full_discharge_blocks_on (apr run / apr eval / apr bench / apr qa). This PR does NOT promote PARTIAL_ALGORITHM_LEVEL → DISCHARGED; that needs the LIVE evidence files.

Companion: scripts/ship-discharges/ship-XXX-discharge.sh dispatch scripts authored in parallel by sub-agent (separate PR).

Test plan:
- [x] pv validate contracts/qwen2-e2e-verification-v1.yaml → 0 errors
- [x] pv validate contracts/apr-model-qa-v1.yaml → 0 errors
- [x] pv validate contracts/chat-template-v1.yaml → 0 errors
- [x] No code changes; production hot paths byte-unchanged

Refs PMAT-CCPA, SHIP-007 §22, M-FFN-GGUF-5 PR #1550.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…scharges

After SHIP-007 §22 upstream blocker resolved (PR #1550 merged 2026-05-07), SHIP-002/005/006/007/008 are LIVE-dispatch-ready. Each script runs the canonical command from its contract's `full_discharge_blocks_on:` clause, parses output, emits evidence JSON, and prints Pass/Fail verdict.

Scripts (978 LOC total, all bashrs lint clean):
- ship-002-discharge.sh: `apr run` + python AST parse, 0 syntax errors
- ship-005-discharge.sh: 3 HumanEval runs (seed=0), median pass@1 ≥ 86.00% (or ≥ 84.80% with 1.2 pp noise allowance)
- ship-006-discharge.sh: `apr qa --json`, all 8 gates pass
- ship-007-discharge.sh: `apr bench`, median ≥ 30.0 tok/s on RTX 4090
- ship-008-discharge.sh: `apr run --print-prompt`, byte-exact ChatML golden

Each script:
- Defaults to /mnt/nvme-raid0/targets/aprender/release/apr (lambda-vector canonical), accepts --apr-binary and --model overrides
- Writes canonical evidence/ship-XXX-full-discharge/discharge-evidence-v1.json matching the format used by SHIP-001/003/004 (already DISCHARGED)
- Exits 0 on Pass, 1 on Fail; preflight rejects bad apr-binary / missing jq
- Strict shell hygiene: set -euo pipefail, quoted vars, mktemp with EXIT trap

.bashrsignore updated with audited SEC001 suppression: false positive on the literal substring "eval" in `apr eval` (apr-cli HumanEval subcommand, not the bash `eval` builtin).

Includes top-level README.md documenting the dispatch matrix, operator workflow, and prerequisites.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
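The SHIP-005 criterion and the shell-hygiene points above can be illustrated with a minimal sketch. This is not the shipped `ship-005-discharge.sh`: the function names `median3`, `verdict`, and `write_evidence`, the JSON field names, and the sample pass@1 values are all hypothetical. It also assumes the noise allowance is applied by lowering the threshold (86.00 minus 1.2 pp), and it writes JSON with `printf` rather than `jq` only to keep the example dependency-free.

```shell
#!/usr/bin/env bash
# Sketch only, NOT the shipped ship-005-discharge.sh. The function names and
# the JSON field names are hypothetical.
set -euo pipefail

# Scratch directory removed on exit, mirroring "mktemp with EXIT trap".
workdir="$(mktemp -d)"
trap 'rm -rf "$workdir"' EXIT

# Median of exactly three numeric values: sort and take the middle line.
median3() {
  printf '%s\n' "$1" "$2" "$3" | LC_ALL=C sort -n | sed -n '2p'
}

# Print Pass if median >= threshold - allowance (all in percent), else Fail.
verdict() {
  local median="$1" threshold="$2" allowance="$3"
  if awk -v m="$median" -v t="$threshold" -v a="$allowance" \
       'BEGIN { exit !(m >= t - a) }'; then
    echo "Pass"
  else
    echo "Fail"
  fi
}

# Write a one-line evidence record (field names are illustrative only).
write_evidence() {
  local out="$1" median="$2" result="$3"
  mkdir -p "$(dirname "$out")"
  printf '{"gate":"SHIP-005","median_pass_at_1":%s,"verdict":"%s"}\n' \
    "$median" "$result" > "$out"
}

# Made-up pass@1 percentages from three runs, for illustration only.
m="$(median3 85.98 87.20 86.59)"
r="$(verdict "$m" 86.00 1.2)"
write_evidence "$workdir/evidence/discharge-evidence-v1.json" "$m" "$r"
echo "median=$m verdict=$r"
```

A real script would end with `exit 0` on Pass and `exit 1` on Fail, as the commit message describes; the sketch leaves that out so the functions can be reused.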
1aca028 to 1021a67
noahgift added a commit that referenced this pull request on May 7, 2026
…scharges (#1555)
Summary
After the SHIP-007 §22 upstream blocker was resolved (aprender PR #1550, merged 2026-05-07), 5 MODEL-1 PARTIAL discharges (SHIP-002/005/006/007/008) became LIVE-dispatch-ready. This PR ships 5 bash scripts in `scripts/ship-discharges/` so an operator can flip each contract's `discharge_status: PARTIAL_ALGORITHM_LEVEL` → `DISCHARGED` by running one command per gate.

This PR also includes 1 already-merged commit on the branch (`f3a5a2be5`) that adds the upstream-blocker evidence pins to the 3 contract YAMLs.

What each script does

Each script:
- Runs the canonical command from its contract's `full_discharge_blocks_on:` clause
- Parses the output and prints a Pass/Fail verdict
- Writes `evidence/ship-XXX-full-discharge/discharge-evidence-v1.json` (matches the format used by already-DISCHARGED SHIP-001/003/004)

| Script | Contract | Gate | Pass criterion |
|---|---|---|---|
| `ship-002-discharge.sh` | `qwen2-e2e-verification-v1.yaml` | `FALSIFY-QW2E-SHIP-002` | `def fib(n):` completion parses, 0 syntax errors via ruff/rustpython/python3 |
| `ship-005-discharge.sh` | `qwen2-e2e-verification-v1.yaml` | `FALSIFY-QW2E-SHIP-005` | 3 HumanEval runs (seed=0), median pass@1 ≥ 86.00% (or ≥ 84.80% with 1.2 pp noise allowance) |
| `ship-006-discharge.sh` | `apr-model-qa-v1.yaml` | `FALSIFY-QA-SHIP-006` | All 8 `apr qa` gates report `"pass": true` |
| `ship-007-discharge.sh` | `qwen2-e2e-verification-v1.yaml` | `FALSIFY-QW2E-SHIP-007` | `apr bench` median ≥ 30.0 tok/s on RTX 4090 |
| `ship-008-discharge.sh` | `chat-template-v1.yaml` | `GATE-CHAT-SHIP-008` | `apr run --print-prompt` output is byte-exact with the ChatML golden |

Each accepts `--apr-binary` and `--model` overrides; defaults to `/mnt/nvme-raid0/targets/aprender/release/apr` (lambda-vector canonical) and the canonical 7B teacher.

Files added

- `scripts/ship-discharges/ship-002-discharge.sh` (178 LOC)
- `scripts/ship-discharges/ship-005-discharge.sh` (184 LOC)
- `scripts/ship-discharges/ship-006-discharge.sh` (148 LOC)
- `scripts/ship-discharges/ship-007-discharge.sh` (175 LOC)
- `scripts/ship-discharges/ship-008-discharge.sh` (189 LOC)
- `scripts/ship-discharges/README.md` (105 LOC): dispatch matrix + operator workflow
- `.bashrsignore`: adds audited SEC001 suppression for the literal substring "eval" in `apr eval` (apr-cli HumanEval subcommand, not the bash `eval` builtin)

Total: 987 LOC additive. No source code or test changes.

Quality gates

- `bash -n` syntax check
- `bashrs lint` with 0 errors (warnings are style hints, no blockers)
- `shellcheck -S error`
- `--help` smoke test (prints comment header)

Test plan

- On `noah-Lambda-Vector` (RTX 4090, lambda-labs): `bash scripts/ship-discharges/ship-002-discharge.sh` (and 005/006/007/008)
- Evidence file: `evidence/ship-XXX-full-discharge/discharge-evidence-v1.json`
- Contract `discharge_status: PARTIAL_ALGORITHM_LEVEL` → `DISCHARGED`
- `DISCHARGED` status (mirrors SHIP-001/003/004 pattern)

Notes
The scripts are NOT actually run in this PR — they're for the operator to dispatch on the canonical lambda-labs RTX 4090 host. The PR's purpose is to ship the dispatch surface.
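For the operator dispatch itself, a small wrapper along the following lines could run all five gates in one pass and summarize the verdicts. This is a sketch, not part of the PR: `dispatch_all` is a hypothetical helper, and it relies only on the behavior stated above (scripts named `ship-XXX-discharge.sh`, exit 0 on Pass and non-zero on Fail).

```shell
#!/usr/bin/env bash
# Sketch only: a hypothetical wrapper, not one of the files added by this PR.
set -euo pipefail

# Run each discharge script found in directory $1 and print one verdict per
# gate based on its exit code. Returns 1 if any gate failed or is missing.
dispatch_all() {
  local dir="$1" failed=0 gate script
  for gate in 002 005 006 007 008; do
    script="$dir/ship-$gate-discharge.sh"
    if [ ! -f "$script" ]; then
      echo "SHIP-$gate: Missing"
      failed=1
      continue
    fi
    # Script output is discarded here; a real run would likely keep the logs.
    if bash "$script" >/dev/null 2>&1; then
      echo "SHIP-$gate: Pass"
    else
      echo "SHIP-$gate: Fail"
      failed=1
    fi
  done
  return "$failed"
}
```

On the canonical host this would be invoked as `dispatch_all scripts/ship-discharges`. Running the gates one at a time, as the test plan describes, stays the safer default because each script writes its own evidence file.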
🤖 Generated with Claude Code