
feat(ship-discharges): 5 live-dispatch scripts for MODEL-1 PARTIAL discharges #1555

Merged
noahgift merged 2 commits into main from feat/ship-discharges-5-live-dispatch-scripts on May 7, 2026

@noahgift noahgift commented May 7, 2026


Summary

After the SHIP-007 §22 upstream blocker was resolved (aprender PR #1550 merged 2026-05-07), 5 MODEL-1 PARTIAL discharges (SHIP-002/005/006/007/008) became LIVE-dispatch-ready. This PR ships 5 bash scripts in scripts/ship-discharges/ so an operator can flip each contract's discharge_status: PARTIAL_ALGORITHM_LEVEL → DISCHARGED by running one command per gate.

This PR also includes 1 already-merged commit on the branch (f3a5a2be5) that adds the upstream-blocker evidence pins to the 3 contract YAMLs.

What each script does

Each script:

  1. Runs the canonical command from its contract's full_discharge_blocks_on: clause
  2. Parses output to determine Pass/Fail
  3. Writes evidence to evidence/ship-XXX-full-discharge/discharge-evidence-v1.json (matches the format used by already-DISCHARGED SHIP-001/003/004)
  4. Prints Pass/Fail and exits 0 / 1

| Script | Contract | Falsification | Pass criterion |
| --- | --- | --- | --- |
| ship-002-discharge.sh | qwen2-e2e-verification-v1.yaml | FALSIFY-QW2E-SHIP-002 | `def fib(n):` completion parses, 0 syntax errors via ruff/rustpython/python3 |
| ship-005-discharge.sh | qwen2-e2e-verification-v1.yaml | FALSIFY-QW2E-SHIP-005 | Median HumanEval pass@1 ≥ 86.00% (3 runs, seed=0); accepts ≥ 84.80% within 1.2 pp noise allowance |
| ship-006-discharge.sh | apr-model-qa-v1.yaml | FALSIFY-QA-SHIP-006 | All 8 apr qa gates report `"pass": true` |
| ship-007-discharge.sh | qwen2-e2e-verification-v1.yaml | FALSIFY-QW2E-SHIP-007 | Median decode ≥ 30.0 tok/s across 5 iterations on RTX 4090 |
| ship-008-discharge.sh | chat-template-v1.yaml | GATE-CHAT-SHIP-008 | Rendered ChatML prompt byte-exact to spec golden |
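
Two of the pass criteria (SHIP-005 and SHIP-007) reduce to "median of N runs is at least a threshold". The following is a minimal sketch of that check, not the code in the shipped scripts: the function names `median`, `at_least`, and `median_verdict` are hypothetical, and the sample numbers in the usage note are made up for illustration.

```shell
# Hypothetical helpers; names and layout are illustrative only.

# Print the median of the numeric arguments (odd or even count).
median() {
  [ "$#" -gt 0 ] || return 1
  printf '%s\n' "$@" | LC_ALL=C sort -n | awk '
    { v[NR] = $1 }
    END {
      if (NR % 2) print v[(NR + 1) / 2]
      else        print (v[NR / 2] + v[NR / 2 + 1]) / 2
    }'
}

# Succeed when value >= threshold, compared as floating point.
at_least() {
  awk -v v="$1" -v t="$2" 'BEGIN { exit !(v + 0 >= t + 0) }'
}

# Usage: median_verdict THRESHOLD VALUE...
# Prints Pass or Fail and returns 0 or 1 to match.
median_verdict() {
  local threshold=$1 m
  shift
  m=$(median "$@") || return 2
  if at_least "$m" "$threshold"; then
    echo "Pass"
  else
    echo "Fail"
    return 1
  fi
}
```

Under these assumptions, `median_verdict 84.80 87.2 85.4 86.0` would apply the SHIP-005 acceptance floor to three pass@1 percentages, and `median_verdict 30.0 ...` would apply the SHIP-007 floor to five tok/s readings. The comparison is done in awk because bash arithmetic is integer-only.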

Each accepts --apr-binary and --model overrides; defaults to /mnt/nvme-raid0/targets/aprender/release/apr (lambda-vector canonical) and the canonical 7B teacher.
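
A minimal sketch of how the two overrides and the preflight check could be handled is shown here. It is an assumption about structure, not an excerpt from the scripts: `parse_args` and `preflight` are hypothetical names, and the `MODEL` default is a placeholder because the PR identifies the model only as "the canonical 7B teacher".

```shell
# Default binary path as quoted in the PR description.
APR_BINARY="/mnt/nvme-raid0/targets/aprender/release/apr"
# Hypothetical placeholder; the real default model path is not given in the PR.
MODEL="/path/to/canonical-7b-teacher"

# Parse --apr-binary PATH and --model PATH; reject anything else.
parse_args() {
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --apr-binary|--model)
        if [ "$#" -lt 2 ]; then
          echo "missing value for $1" >&2
          return 1
        fi
        if [ "$1" = "--apr-binary" ]; then APR_BINARY=$2; else MODEL=$2; fi
        shift 2
        ;;
      *)
        echo "unknown flag: $1" >&2
        return 1
        ;;
    esac
  done
}

# Fail early when the apr binary is missing or not executable.
preflight() {
  if [ ! -x "$APR_BINARY" ]; then
    echo "apr binary not found or not executable: $APR_BINARY" >&2
    return 1
  fi
}
```

A script built this way would call `parse_args "$@" || exit 1` followed by `preflight || exit 1` before running any `apr` command, so a bad flag or path stops the run with exit 1 before any evidence is written.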

Files added

  • scripts/ship-discharges/ship-002-discharge.sh (178 LOC)
  • scripts/ship-discharges/ship-005-discharge.sh (184 LOC)
  • scripts/ship-discharges/ship-006-discharge.sh (148 LOC)
  • scripts/ship-discharges/ship-007-discharge.sh (175 LOC)
  • scripts/ship-discharges/ship-008-discharge.sh (189 LOC)
  • scripts/ship-discharges/README.md (105 LOC) — dispatch matrix + operator workflow
  • .bashrsignore — adds audited SEC001 suppression for the literal substring "eval" in apr eval (apr-cli HumanEval subcommand, not the bash eval builtin)

Total: 987 LOC additive. No source code or test changes.

Quality gates

  • All 5 scripts pass bash -n syntax check
  • All 5 scripts pass bashrs lint with 0 errors (warnings are style hints, no blockers)
  • All 5 scripts pass shellcheck -S error
  • All 5 scripts pass --help smoke test (prints comment header)
  • Preflight rejects bogus flags / missing apr binary with exit 1
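
The first and third gates above can be rerun locally with a short loop. This is a sketch under assumptions: `check_scripts` is a hypothetical helper, it skips shellcheck when the tool is not installed, and it leaves out `bashrs lint` because the PR does not state how that tool's exit code treats warnings.

```shell
# Run bash -n and shellcheck -S error over every discharge script in a directory.
# Returns non-zero if any script fails a check or no scripts are found.
check_scripts() {
  local dir=$1 status=0 script
  for script in "$dir"/ship-*-discharge.sh; do
    # An unmatched glob stays literal, so test for existence first.
    if [ ! -e "$script" ]; then
      echo "no scripts found in $dir" >&2
      return 1
    fi
    bash -n "$script" || status=1
    if command -v shellcheck >/dev/null 2>&1; then
      shellcheck -S error "$script" || status=1
    else
      echo "skip shellcheck (not installed): $script" >&2
    fi
  done
  return "$status"
}
```

Run from the repository root as `check_scripts scripts/ship-discharges`.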

Test plan

  • Operator runs each script on noah-Lambda-Vector (RTX 4090, lambda-labs):
    bash scripts/ship-discharges/ship-002-discharge.sh (and 005/006/007/008)
  • Each script writes evidence/ship-XXX-full-discharge/discharge-evidence-v1.json
  • Operator updates each contract YAML's discharge_status: PARTIAL_ALGORITHM_LEVEL → DISCHARGED
  • Operator amends contract changelog with the discharge session metadata
  • Drift-prevention test asserts DISCHARGED status (mirrors SHIP-001/003/004 pattern)
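
The first two steps of the plan could be driven by a loop like the one sketched here. `dispatch_all` is a hypothetical wrapper, and it assumes the `XXX` in the evidence path is replaced by the gate number (for example `ship-002-full-discharge`), which the PR implies but does not spell out.

```shell
# Usage: dispatch_all [SCRIPTS_DIR] [EVIDENCE_ROOT]
# Runs each discharge script and confirms a non-empty evidence file exists.
dispatch_all() {
  local scripts_dir=${1:-scripts/ship-discharges}
  local evidence_root=${2:-evidence}
  local failed=0 gate evidence
  for gate in 002 005 006 007 008; do
    # Assumed path layout: XXX substituted with the gate number.
    evidence="$evidence_root/ship-$gate-full-discharge/discharge-evidence-v1.json"
    if bash "$scripts_dir/ship-$gate-discharge.sh" && [ -s "$evidence" ]; then
      echo "SHIP-$gate: Pass, evidence at $evidence"
    else
      echo "SHIP-$gate: Fail or evidence missing" >&2
      failed=1
    fi
  done
  return "$failed"
}
```

The loop keeps going after a failure so that one failing gate does not hide the results of the others. The YAML status flip and changelog amendment remain manual operator steps.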

Notes

The scripts are NOT actually run in this PR — they're for the operator to dispatch on the canonical lambda-labs RTX 4090 host. The PR's purpose is to ship the dispatch surface.

🤖 Generated with Claude Code

@noahgift noahgift enabled auto-merge (squash) May 7, 2026 06:54
noahgift and others added 2 commits May 7, 2026 09:23
…ns on 5 MODEL-1 PARTIAL discharges

After M-FFN-GGUF-5 fix MERGED on aprender main 2026-05-07 (PR #1550
squash e856eb9), the §27 layer-3 ffn_swigl APR-vs-GGUF divergence
is closed: live H1 CONFIRMED at layer-3 ratio 1.245× (was 18.23×
pre-methodology-fix). 5 MODEL-1 PARTIAL discharges become live-
dispatch-ready: SHIP-002, SHIP-005, SHIP-006, SHIP-007, SHIP-008.

This PR adds evidence-pin annotations to each of the 3 contracts that
hold those discharges, citing PR #1550 as upstream §22 blocker
resolution. Pure additive YAML — no behavioral or test changes.

Contracts touched (3 contracts × 5 ACs):
- contracts/qwen2-e2e-verification-v1.yaml v1.10.0 → v1.11.0
  (FALSIFY-QW2E-SHIP-002, FALSIFY-QW2E-SHIP-005, FALSIFY-QW2E-SHIP-007)
- contracts/apr-model-qa-v1.yaml v1.2.0 → v1.3.0
  (FALSIFY-QA-SHIP-006)
- contracts/chat-template-v1.yaml v1.1.0 → v1.2.0
  (GATE-CHAT-SHIP-008)

Each contract's full_discharge_blocks_on clause now includes:
"Upstream blocker SHIP-007 §22 RESOLVED 2026-05-07 (aprender PR #1550
squash e856eb9; M-FFN-GGUF-5 fix); live discharge is now dispatch-
ready — no further upstream blockers."

This is bookkeeping work that captures the cascade outcome in the
contract surface so the next operator-dispatched LIVE-run session
has the citation ready. Each individual discharge still requires its
own LIVE run on RTX 4090 per the canonical command in
full_discharge_blocks_on (apr run / apr eval / apr bench / apr qa).
This PR does NOT promote PARTIAL_ALGORITHM_LEVEL → DISCHARGED — that
needs the LIVE evidence files.

Companion: scripts/ship-discharges/ship-XXX-discharge.sh dispatch
scripts authored in parallel by sub-agent (separate PR).

Test plan:
- [x] pv validate contracts/qwen2-e2e-verification-v1.yaml → 0 errors
- [x] pv validate contracts/apr-model-qa-v1.yaml → 0 errors
- [x] pv validate contracts/chat-template-v1.yaml → 0 errors
- [x] No code changes; production hot paths byte-unchanged

Refs PMAT-CCPA, SHIP-007 §22, M-FFN-GGUF-5 PR #1550.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…scharges

After SHIP-007 §22 upstream blocker resolved (PR #1550 merged 2026-05-07),
SHIP-002/005/006/007/008 are LIVE-dispatch-ready. Each script runs the
canonical command from its contract's `full_discharge_blocks_on:` clause,
parses output, emits evidence JSON, and prints Pass/Fail verdict.

Scripts (978 LOC total, all bashrs lint clean):
- ship-002-discharge.sh — `apr run` + python AST parse, 0 syntax errors
- ship-005-discharge.sh — 3 HumanEval runs (seed=0), median pass@1 ≥ 86.00%
  (or ≥ 84.80% with 1.2 pp noise allowance)
- ship-006-discharge.sh — `apr qa --json`, all 8 gates pass
- ship-007-discharge.sh — `apr bench`, median ≥ 30.0 tok/s on RTX 4090
- ship-008-discharge.sh — `apr run --print-prompt`, byte-exact ChatML golden
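
For the byte-exact criterion in the last item, one way to compare a rendered prompt against a golden file is `cmp`. This is a sketch only: `byte_exact` is a hypothetical name, and both inputs are assumed to be files already on disk.

```shell
# Succeed only when the rendered prompt is byte-identical to the golden file.
byte_exact() {
  local rendered=$1 golden=$2
  if cmp -s "$rendered" "$golden"; then
    echo "Pass"
  else
    # Without -s, cmp reports the first differing byte and line number.
    cmp "$rendered" "$golden" >&2 || true
    echo "Fail"
    return 1
  fi
}
```

Comparing files rather than shell variables matters here: command substitution strips trailing newlines, so a prompt captured with `$(...)` could differ from the golden by a final newline and still compare equal.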

Each script:
- Defaults to /mnt/nvme-raid0/targets/aprender/release/apr (lambda-vector
  canonical), accepts --apr-binary and --model overrides
- Writes canonical evidence/ship-XXX-full-discharge/discharge-evidence-v1.json
  matching the format used by SHIP-001/003/004 (already DISCHARGED)
- Exits 0 on Pass, 1 on Fail; preflight rejects bad apr-binary / missing jq
- Strict shell hygiene: set -euo pipefail, quoted vars, mktemp with EXIT trap
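
The hygiene pattern in the last item can be sketched as below. `run_with_scratch` is a hypothetical wrapper written for illustration; the shipped scripts may set these options at the top of the file instead.

```shell
# Run a command with a scratch directory that is removed on exit,
# whether the command succeeds or fails.
# The function body is a subshell, so the options and trap stay local to it.
run_with_scratch() (
  set -euo pipefail
  scratch=$(mktemp -d)
  # Single quotes defer expansion until the trap fires.
  trap 'rm -rf "$scratch"' EXIT
  "$@" "$scratch"
)
```

A call such as `run_with_scratch my_step` passes the scratch path as the last argument to `my_step` (a placeholder name) and returns that command's exit status.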

.bashrsignore updated with audited SEC001 suppression — false positive on
the literal substring "eval" in `apr eval` (apr-cli HumanEval subcommand,
not the bash `eval` builtin).

Includes top-level README.md documenting the dispatch matrix, operator
workflow, and prerequisites.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@noahgift noahgift force-pushed the feat/ship-discharges-5-live-dispatch-scripts branch from 1aca028 to 1021a67 May 7, 2026 07:23
@noahgift noahgift merged commit 5abd613 into main May 7, 2026
10 checks passed
@noahgift noahgift deleted the feat/ship-discharges-5-live-dispatch-scripts branch May 7, 2026 07:48
noahgift added a commit that referenced this pull request May 7, 2026
…scharges (#1555)

* docs(contracts): SHIP-007 §22 upstream blocker RESOLVED — evidence pins on 5 MODEL-1 PARTIAL discharges

* feat(ship-discharges): 5 live-dispatch scripts for MODEL-1 PARTIAL discharges

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>