feat(falsify-ship-001): MODEL-1 realizar::Model::load_safetensors PARTIAL discharge (10/10)#1030
Merged
Merged
Conversation
4 tasks
a26a1ed to
aef83f8
Compare
Stacked atop SHIP-003 (f9c2d47) + SHIP-004 (5f1db6a). Pushed as safety net before /tmp clears — NOT PR-ready yet. Contents: - crates/aprender-core/src/format/ship_001.rs (NEW): 3 pure verdict fns + 3/3 tests green locally. - crates/aprender-core/src/format/mod.rs: adds `pub mod ship_001`. - contracts/qwen2-e2e-verification-v1.yaml: speculative v1.4.0→v1.5.0 bump. Known follow-up before opening PR in next session: - Rebase onto main (now at 651e07b / post-SHIP-010) — main already carries publish-manifest-v1 v1.4.0 at SHIP-010, so the qwen2-e2e YAML bump here must be renumbered based on the landing order against current main. - Stack-push sequence per memory `project_ship_two_001_session_wrap_20260423.md`: SHIP-003 (task #162) → SHIP-004 (#164) → SHIP-001 (#165). - Full discharge of SHIP-001 blocks on live 3-run reproducible-build harness with sha256 manifest diff on RTX 4090 host. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Completes the MODEL-1 compute-free PARTIAL coverage at 10/10 touched. Stacked follow-up to SHIP-003 (#1028) + SHIP-004 (#1029). Spec changes: - **Version:** 2.31.0 → 2.32.0 - Date line appended with v2.32.0 entry describing the three pure verdict fns in `crates/aprender-core/src/format/ship_001.rs`, the three bound constants (AC_SHIP1_001_SAFETENSORS_HEADER_PREFIX_LEN = 8, AC_SHIP1_001_SAFETENSORS_JSON_OPEN_BYTE = 0x7B, Result-boundary), the triple mutation survey (Result × header-size × open-byte), `cargo test -p aprender-core --lib format::ship_001` green (3/3), full-discharge blocker (live `realizar::Model::load_safetensors` on RTX 4090 with `--features cuda`), MODEL-1 9/10 → 10/10 coverage, and the 16 PARTIAL + 3 DISCHARGED aggregate count across both models. - §4.2 AC-SHIP1-001 row annotated **(PARTIAL_ALGORITHM_LEVEL v2.32.0)**. Completes MODEL-1 to 10/10 touched; only AC-SHIP1-009 (license / provenance metadata) remains pending in the MODEL-1 table. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
aef83f8 to
086f9c0
Compare
5 tasks
noahgift
added a commit
that referenced
this pull request
Apr 25, 2026
…eacher safetensors + YAML backfill SHIP-TWO-001 spec v2.52.0 → v2.53.0: FALSIFY-QW2E-SHIP-001 / FALSIFY-SHIP-001 (AC-SHIP1-001) flipped PARTIAL_ALGORITHM_LEVEL → DISCHARGED on noah-Lambda-Vector RTX 4090. Third MODEL-1 PARTIAL → DISCHARGED of the cycle (after SHIP-009 PR #1054 + SHIP-010 PR #1055). Two-in-one PR: 1. **Backfill missing FALSIFY-QW2E-SHIP-001 YAML block.** PR #1030 added the Rust verdict fns at `crates/aprender-core/src/format/ship_001.rs` + claimed v1.6.0 wired the YAML entry, but the actual `falsification_tests` block was never written to disk. This PR closes that gap by adding the block at `qwen2-e2e-verification-v1.yaml` v1.7.0 → v1.8.0. 2. **Promote directly to DISCHARGED with live evidence.** Skip the PARTIAL state because both algorithm proof (the three triple- verdict fns from v1.6.0: verdict_from_load_result, verdict_from_safetensors_header_size, verdict_from_safetensors_json_open_byte + 2 byte-literal constants AC_SHIP1_001_SAFETENSORS_HEADER_PREFIX_LEN=8 + AC_SHIP1_001_SAFETENSORS_JSON_OPEN_BYTE=0x7B) AND live evidence (apr inspect on the canonical teacher safetensors) exist concurrently. Live discharge evidence (noah-Lambda-Vector RTX 4090): $ apr inspect /mnt/nvme-raid0/models/ship-two-001/qwen2.5-coder-7b-instruct-q4k.safetensors --json { "architecture": "qwen2", "format": "SafeTensors", "file_size": 15231938404, "tensor_count": 339, "total_params": 7615616512, ... } apr inspect exit 0 + format=SafeTensors + tensor_count=339 + total_params=7,615,616,512 (Qwen2.5-Coder-7B canonical counts) proves Model::load_safetensors returned Ok(_) end-to-end on the 15.23 GB shipped artifact. Err(_) would have surfaced as non-zero exit + error JSON. Files changed: - contracts/qwen2-e2e-verification-v1.yaml v1.7.0 → v1.8.0 Added FALSIFY-QW2E-SHIP-001 falsification_tests block at discharge_status: DISCHARGED with full discharged_evidence block (host=noah-Lambda-Vector, command, file_size_bytes=15231938404, tensor_count=339, total_params=7615616512, architecture=qwen2, overall=PASS, evidence_discharged_by_live array). - crates/aprender-core/src/format/ship_001.rs Added drift-prevention test `falsify_ship_001_yaml_binding_pins_discharged_status` that parses qwen2-e2e-verification-v1.yaml, locates the FALSIFY-QW2E-SHIP-001 block, and asserts: * Block exists (catches the YAML backfill regression) * discharge_status == "DISCHARGED" * ship_blocking == true * discharged_evidence.host == "noah-Lambda-Vector" * discharged_evidence.overall == "PASS" * discharged_evidence.tensor_count == 339 * discharged_evidence.total_params == 7,615,616,512 * discharged_evidence.evidence_discharged_by_live non-empty - docs/specifications/aprender-train/ship-two-models-spec.md v2.52.0 → v2.53.0 with full atomic-next-action narrative. Coverage tally: 39 PARTIAL + 6 DISCHARGED → 38 + 7. Note: PR #1054 (SHIP-009) + PR #1055 (SHIP-010) bump to the same v2.53.0 simultaneously; last to merge rebases. - evidence/ship-001-full-discharge/discharge-evidence-v1.json (NEW) Self-contained discharge summary with all metadata, command, verification chain, tooling-chain proof, discharge rationale. - evidence/ship-001-full-discharge/inspect-safetensors.json (NEW) Raw `apr inspect --json` output from the canonical teacher safetensors path on noah-Lambda-Vector. Verification (all green): - cargo test -p aprender-core --lib ship_001 — 5/5 passes (3 algorithm + 1 gate + 1 new YAML binding) - pv validate contracts/qwen2-e2e-verification-v1.yaml — PASS - Live `apr inspect <teacher>.safetensors --json` exit 0 + all expected fields present Methodological note: zero `eprintln!`, zero bash workaround. Pure `apr inspect` end-to-end on a 15.23 GB shipped artifact. Honors `feedback_apr_trace_not_eprintln.md` and `feedback_pv_not_bash_for_contracts.md`. Mirrors the SHIP-009 / SHIP-010 pattern. Memory: feedback_compute_pre_authorized.md (lambda-labs lane pre-authorized), reference_lambda_labs_host_locality.md (this host IS lambda-labs). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
noahgift
added a commit
that referenced
this pull request
Apr 25, 2026
…eacher safetensors + YAML backfill (#1056) SHIP-TWO-001 spec v2.52.0 → v2.53.0: FALSIFY-QW2E-SHIP-001 / FALSIFY-SHIP-001 (AC-SHIP1-001) flipped PARTIAL_ALGORITHM_LEVEL → DISCHARGED on noah-Lambda-Vector RTX 4090. Third MODEL-1 PARTIAL → DISCHARGED of the cycle (after SHIP-009 PR #1054 + SHIP-010 PR #1055). Two-in-one PR: 1. **Backfill missing FALSIFY-QW2E-SHIP-001 YAML block.** PR #1030 added the Rust verdict fns at `crates/aprender-core/src/format/ship_001.rs` + claimed v1.6.0 wired the YAML entry, but the actual `falsification_tests` block was never written to disk. This PR closes that gap by adding the block at `qwen2-e2e-verification-v1.yaml` v1.7.0 → v1.8.0. 2. **Promote directly to DISCHARGED with live evidence.** Skip the PARTIAL state because both algorithm proof (the three triple- verdict fns from v1.6.0: verdict_from_load_result, verdict_from_safetensors_header_size, verdict_from_safetensors_json_open_byte + 2 byte-literal constants AC_SHIP1_001_SAFETENSORS_HEADER_PREFIX_LEN=8 + AC_SHIP1_001_SAFETENSORS_JSON_OPEN_BYTE=0x7B) AND live evidence (apr inspect on the canonical teacher safetensors) exist concurrently. Live discharge evidence (noah-Lambda-Vector RTX 4090): $ apr inspect /mnt/nvme-raid0/models/ship-two-001/qwen2.5-coder-7b-instruct-q4k.safetensors --json { "architecture": "qwen2", "format": "SafeTensors", "file_size": 15231938404, "tensor_count": 339, "total_params": 7615616512, ... } apr inspect exit 0 + format=SafeTensors + tensor_count=339 + total_params=7,615,616,512 (Qwen2.5-Coder-7B canonical counts) proves Model::load_safetensors returned Ok(_) end-to-end on the 15.23 GB shipped artifact. Err(_) would have surfaced as non-zero exit + error JSON. Files changed: - contracts/qwen2-e2e-verification-v1.yaml v1.7.0 → v1.8.0 Added FALSIFY-QW2E-SHIP-001 falsification_tests block at discharge_status: DISCHARGED with full discharged_evidence block (host=noah-Lambda-Vector, command, file_size_bytes=15231938404, tensor_count=339, total_params=7615616512, architecture=qwen2, overall=PASS, evidence_discharged_by_live array). - crates/aprender-core/src/format/ship_001.rs Added drift-prevention test `falsify_ship_001_yaml_binding_pins_discharged_status` that parses qwen2-e2e-verification-v1.yaml, locates the FALSIFY-QW2E-SHIP-001 block, and asserts: * Block exists (catches the YAML backfill regression) * discharge_status == "DISCHARGED" * ship_blocking == true * discharged_evidence.host == "noah-Lambda-Vector" * discharged_evidence.overall == "PASS" * discharged_evidence.tensor_count == 339 * discharged_evidence.total_params == 7,615,616,512 * discharged_evidence.evidence_discharged_by_live non-empty - docs/specifications/aprender-train/ship-two-models-spec.md v2.52.0 → v2.53.0 with full atomic-next-action narrative. Coverage tally: 39 PARTIAL + 6 DISCHARGED → 38 + 7. Note: PR #1054 (SHIP-009) + PR #1055 (SHIP-010) bump to the same v2.53.0 simultaneously; last to merge rebases. - evidence/ship-001-full-discharge/discharge-evidence-v1.json (NEW) Self-contained discharge summary with all metadata, command, verification chain, tooling-chain proof, discharge rationale. - evidence/ship-001-full-discharge/inspect-safetensors.json (NEW) Raw `apr inspect --json` output from the canonical teacher safetensors path on noah-Lambda-Vector. Verification (all green): - cargo test -p aprender-core --lib ship_001 — 5/5 passes (3 algorithm + 1 gate + 1 new YAML binding) - pv validate contracts/qwen2-e2e-verification-v1.yaml — PASS - Live `apr inspect <teacher>.safetensors --json` exit 0 + all expected fields present Methodological note: zero `eprintln!`, zero bash workaround. Pure `apr inspect` end-to-end on a 15.23 GB shipped artifact. Honors `feedback_apr_trace_not_eprintln.md` and `feedback_pv_not_bash_for_contracts.md`. Mirrors the SHIP-009 / SHIP-010 pattern. Memory: feedback_compute_pre_authorized.md (lambda-labs lane pre-authorized), reference_lambda_labs_host_locality.md (this host IS lambda-labs). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Wires FALSIFY-QW2E-SHIP-001 (AC-SHIP1-001,
realizar::Model::load_safetensors(path)returnsOk(_)) atPARTIAL_ALGORITHM_LEVELby binding three independent pure verdict fns incrates/aprender-core/src/format/ship_001.rs:verdict_from_load_result(ok_flag: bool) -> Ship001Verdict— Result-boundary collapse to Pass-on-Okverdict_from_safetensors_header_size(header_size: u64, file_len: u64) -> Ship001Verdict— header-size invariant0 < N <= file_len − 8bound atAC_SHIP1_001_SAFETENSORS_HEADER_PREFIX_LEN = 8verdict_from_safetensors_json_open_byte(byte: u8) -> Ship001Verdict— byte-literal check atAC_SHIP1_001_SAFETENSORS_JSON_OPEN_BYTE = b'{' = 0x7BConstants pinned to the canonical safetensors layout (little-endian u64 prefix + JSON object header). Triple mutation survey covers the Result boundary, header-size bounds (zero / overflow / off-by-one at
file_len − 8), and the JSON open-byte (adjacent-byte Fail + exhaustive 0..=255 sweep). 3/3 tests green locally viacargo test -p aprender-core --lib format::ship_001.MODEL-1 coverage 9/10 → 10/10 touched — tenth compute-free MODEL-1 PARTIAL lever, completes MODEL-1 to 10/10 touched; only AC-SHIP1-009 (license/provenance metadata) remains pending. Second triple-verdict decomposition after SHIP-004.
Full discharge blocks on live
realizar::Model::load_safetensors(paiml/qwen2.5-coder-7b-apache-q4k-v1.safetensors)on RTX 4090 with--features cuda(or equivalent realizar binary).Stacking note
Stacked on:
feat/falsify-ship-003-partial-discharge(PR feat(falsify-ship-003): MODEL-1 apr convert q4_k_m per-layer cos ≥ 0.999 PARTIAL discharge (8/10) #1028, SHIP-003 / contract v1.4.0 / spec v2.30.0)feat/falsify-ship-004-partial-discharge(PR feat(falsify-ship-004): MODEL-1 apr export gguf → llama.cpp PARTIAL discharge (9/10) #1029, SHIP-004 / contract v1.5.0 / spec v2.31.0)This PR bumps
contracts/qwen2-e2e-verification-v1.yamlv1.5.0 → v1.6.0 anddocs/specifications/aprender-train/ship-two-models-spec.mdv2.31.0 → v2.32.0. Aggregate count across both models: 16 PARTIAL + 3 DISCHARGED.Merge order expected: #1028 → #1029 → this one. GitHub will auto-rebase onto current main after #1028 + #1029 land.
Changes
crates/aprender-core/src/format/ship_001.rs(NEW, 423 lines)crates/aprender-core/src/format/mod.rs— addspub mod ship_001;contracts/qwen2-e2e-verification-v1.yaml— v1.5.0 → v1.6.0 description paragraphdocs/specifications/aprender-train/ship-two-models-spec.md— Version 2.31.0 → 2.32.0, v2.32.0 changelog entry, §4.2 AC-SHIP1-001 row annotatedTest plan
cargo test -p aprender-core --lib format::ship_001→ 3 passedcargo run -p aprender-contracts-cli --bin pv -- validate contracts/qwen2-e2e-verification-v1.yaml→ 0 errorsci / gate+workspace-test) passes🤖 Generated with Claude Code