feat(falsify-ship-004): MODEL-1 apr export gguf → llama.cpp PARTIAL discharge (9/10) by noahgift · Pull Request #1029 · paiml/aprender

noahgift · 2026-04-23T17:14:24Z

Summary

SHIP-TWO-001 spec v2.30.0 → v2.31.0: 9th compute-free MODEL-1 PARTIAL lever (stacked on PR #1028 which is v2.30.0 / SHIP-003).

Binds AC-SHIP1-004 (`apr export --format gguf` loads in llama.cpp) to three independent pure verdict fns at `discharge_status: PARTIAL_ALGORITHM_LEVEL`:

- `verdict_from_llama_cli_exit(i32)`: POSIX zero-tolerance exit-code boundary (`AC_SHIP1_004_LLAMA_CLI_SUCCESS_EXIT_CODE = 0`)
- `verdict_from_gguf_magic_bytes(&[u8])`: canonical 4-byte `b"GGUF"` magic (`AC_SHIP1_004_GGUF_MAGIC_BYTES`)
- `verdict_from_gguf_version(u32)`: set membership over {2, 3} (`AC_SHIP1_004_GGUF_SUPPORTED_VERSIONS`), fail-closed above the band
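For readers without the diff open, the three verdict fns can be sketched roughly as follows. The constant names, the enum name, and the function signatures are taken from the commit message; the derives and the exact function bodies are assumptions made for illustration, not a copy of `ship_004.rs`.

```rust
/// Exit code treated as success at the llama-cli boundary.
pub const AC_SHIP1_004_LLAMA_CLI_SUCCESS_EXIT_CODE: i32 = 0;
/// Canonical 4-byte GGUF magic.
pub const AC_SHIP1_004_GGUF_MAGIC_BYTES: &[u8; 4] = b"GGUF";
/// GGUF container versions accepted by the gate.
pub const AC_SHIP1_004_GGUF_SUPPORTED_VERSIONS: &[u32] = &[2, 3];

// The derives are an assumption; they are only needed so verdicts can be compared.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Ship004Verdict {
    Pass,
    Fail,
}

/// Pass only for exit code 0; every other i32 is Fail.
pub fn verdict_from_llama_cli_exit(code: i32) -> Ship004Verdict {
    if code == AC_SHIP1_004_LLAMA_CLI_SUCCESS_EXIT_CODE {
        Ship004Verdict::Pass
    } else {
        Ship004Verdict::Fail
    }
}

/// Pass when the slice starts with b"GGUF"; slices shorter than 4 bytes Fail
/// because `starts_with` returns false for them.
pub fn verdict_from_gguf_magic_bytes(bytes: &[u8]) -> Ship004Verdict {
    if bytes.starts_with(AC_SHIP1_004_GGUF_MAGIC_BYTES.as_slice()) {
        Ship004Verdict::Pass
    } else {
        Ship004Verdict::Fail
    }
}

/// Pass only for versions in the supported set; anything else is Fail (fail-closed).
pub fn verdict_from_gguf_version(ver: u32) -> Ship004Verdict {
    if AC_SHIP1_004_GGUF_SUPPORTED_VERSIONS.contains(&ver) {
        Ship004Verdict::Pass
    } else {
        Ship004Verdict::Fail
    }
}
```

Keeping each fn pure (no I/O, no shared state) is what lets the algorithm-level gate run without a GPU or a llama.cpp install.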

Triple mutation survey (first MODEL-1 discharge with 3 independent fns):

- exit-code: 6 sections (POSIX success, adjacent-value Fail, classic failure bands {2, 127, 137, 139, 255}, i32 extrema, monotonicity sweep [-256, 256], provenance pin)
- magic: 6 sections (canonical, magic+version header, 6 single-byte flips, 4 short-slice lengths, wrong-format magics {`b"APR\0"`, `b"APRN"`, zeros}, 4-byte provenance pin)
- version: 5 sections (supported {2, 3}, predecessor {0, 1}, above-band {4, 5, 10, 100, 1M, u32::MAX}, exhaustive 0..=64, set-length pin)
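The sweep-style sections of the survey can be illustrated with a small self-contained sketch. The helper names below (`Verdict`, `exit_code_sweep_pass_count`, `all_single_byte_flips_fail`, `version_sweep_passes`) are hypothetical and are not the test names in `ship_004.rs`. The magic check here mutates every byte position to every other value, which is broader than the six hand-picked flips listed above.

```rust
// Hypothetical stand-ins for the verdict fns, kept local so the sketch compiles on its own.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Verdict {
    Pass,
    Fail,
}

fn verdict_from_exit(code: i32) -> Verdict {
    if code == 0 { Verdict::Pass } else { Verdict::Fail }
}

fn verdict_from_magic(bytes: &[u8]) -> Verdict {
    if bytes.starts_with(b"GGUF") { Verdict::Pass } else { Verdict::Fail }
}

fn verdict_from_version(ver: u32) -> Verdict {
    if [2u32, 3].contains(&ver) { Verdict::Pass } else { Verdict::Fail }
}

/// Sweep [-256, 256] and count how many exit codes are accepted.
/// A widened tolerance (for example `code <= 1`) would raise this count above 1.
fn exit_code_sweep_pass_count() -> usize {
    (-256i32..=256)
        .filter(|&c| verdict_from_exit(c) == Verdict::Pass)
        .count()
}

/// Replace each of the 4 magic bytes with every other byte value and
/// require that the mutated magic is rejected.
fn all_single_byte_flips_fail() -> bool {
    let canonical = *b"GGUF";
    (0usize..4).all(|i| {
        (0u8..=255).filter(|&b| b != canonical[i]).all(|b| {
            let mut mutated = canonical;
            mutated[i] = b;
            verdict_from_magic(&mutated) == Verdict::Fail
        })
    })
}

/// Exhaustive sweep 0..=64 returning the versions that Pass.
fn version_sweep_passes() -> Vec<u32> {
    (0u32..=64)
        .filter(|&v| verdict_from_version(v) == Verdict::Pass)
        .collect()
}
```

A mutant that loosens any one gate changes one of these three results, which is the property a mutation survey is meant to pin.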

Files changed (4):

- `crates/aprender-core/src/format/ship_004.rs` (new, 422 lines, 3 verdict fns + 3 falsify tests)
- `crates/aprender-core/src/format/mod.rs` (+6 lines module declaration)
- `contracts/qwen2-e2e-verification-v1.yaml` (v1.4.0 → v1.5.0, adds FALSIFY-QW2E-SHIP-004 entry alongside existing SHIP-003/SHIP-007)
- `docs/specifications/aprender-train/ship-two-models-spec.md` (v2.30.0 → v2.31.0, changelog + AC-SHIP1-004 table row annotated)

Coverage impact: MODEL-1 8/10 → 9/10 touched. 15 PARTIAL + 3 DISCHARGED across both models.

Full discharge blocks on: a live `apr export --format gguf paiml/qwen2.5-coder-7b-apache-q4k-v1.safetensors` on an RTX 4090, followed by shelling out to upstream `llama-cli -m qwen2.5-coder-7b.gguf --prompt "hello" --n-predict 4` and asserting that magic, version, and exit code each Pass.
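A rough sketch of what the live gate could look like once the export has been run. The helper names (`parse_gguf_prefix`, `all_gates_pass`, `run_live_gate`) are hypothetical. The sketch assumes the GGUF file has already been exported, that `llama-cli` is on `PATH`, and that the GGUF version is stored as a little-endian `u32` directly after the 4-byte magic. The `llama-cli` flags are the ones quoted in the paragraph above.

```rust
use std::fs::File;
use std::io::Read;
use std::path::Path;
use std::process::Command;

/// Split the first 8 bytes of a GGUF file into (magic, little-endian version).
/// Returns None when fewer than 8 bytes are available.
fn parse_gguf_prefix(bytes: &[u8]) -> Option<([u8; 4], u32)> {
    let magic: [u8; 4] = bytes.get(0..4)?.try_into().ok()?;
    let version_bytes: [u8; 4] = bytes.get(4..8)?.try_into().ok()?;
    Some((magic, u32::from_le_bytes(version_bytes)))
}

/// All three gates must Pass: magic, version, and exit code.
/// `exit_code` is None when the process was killed by a signal, which counts as Fail.
fn all_gates_pass(prefix: &[u8], exit_code: Option<i32>) -> bool {
    match parse_gguf_prefix(prefix) {
        Some((magic, version)) => {
            &magic == b"GGUF" && [2u32, 3].contains(&version) && exit_code == Some(0)
        }
        None => false,
    }
}

/// Read the header prefix from disk, run llama-cli on the file, and combine the gates.
/// Not exercised here because it needs a real model file and a llama.cpp install.
fn run_live_gate(gguf: &Path) -> std::io::Result<bool> {
    let mut prefix = [0u8; 8];
    File::open(gguf)?.read_exact(&mut prefix)?;
    let status = Command::new("llama-cli")
        .arg("-m")
        .arg(gguf)
        .args(["--prompt", "hello", "--n-predict", "4"])
        .status()?;
    Ok(all_gates_pass(&prefix, status.code()))
}
```

Treating a missing exit code as Fail is one way to avoid the `signal_exit_masked` counter-example class named in the contract.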

Test plan

- `cargo test -p aprender-core --lib format::ship_004`: 3 passed / 0 failed
- `cargo run -p aprender-contracts-cli --bin pv -- validate contracts/qwen2-e2e-verification-v1.yaml`: 0 errors / 0 warnings
- `git cherry-pick` onto `feat/falsify-ship-003-partial-discharge` resolved cleanly with contract v1.4.0 → v1.5.0 bump and spec v2.30.0 → v2.31.0 bump
- CI gates (`ci / gate`, `workspace-test`) green before merge

🤖 Generated with Claude Code

…ischarge (8/10)

SHIP-TWO-001 spec v2.28.0 → v2.29.0: 8th compute-free MODEL-1 PARTIAL lever, binding AC-SHIP1-004 (`apr export --format gguf` loads in llama.cpp) to three pure verdict fns at `discharge_status: PARTIAL_ALGORITHM_LEVEL`.

New: `crates/aprender-core/src/format/ship_004.rs`
- const AC_SHIP1_004_LLAMA_CLI_SUCCESS_EXIT_CODE: i32 = 0
- const AC_SHIP1_004_GGUF_MAGIC_BYTES: &[u8; 4] = b"GGUF"
- const AC_SHIP1_004_GGUF_SUPPORTED_VERSIONS: &[u32] = &[2, 3]
- enum Ship004Verdict { Pass, Fail }
- fn verdict_from_llama_cli_exit(code: i32) -> Ship004Verdict (POSIX zero-tolerance: code == 0 → Pass)
- fn verdict_from_gguf_magic_bytes(bytes: &[u8]) -> Ship004Verdict (len >= 4 AND bytes[..4] == b"GGUF" → Pass; short slice → Fail)
- fn verdict_from_gguf_version(ver: u32) -> Ship004Verdict (ver ∈ {2, 3} → Pass; Fail-closed above-band)

Triple mutation survey (first MODEL-1 discharge with 3 independent fns):
1. falsify_ship_004_llama_cli_exit_code_logic (6 sections): POSIX success boundary (0 → Pass), adjacent-value Fail {1, -1}, classic failure bands {2, 127, 137, 139, 255}, i32 extrema {i32::MIN, i32::MAX}, monotonicity sweep [-256, 256], provenance pin.
2. falsify_ship_004_gguf_magic_bytes_logic (6 sections): canonical b"GGUF" Pass, magic+version header Pass, 6 single-byte flips {b"GGUE", b"GGTF", b"GFUF", b"FUGG", b"GGU ", b"gguf"}, 4 short-slice lengths {0, 1, 2, 3 bytes}, wrong-format magics {b"APR\0", b"APRN", zero-filled}, 4-byte provenance pin (G=0x47, G=0x47, U=0x55, F=0x46).
3. falsify_ship_004_gguf_version_logic (5 sections): supported {2, 3} Pass, predecessor {0, 1} Fail (GGJT rejected), above-band Fail-closed {4, 5, 10, 100, 1M, u32::MAX}, exhaustive sweep 0..=64, set-length provenance pin.

Contract: contracts/qwen2-e2e-verification-v1.yaml v1.3.0 → v1.4.0 ACTIVE. FALSIFY-QW2E-SHIP-004 now carries `discharge_status: PARTIAL_ALGORITHM_LEVEL`, 3 `evidence_discharged_by` test pins (one per verdict fn), `full_discharge_blocks_on` (live export + shell out to upstream `llama-cli -m <file> --prompt "hello" --n-predict 4`), and 7 counter_example_classes (silent_header_corruption, version_drift, llama_cpp_reject, tolerance_widened, magic_byte_flip, version_set_widened, signal_exit_masked).

Spec: docs/specifications/aprender-train/ship-two-models-spec.md v2.28.0 → v2.29.0. AC-SHIP1-004 row annotated `FALSIFY-SHIP-004 **(PARTIAL_ALGORITHM_LEVEL v2.29.0)**`. Changelog documents first MODEL-1 AC discharged via three independent verdict fns on three different format boundaries (executable tool boundary + byte-literal magic + version set).

Coverage: MODEL-1 7/10 → 8/10; 14 PARTIAL + 3 DISCHARGED across both models.

Verification:
- `cargo test -p aprender-core --lib format::ship_004` → 3 passed / 0 failed
- `cargo fmt -p aprender-core --check` → clean
- `pv validate contracts/qwen2-e2e-verification-v1.yaml` → 0 errors, 0 warnings

Full discharge blocks on: live `apr export --format gguf paiml/qwen2.5-coder-7b-apache-q4k-v1.safetensors` + shelling out to upstream `llama-cli` on the exported `.gguf` and asserting all three gates (magic bytes, version, exit code) each Pass.

Stacked on feat/falsify-ship-003-partial-discharge.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Completes the MODEL-1 compute-free PARTIAL coverage at 10/10 touched. Stacked follow-up to SHIP-003 (#1028) + SHIP-004 (#1029).

Spec changes:
- **Version:** 2.31.0 → 2.32.0
- Date line appended with v2.32.0 entry describing the three pure verdict fns in `crates/aprender-core/src/format/ship_001.rs`, the three bound constants (AC_SHIP1_001_SAFETENSORS_HEADER_PREFIX_LEN = 8, AC_SHIP1_001_SAFETENSORS_JSON_OPEN_BYTE = 0x7B, Result-boundary), the triple mutation survey (Result × header-size × open-byte), `cargo test -p aprender-core --lib format::ship_001` green (3/3), full-discharge blocker (live `realizar::Model::load_safetensors` on RTX 4090 with `--features cuda`), MODEL-1 9/10 → 10/10 coverage, and the 16 PARTIAL + 3 DISCHARGED aggregate count across both models.
- §4.2 AC-SHIP1-001 row annotated **(PARTIAL_ALGORITHM_LEVEL v2.32.0)**.

Completes MODEL-1 to 10/10 touched; only AC-SHIP1-009 (license / provenance metadata) remains pending in the MODEL-1 table.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…TIAL discharge (10/10) (#1030)

* WIP: FALSIFY-SHIP-001 PARTIAL - MODEL-1 reproducible build verdict fns

Stacked atop SHIP-003 (f9c2d47) + SHIP-004 (5f1db6a). Pushed as safety net before /tmp clears; NOT PR-ready yet.

Contents:
- crates/aprender-core/src/format/ship_001.rs (NEW): 3 pure verdict fns + 3/3 tests green locally.
- crates/aprender-core/src/format/mod.rs: adds `pub mod ship_001`.
- contracts/qwen2-e2e-verification-v1.yaml: speculative v1.4.0→v1.5.0 bump.

Known follow-up before opening PR in next session:
- Rebase onto main (now at 651e07b / post-SHIP-010). Main already carries publish-manifest-v1 v1.4.0 at SHIP-010, so the qwen2-e2e YAML bump here must be renumbered based on the landing order against current main.
- Stack-push sequence per memory `project_ship_two_001_session_wrap_20260423.md`: SHIP-003 (task #162) → SHIP-004 (#164) → SHIP-001 (#165).
- Full discharge of SHIP-001 blocks on live 3-run reproducible-build harness with sha256 manifest diff on RTX 4090 host.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* spec(falsify-ship-001): bump v2.31→v2.32 for SHIP-001 PARTIAL (10/10)

Completes the MODEL-1 compute-free PARTIAL coverage at 10/10 touched. Stacked follow-up to SHIP-003 (#1028) + SHIP-004 (#1029).

Spec changes:
- **Version:** 2.31.0 → 2.32.0
- Date line appended with v2.32.0 entry describing the three pure verdict fns in `crates/aprender-core/src/format/ship_001.rs`, the three bound constants (AC_SHIP1_001_SAFETENSORS_HEADER_PREFIX_LEN = 8, AC_SHIP1_001_SAFETENSORS_JSON_OPEN_BYTE = 0x7B, Result-boundary), the triple mutation survey (Result × header-size × open-byte), `cargo test -p aprender-core --lib format::ship_001` green (3/3), full-discharge blocker (live `realizar::Model::load_safetensors` on RTX 4090 with `--features cuda`), MODEL-1 9/10 → 10/10 coverage, and the 16 PARTIAL + 3 DISCHARGED aggregate count across both models.
- §4.2 AC-SHIP1-001 row annotated **(PARTIAL_ALGORITHM_LEVEL v2.32.0)**.

Completes MODEL-1 to 10/10 touched; only AC-SHIP1-009 (license / provenance metadata) remains pending in the MODEL-1 table.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
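The SHIP-001 gates described in the spec entry can be sketched along the same lines. Only the two constant names and their values (8 and 0x7B) come from the commit message. The constant types, the enum name `Ship001Verdict`, the function names, and the exact predicate each gate checks are assumptions made for illustration. The layout assumed is the safetensors one: an 8-byte little-endian header length followed by a JSON header that opens with `{`.

```rust
/// Length of the safetensors header-size prefix (value from the spec entry; type assumed).
pub const AC_SHIP1_001_SAFETENSORS_HEADER_PREFIX_LEN: usize = 8;
/// First byte of the JSON header, ASCII '{' (value from the spec entry; type assumed).
pub const AC_SHIP1_001_SAFETENSORS_JSON_OPEN_BYTE: u8 = 0x7B;

// Hypothetical verdict type; the real name in ship_001.rs is not given in the source.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Ship001Verdict {
    Pass,
    Fail,
}

/// Result boundary: a load that returns Ok is Pass, any Err is Fail.
pub fn verdict_from_load_result<T, E>(result: &Result<T, E>) -> Ship001Verdict {
    if result.is_ok() { Ship001Verdict::Pass } else { Ship001Verdict::Fail }
}

/// Header-size gate (assumed semantics): the file must hold at least the 8-byte prefix.
pub fn verdict_from_header_prefix(bytes: &[u8]) -> Ship001Verdict {
    if bytes.len() >= AC_SHIP1_001_SAFETENSORS_HEADER_PREFIX_LEN {
        Ship001Verdict::Pass
    } else {
        Ship001Verdict::Fail
    }
}

/// Open-byte gate (assumed semantics): the byte right after the prefix must be '{'.
pub fn verdict_from_json_open_byte(bytes: &[u8]) -> Ship001Verdict {
    match bytes.get(AC_SHIP1_001_SAFETENSORS_HEADER_PREFIX_LEN) {
        Some(&b) if b == AC_SHIP1_001_SAFETENSORS_JSON_OPEN_BYTE => Ship001Verdict::Pass,
        _ => Ship001Verdict::Fail,
    }
}
```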

noahgift enabled auto-merge (squash) April 23, 2026 17:14

This was referenced Apr 23, 2026

feat(falsify-ship-001): MODEL-1 realizar::Model::load_safetensors PARTIAL discharge (10/10) #1030

Merged

feat(falsify-ship-009): MODEL-1 apr-provenance multi-bind PARTIAL discharge (10/10 — last MODEL-1 row) #1031

Closed

noahgift force-pushed the feat/falsify-ship-004-partial-discharge branch from 95fd17e to 9036ab8 Compare April 23, 2026 17:35

noahgift merged commit 772ecb7 into main Apr 23, 2026
10 checks passed

noahgift deleted the feat/falsify-ship-004-partial-discharge branch April 23, 2026 18:00

noahgift merged 1 commit into main from feat/falsify-ship-004-partial-discharge
