feat(falsify-ship-004): MODEL-1 apr export gguf → llama.cpp PARTIAL discharge (9/10) #1029
Merged
…ischarge (8/10)
SHIP-TWO-001 spec v2.28.0 → v2.29.0: 8th compute-free MODEL-1 PARTIAL lever,
binding AC-SHIP1-004 (`apr export --format gguf` loads in llama.cpp) to three
pure verdict fns at `discharge_status: PARTIAL_ALGORITHM_LEVEL`.
New: `crates/aprender-core/src/format/ship_004.rs`
- const AC_SHIP1_004_LLAMA_CLI_SUCCESS_EXIT_CODE: i32 = 0
- const AC_SHIP1_004_GGUF_MAGIC_BYTES: &[u8; 4] = b"GGUF"
- const AC_SHIP1_004_GGUF_SUPPORTED_VERSIONS: &[u32] = &[2, 3]
- enum Ship004Verdict { Pass, Fail }
- fn verdict_from_llama_cli_exit(code: i32) -> Ship004Verdict
(POSIX zero-tolerance: code == 0 → Pass)
- fn verdict_from_gguf_magic_bytes(bytes: &[u8]) -> Ship004Verdict
(len >= 4 AND bytes[..4] == b"GGUF" → Pass; short slice → Fail)
- fn verdict_from_gguf_version(ver: u32) -> Ship004Verdict
(ver ∈ {2, 3} → Pass; Fail-closed above-band)
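The constants and signatures listed above could be realized roughly as follows. This is a minimal sketch reconstructed from the commit message, not the contents of `ship_004.rs`; the derives on `Ship004Verdict` and the function bodies are assumptions.

```rust
pub const AC_SHIP1_004_LLAMA_CLI_SUCCESS_EXIT_CODE: i32 = 0;
pub const AC_SHIP1_004_GGUF_MAGIC_BYTES: &[u8; 4] = b"GGUF";
pub const AC_SHIP1_004_GGUF_SUPPORTED_VERSIONS: &[u32] = &[2, 3];

// The derives are an assumption; the commit message only names the variants.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Ship004Verdict {
    Pass,
    Fail,
}

/// Only exit code 0 passes; every other code fails.
pub fn verdict_from_llama_cli_exit(code: i32) -> Ship004Verdict {
    if code == AC_SHIP1_004_LLAMA_CLI_SUCCESS_EXIT_CODE {
        Ship004Verdict::Pass
    } else {
        Ship004Verdict::Fail
    }
}

/// Passes when the first four bytes are exactly b"GGUF".
pub fn verdict_from_gguf_magic_bytes(bytes: &[u8]) -> Ship004Verdict {
    // `get(..4)` yields None for slices shorter than four bytes, so short input fails.
    if bytes.get(..4) == Some(&AC_SHIP1_004_GGUF_MAGIC_BYTES[..]) {
        Ship004Verdict::Pass
    } else {
        Ship004Verdict::Fail
    }
}

/// Passes only for versions in the supported set; anything else fails closed.
pub fn verdict_from_gguf_version(ver: u32) -> Ship004Verdict {
    if AC_SHIP1_004_GGUF_SUPPORTED_VERSIONS.contains(&ver) {
        Ship004Verdict::Pass
    } else {
        Ship004Verdict::Fail
    }
}
```

Keeping each gate a pure function of one input is what makes the discharge compute-free: none of the three needs a model file or a `llama-cli` binary to be exercised.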
Triple mutation survey (first MODEL-1 discharge with 3 independent fns):
1. falsify_ship_004_llama_cli_exit_code_logic — 6 sections:
POSIX success boundary (0 → Pass), adjacent-value Fail {1, -1}, classic
failure bands {2, 127, 137, 139, 255}, i32 extrema {i32::MIN, i32::MAX},
monotonicity sweep [-256, 256], provenance pin.
2. falsify_ship_004_gguf_magic_bytes_logic — 6 sections:
canonical b"GGUF" Pass, magic+version header Pass, 6 near-miss magics
{b"GGUE", b"GGTF", b"GFUF", b"FUGG", b"GGU ", b"gguf"} (4 single-byte flips, a reversal, a case fold), 4 short-slice
lengths {0, 1, 2, 3 bytes}, wrong-format magics {b"APR\0", b"APRN",
zero-filled}, 4-byte provenance pin (G=0x47, G=0x47, U=0x55, F=0x46).
3. falsify_ship_004_gguf_version_logic — 5 sections:
supported {2, 3} Pass, predecessor {0, 1} Fail (GGJT rejected), above-band
Fail-closed {4, 5, 10, 100, 1M, u32::MAX}, exhaustive sweep 0..=64,
set-length provenance pin.
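To illustrate how a survey of this kind catches a mutant, the sketch below probes a candidate version verdict over the ranges named in item 3 and reports every disagreement with the expected `{2, 3}` set. All names here (`Verdict`, `version_verdict`, `widened_mutant`, `survey_disagreements`) are hypothetical stand-ins, not the test code in `ship_004.rs`.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Verdict {
    Pass,
    Fail,
}

// Stand-in for the verdict fn under test (assumed behavior: {2, 3} pass).
fn version_verdict(ver: u32) -> Verdict {
    if ver == 2 || ver == 3 { Verdict::Pass } else { Verdict::Fail }
}

// Deliberately broken mutant: the supported set is widened to include 4.
fn widened_mutant(ver: u32) -> Verdict {
    if (2..=4).contains(&ver) { Verdict::Pass } else { Verdict::Fail }
}

/// Returns every probed version whose verdict disagrees with the expectation.
/// Probes the exhaustive sweep 0..=64 plus the above-band values from the survey.
fn survey_disagreements(candidate: fn(u32) -> Verdict) -> Vec<u32> {
    let above_band = [4u32, 5, 10, 100, 1_000_000, u32::MAX];
    (0..=64u32)
        .chain(above_band)
        .filter(|&v| {
            let expected = if v == 2 || v == 3 { Verdict::Pass } else { Verdict::Fail };
            candidate(v) != expected
        })
        .collect()
}
```

An empty result for `version_verdict` and a non-empty one for `widened_mutant` is the behavior a `version_set_widened` counter-example is meant to expose.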
Contract: contracts/qwen2-e2e-verification-v1.yaml v1.3.0 → v1.4.0 ACTIVE.
FALSIFY-QW2E-SHIP-004 now carries `discharge_status:
PARTIAL_ALGORITHM_LEVEL`, 3 `evidence_discharged_by` test pins (one per
verdict fn), `full_discharge_blocks_on` (live export + shell out to
upstream `llama-cli -m <file> --prompt "hello" --n-predict 4`), and 7
counter_example_classes (silent_header_corruption, version_drift,
llama_cpp_reject, tolerance_widened, magic_byte_flip,
version_set_widened, signal_exit_masked).
Spec: docs/specifications/aprender-train/ship-two-models-spec.md v2.28.0 →
v2.29.0. AC-SHIP1-004 row annotated `FALSIFY-SHIP-004 **(PARTIAL_ALGORITHM_LEVEL
v2.29.0)**`. Changelog documents first MODEL-1 AC discharged via three
independent verdict fns on three different format boundaries (executable
tool boundary + byte-literal magic + version set). Coverage: MODEL-1 7/10
→ 8/10; 14 PARTIAL + 3 DISCHARGED across both models.
Verification:
- `cargo test -p aprender-core --lib format::ship_004` → 3 passed / 0 failed
- `cargo fmt -p aprender-core --check` → clean
- `pv validate contracts/qwen2-e2e-verification-v1.yaml` → 0 errors, 0 warnings
Full discharge blocks on: live `apr export --format gguf
paiml/qwen2.5-coder-7b-apache-q4k-v1.safetensors` + shelling out to
upstream `llama-cli` on the exported `.gguf` and asserting all three
gates (magic bytes, version, exit code) each Pass.
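A live harness for that full discharge might look like the sketch below. It assumes `llama-cli` is on `PATH`, reuses the flags quoted in this commit message, and relies on the GGUF layout of a 4-byte magic followed by a little-endian `u32` version. The helper names `split_gguf_header` and `all_gates_pass` are hypothetical.

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::Path;
use std::process::Command;

/// Splits the first eight bytes of a GGUF file into (magic, version).
/// The version is read as a little-endian u32 directly after the magic.
fn split_gguf_header(header: &[u8; 8]) -> ([u8; 4], u32) {
    let magic = [header[0], header[1], header[2], header[3]];
    let version = u32::from_le_bytes([header[4], header[5], header[6], header[7]]);
    (magic, version)
}

/// Returns Ok(true) only when magic, version, and exit code all pass.
fn all_gates_pass(gguf: &Path) -> io::Result<bool> {
    let mut header = [0u8; 8];
    File::open(gguf)?.read_exact(&mut header)?;
    let (magic, version) = split_gguf_header(&header);

    let status = Command::new("llama-cli")
        .arg("-m")
        .arg(gguf)
        .args(["--prompt", "hello", "--n-predict", "4"])
        .status()?;
    // `code()` is None when the process was terminated by a signal;
    // comparing against Some(0) treats that case as a failure.
    let exit_ok = status.code() == Some(0);

    Ok(&magic == b"GGUF" && matches!(version, 2 | 3) && exit_ok)
}
```

Comparing `status.code()` against `Some(0)` rather than unwrapping with a default keeps a signal-killed process from being read as success, which is the situation the `signal_exit_masked` counter-example class names.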
Stacked on feat/falsify-ship-003-partial-discharge.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
95fd17e to 9036ab8
noahgift added a commit that referenced this pull request on Apr 23, 2026
Completes the MODEL-1 compute-free PARTIAL coverage at 10/10 touched. Stacked follow-up to SHIP-003 (#1028) + SHIP-004 (#1029).
Spec changes:
- **Version:** 2.31.0 → 2.32.0
- Date line appended with v2.32.0 entry describing the three pure verdict fns in `crates/aprender-core/src/format/ship_001.rs`, the three bound constants (AC_SHIP1_001_SAFETENSORS_HEADER_PREFIX_LEN = 8, AC_SHIP1_001_SAFETENSORS_JSON_OPEN_BYTE = 0x7B, Result-boundary), the triple mutation survey (Result × header-size × open-byte), `cargo test -p aprender-core --lib format::ship_001` green (3/3), full-discharge blocker (live `realizar::Model::load_safetensors` on RTX 4090 with `--features cuda`), MODEL-1 9/10 → 10/10 coverage, and the 16 PARTIAL + 3 DISCHARGED aggregate count across both models.
- §4.2 AC-SHIP1-001 row annotated **(PARTIAL_ALGORITHM_LEVEL v2.32.0)**.
Completes MODEL-1 to 10/10 touched; only AC-SHIP1-009 (license / provenance metadata) remains pending in the MODEL-1 table.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
noahgift added a commit that referenced this pull request on Apr 23, 2026
…TIAL discharge (10/10) (#1030)
* WIP: FALSIFY-SHIP-001 PARTIAL - MODEL-1 reproducible build verdict fns
Stacked atop SHIP-003 (f9c2d47) + SHIP-004 (5f1db6a). Pushed as safety net before /tmp clears; NOT PR-ready yet.
Contents:
- crates/aprender-core/src/format/ship_001.rs (NEW): 3 pure verdict fns + 3/3 tests green locally.
- crates/aprender-core/src/format/mod.rs: adds `pub mod ship_001`.
- contracts/qwen2-e2e-verification-v1.yaml: speculative v1.4.0→v1.5.0 bump.
Known follow-up before opening PR in next session:
- Rebase onto main (now at 651e07b / post-SHIP-010). Main already carries publish-manifest-v1 v1.4.0 at SHIP-010, so the qwen2-e2e YAML bump here must be renumbered based on the landing order against current main.
- Stack-push sequence per memory `project_ship_two_001_session_wrap_20260423.md`: SHIP-003 (task #162) → SHIP-004 (#164) → SHIP-001 (#165).
- Full discharge of SHIP-001 blocks on live 3-run reproducible-build harness with sha256 manifest diff on RTX 4090 host.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* spec(falsify-ship-001): bump v2.31→v2.32 for SHIP-001 PARTIAL (10/10)
Completes the MODEL-1 compute-free PARTIAL coverage at 10/10 touched. Stacked follow-up to SHIP-003 (#1028) + SHIP-004 (#1029).
Spec changes:
- **Version:** 2.31.0 → 2.32.0
- Date line appended with v2.32.0 entry describing the three pure verdict fns in `crates/aprender-core/src/format/ship_001.rs`, the three bound constants (AC_SHIP1_001_SAFETENSORS_HEADER_PREFIX_LEN = 8, AC_SHIP1_001_SAFETENSORS_JSON_OPEN_BYTE = 0x7B, Result-boundary), the triple mutation survey (Result × header-size × open-byte), `cargo test -p aprender-core --lib format::ship_001` green (3/3), full-discharge blocker (live `realizar::Model::load_safetensors` on RTX 4090 with `--features cuda`), MODEL-1 9/10 → 10/10 coverage, and the 16 PARTIAL + 3 DISCHARGED aggregate count across both models.
- §4.2 AC-SHIP1-001 row annotated **(PARTIAL_ALGORITHM_LEVEL v2.32.0)**.
Completes MODEL-1 to 10/10 touched; only AC-SHIP1-009 (license / provenance metadata) remains pending in the MODEL-1 table.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
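The two SHIP-001 byte-level constants quoted above match the safetensors layout: an 8-byte little-endian header length, then a JSON header that opens with `{` (0x7B). A minimal sketch of a check built on them follows; the constant types and the helper `safetensors_json_header_len` are assumptions for illustration and are not taken from `ship_001.rs`.

```rust
// Constant names and values come from the commit message; the types are assumed.
const AC_SHIP1_001_SAFETENSORS_HEADER_PREFIX_LEN: usize = 8;
const AC_SHIP1_001_SAFETENSORS_JSON_OPEN_BYTE: u8 = 0x7B; // ASCII '{'

/// Hypothetical helper: returns the declared JSON header length when the
/// 8-byte prefix is present and the byte right after it opens a JSON object.
fn safetensors_json_header_len(bytes: &[u8]) -> Option<u64> {
    // Too-short input yields None instead of panicking.
    let prefix_slice = bytes.get(..AC_SHIP1_001_SAFETENSORS_HEADER_PREFIX_LEN)?;
    let mut prefix = [0u8; 8];
    prefix.copy_from_slice(prefix_slice);

    let first_json_byte = *bytes.get(AC_SHIP1_001_SAFETENSORS_HEADER_PREFIX_LEN)?;
    if first_json_byte == AC_SHIP1_001_SAFETENSORS_JSON_OPEN_BYTE {
        Some(u64::from_le_bytes(prefix))
    } else {
        None
    }
}
```

This covers only the header-size and open-byte boundaries; the Result boundary mentioned in the commit would sit at the loader call itself.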
Summary
SHIP-TWO-001 spec v2.30.0 → v2.31.0: 9th compute-free MODEL-1 PARTIAL lever (stacked on PR #1028 which is v2.30.0 / SHIP-003).
Binds AC-SHIP1-004 (`apr export --format gguf` loads in llama.cpp) to three independent pure verdict fns at `discharge_status: PARTIAL_ALGORITHM_LEVEL`:
- `verdict_from_llama_cli_exit(i32)`: POSIX zero-tolerance exit-code boundary (`AC_SHIP1_004_LLAMA_CLI_SUCCESS_EXIT_CODE = 0`)
- `verdict_from_gguf_magic_bytes(&[u8])`: canonical 4-byte `b"GGUF"` magic (`AC_SHIP1_004_GGUF_MAGIC_BYTES`)
- `verdict_from_gguf_version(u32)`: set-membership over `{2, 3}` (`AC_SHIP1_004_GGUF_SUPPORTED_VERSIONS`), Fail-closed above-band
Triple mutation survey (first MODEL-1 discharge with 3 independent fns):
- `falsify_ship_004_llama_cli_exit_code_logic`
- `falsify_ship_004_gguf_magic_bytes_logic`
- `falsify_ship_004_gguf_version_logic`
Files changed (4):
- `crates/aprender-core/src/format/ship_004.rs` (new, 422 lines, 3 verdict fns + 3 falsify tests)
- `crates/aprender-core/src/format/mod.rs` (+6 lines module declaration)
- `contracts/qwen2-e2e-verification-v1.yaml` (v1.4.0 → v1.5.0, adds `FALSIFY-QW2E-SHIP-004` entry alongside existing SHIP-003/SHIP-007)
- `docs/specifications/aprender-train/ship-two-models-spec.md` (v2.30.0 → v2.31.0, changelog + AC-SHIP1-004 table row annotated)
Coverage impact: MODEL-1 8/10 → 9/10 touched. 15 PARTIAL + 3 DISCHARGED across both models.
Full discharge blocks on: live `apr export --format gguf paiml/qwen2.5-coder-7b-apache-q4k-v1.safetensors` on RTX 4090 + shell out to upstream `llama-cli -m qwen2.5-coder-7b.gguf --prompt "hello" --n-predict 4` and assert magic, version, AND exit code each Pass.
Test plan
- `cargo test -p aprender-core --lib format::ship_004`: 3 passed / 0 failed
- `cargo run -p aprender-contracts-cli --bin pv -- validate contracts/qwen2-e2e-verification-v1.yaml`: 0 errors / 0 warnings
- `git cherry-pick` onto `feat/falsify-ship-003-partial-discharge` resolved cleanly with contract v1.4.0 → v1.5.0 bump and spec v2.30.0 → v2.31.0 bump
- CI (`ci / gate`, `workspace-test`) green before merge
🤖 Generated with Claude Code