
feat(falsify-ship-004): MODEL-1 apr export gguf → llama.cpp PARTIAL discharge (9/10) #1029

Merged: noahgift merged 1 commit into main from feat/falsify-ship-004-partial-discharge on Apr 23, 2026

@noahgift

Summary

SHIP-TWO-001 spec v2.30.0 → v2.31.0: 9th compute-free MODEL-1 PARTIAL lever (stacked on PR #1028 which is v2.30.0 / SHIP-003).

Binds AC-SHIP1-004 (apr export --format gguf loads in llama.cpp) to three independent pure verdict fns at discharge_status: PARTIAL_ALGORITHM_LEVEL:

  • verdict_from_llama_cli_exit(i32) — POSIX zero-tolerance exit-code boundary (AC_SHIP1_004_LLAMA_CLI_SUCCESS_EXIT_CODE = 0)
  • verdict_from_gguf_magic_bytes(&[u8]) — canonical 4-byte b"GGUF" magic (AC_SHIP1_004_GGUF_MAGIC_BYTES)
  • verdict_from_gguf_version(u32) — set-membership over {2, 3} (AC_SHIP1_004_GGUF_SUPPORTED_VERSIONS), Fail-closed above-band

Triple mutation survey (first MODEL-1 discharge with 3 independent fns):

  1. exit-code — 6 sections (POSIX success, adjacent-value Fail, classic failure bands {2, 127, 137, 139, 255}, i32 extrema, monotonicity sweep [-256, 256], provenance pin)
  2. magic — 6 sections (canonical, magic+version header, 6 single-byte flips, 4 short-slice lengths, wrong-format magics {b"APR\0", b"APRN", zeros}, 4-byte provenance pin)
  3. version — 5 sections (supported {2, 3}, predecessor {0, 1}, above-band {4, 5, 10, 100, 1M, u32::MAX}, exhaustive 0..=64, set-length pin)

Files changed (4):

  • crates/aprender-core/src/format/ship_004.rs (new, 422 lines, 3 verdict fns + 3 falsify tests)
  • crates/aprender-core/src/format/mod.rs (+6 lines module declaration)
  • contracts/qwen2-e2e-verification-v1.yaml (v1.4.0 → v1.5.0, adds FALSIFY-QW2E-SHIP-004 entry alongside existing SHIP-003/SHIP-007)
  • docs/specifications/aprender-train/ship-two-models-spec.md (v2.30.0 → v2.31.0, changelog + AC-SHIP1-004 table row annotated)

Coverage impact: MODEL-1 8/10 → 9/10 touched. 15 PARTIAL + 3 DISCHARGED across both models.

Full discharge blocks on: live apr export --format gguf paiml/qwen2.5-coder-7b-apache-q4k-v1.safetensors on RTX 4090 + shell out to upstream llama-cli -m qwen2.5-coder-7b.gguf --prompt "hello" --n-predict 4 and assert magic, version, AND exit code each Pass.
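The live check described above can be sketched roughly as follows. This is a minimal sketch, not the repository's harness: `GgufHeaderPrefix`, `read_gguf_header_prefix`, `llama_cli_exit_code`, and `live_gate` are hypothetical names, and the sketch assumes a little-endian GGUF file whose first 8 bytes are the 4-byte magic followed by a `u32` version. The `llama-cli` flags are the ones quoted in the PR text.

```rust
use std::process::Command;

/// Hypothetical holder for the first 8 bytes of a GGUF file:
/// a 4-byte magic followed by a little-endian u32 version (assumed layout).
#[derive(Debug, PartialEq, Eq)]
pub struct GgufHeaderPrefix {
    pub magic: [u8; 4],
    pub version: u32,
}

/// Parse the magic and version; returns None when fewer than 8 bytes are given.
pub fn read_gguf_header_prefix(bytes: &[u8]) -> Option<GgufHeaderPrefix> {
    if bytes.len() < 8 {
        return None;
    }
    let magic = [bytes[0], bytes[1], bytes[2], bytes[3]];
    let version = u32::from_le_bytes([bytes[4], bytes[5], bytes[6], bytes[7]]);
    Some(GgufHeaderPrefix { magic, version })
}

/// Shell out to llama-cli with the flags quoted in the PR description.
/// `ExitStatus::code()` is None when the process was terminated by a signal,
/// so the Option is passed through instead of being collapsed to 0.
pub fn llama_cli_exit_code(model_path: &str) -> std::io::Result<Option<i32>> {
    let status = Command::new("llama-cli")
        .args(["-m", model_path, "--prompt", "hello", "--n-predict", "4"])
        .status()?;
    Ok(status.code())
}

/// All three gates must pass: magic, version in {2, 3}, and exit code 0.
pub fn live_gate(header: &[u8], exit_code: Option<i32>) -> bool {
    match read_gguf_header_prefix(header) {
        Some(h) => {
            &h.magic == b"GGUF" && [2, 3].contains(&h.version) && exit_code == Some(0)
        }
        None => false,
    }
}
```

Treating a signal-terminated process (no exit code) as a failure keeps the gate fail-closed, which appears to be the intent behind the `signal_exit_masked` counter-example class.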

Test plan

  • cargo test -p aprender-core --lib format::ship_004 — 3 passed / 0 failed
  • cargo run -p aprender-contracts-cli --bin pv -- validate contracts/qwen2-e2e-verification-v1.yaml — 0 errors / 0 warnings
  • git cherry-pick onto feat/falsify-ship-003-partial-discharge resolved cleanly with contract v1.4.0 → v1.5.0 bump and spec v2.30.0 → v2.31.0 bump
  • CI gates (ci / gate, workspace-test) green before merge

🤖 Generated with Claude Code

…ischarge (8/10)

SHIP-TWO-001 spec v2.28.0 → v2.29.0: 8th compute-free MODEL-1 PARTIAL lever,
binding AC-SHIP1-004 (`apr export --format gguf` loads in llama.cpp) to three
pure verdict fns at `discharge_status: PARTIAL_ALGORITHM_LEVEL`.

New: `crates/aprender-core/src/format/ship_004.rs`
- const AC_SHIP1_004_LLAMA_CLI_SUCCESS_EXIT_CODE: i32 = 0
- const AC_SHIP1_004_GGUF_MAGIC_BYTES: &[u8; 4] = b"GGUF"
- const AC_SHIP1_004_GGUF_SUPPORTED_VERSIONS: &[u32] = &[2, 3]
- enum Ship004Verdict { Pass, Fail }
- fn verdict_from_llama_cli_exit(code: i32) -> Ship004Verdict
  (POSIX zero-tolerance: code == 0 → Pass)
- fn verdict_from_gguf_magic_bytes(bytes: &[u8]) -> Ship004Verdict
  (len >= 4 AND bytes[..4] == b"GGUF" → Pass; short slice → Fail)
- fn verdict_from_gguf_version(ver: u32) -> Ship004Verdict
  (ver ∈ {2, 3} → Pass; Fail-closed above-band)
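For orientation, the items listed above could look roughly like the following. The names, types, and values come from this commit message; the function bodies and the derive list are assumptions, since the actual `ship_004.rs` source is not shown here.

```rust
/// Constants as listed in the commit message.
pub const AC_SHIP1_004_LLAMA_CLI_SUCCESS_EXIT_CODE: i32 = 0;
pub const AC_SHIP1_004_GGUF_MAGIC_BYTES: &[u8; 4] = b"GGUF";
pub const AC_SHIP1_004_GGUF_SUPPORTED_VERSIONS: &[u32] = &[2, 3];

/// Two-valued verdict; the derives are an assumption made for testability.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Ship004Verdict {
    Pass,
    Fail,
}

/// Zero-tolerance exit-code gate: only the success code passes.
pub fn verdict_from_llama_cli_exit(code: i32) -> Ship004Verdict {
    if code == AC_SHIP1_004_LLAMA_CLI_SUCCESS_EXIT_CODE {
        Ship004Verdict::Pass
    } else {
        Ship004Verdict::Fail
    }
}

/// Magic gate: `starts_with` is false for slices shorter than 4 bytes,
/// which matches "len >= 4 AND bytes[..4] == b\"GGUF\"".
pub fn verdict_from_gguf_magic_bytes(bytes: &[u8]) -> Ship004Verdict {
    if bytes.starts_with(AC_SHIP1_004_GGUF_MAGIC_BYTES) {
        Ship004Verdict::Pass
    } else {
        Ship004Verdict::Fail
    }
}

/// Version gate: set membership, so any version outside {2, 3} fails.
pub fn verdict_from_gguf_version(ver: u32) -> Ship004Verdict {
    if AC_SHIP1_004_GGUF_SUPPORTED_VERSIONS.contains(&ver) {
        Ship004Verdict::Pass
    } else {
        Ship004Verdict::Fail
    }
}
```

Because each function is pure and takes only a primitive or a byte slice, all three can be exercised without a GPU, a model file, or a `llama-cli` binary.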

Triple mutation survey (first MODEL-1 discharge with 3 independent fns):
1. falsify_ship_004_llama_cli_exit_code_logic — 6 sections:
   POSIX success boundary (0 → Pass), adjacent-value Fail {1, -1}, classic
   failure bands {2, 127, 137, 139, 255}, i32 extrema {i32::MIN, i32::MAX},
   monotonicity sweep [-256, 256], provenance pin.
2. falsify_ship_004_gguf_magic_bytes_logic — 6 sections:
   canonical b"GGUF" Pass, magic+version header Pass, 6 single-byte flips
   {b"GGUE", b"GGTF", b"GFUF", b"FUGG", b"GGU ", b"gguf"}, 4 short-slice
   lengths {0, 1, 2, 3 bytes}, wrong-format magics {b"APR\0", b"APRN",
   zero-filled}, 4-byte provenance pin (G=0x47, G=0x47, U=0x55, F=0x46).
3. falsify_ship_004_gguf_version_logic — 5 sections:
   supported {2, 3} Pass, predecessor {0, 1} Fail (GGJT rejected), above-band
   Fail-closed {4, 5, 10, 100, 1M, u32::MAX}, exhaustive sweep 0..=64,
   set-length provenance pin.
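The sweep-style sections of the survey can be illustrated with a small sketch. The boolean helpers below are simplified stand-ins for the three verdict fns, not the real ones, and the mutation sweep is deliberately broader than the six hand-picked mutants listed above: it tries every replacement value at every byte position.

```rust
// Simplified boolean stand-ins for the three verdict fns (assumed behavior).
fn exit_ok(code: i32) -> bool {
    code == 0
}
fn magic_ok(bytes: &[u8]) -> bool {
    bytes.starts_with(b"GGUF")
}
fn version_ok(ver: u32) -> bool {
    [2u32, 3].contains(&ver)
}

/// Exit codes in [-256, 256] that pass; a correct gate admits exactly one.
fn passing_exit_codes() -> usize {
    (-256..=256).filter(|&c| exit_ok(c)).count()
}

/// Number of single-byte mutants of b"GGUF" that still pass the magic gate.
/// Every position is replaced by every other byte value (4 * 255 mutants).
fn surviving_magic_mutants() -> usize {
    let mut survivors = 0;
    for pos in 0..4 {
        for replacement in 0..=u8::MAX {
            let mut mutant = *b"GGUF";
            if mutant[pos] == replacement {
                continue; // not a mutation, skip the original byte
            }
            mutant[pos] = replacement;
            if magic_ok(&mutant) {
                survivors += 1;
            }
        }
    }
    survivors
}

/// True when every slice shorter than the magic (0 to 3 bytes) is rejected.
fn short_slices_rejected() -> bool {
    (0..4).all(|n| !magic_ok(&b"GGUF"[..n]))
}

/// Versions in 0..=64 that pass; expected to be exactly the supported set.
fn passing_versions() -> Vec<u32> {
    (0..=64).filter(|&v| version_ok(v)).collect()
}
```

A survey of this shape would catch the `tolerance_widened` and `version_set_widened` mutations named in the contract, because widening either gate changes the pass count.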

Contract: contracts/qwen2-e2e-verification-v1.yaml v1.3.0 → v1.4.0 ACTIVE.
FALSIFY-QW2E-SHIP-004 now carries `discharge_status:
PARTIAL_ALGORITHM_LEVEL`, 3 `evidence_discharged_by` test pins (one per
verdict fn), `full_discharge_blocks_on` (live export + shell out to
upstream `llama-cli -m <file> --prompt "hello" --n-predict 4`), and 7
counter_example_classes (silent_header_corruption, version_drift,
llama_cpp_reject, tolerance_widened, magic_byte_flip,
version_set_widened, signal_exit_masked).

Spec: docs/specifications/aprender-train/ship-two-models-spec.md v2.28.0 →
v2.29.0. AC-SHIP1-004 row annotated `FALSIFY-SHIP-004 **(PARTIAL_ALGORITHM_LEVEL
v2.29.0)**`. Changelog documents first MODEL-1 AC discharged via three
independent verdict fns on three different format boundaries (executable
tool boundary + byte-literal magic + version set). Coverage: MODEL-1 7/10
→ 8/10; 14 PARTIAL + 3 DISCHARGED across both models.

Verification:
- `cargo test -p aprender-core --lib format::ship_004` → 3 passed / 0 failed
- `cargo fmt -p aprender-core --check` → clean
- `pv validate contracts/qwen2-e2e-verification-v1.yaml` → 0 errors, 0 warnings

Full discharge blocks on: live `apr export --format gguf
paiml/qwen2.5-coder-7b-apache-q4k-v1.safetensors` + shelling out to
upstream `llama-cli` on the exported `.gguf` and asserting all three
gates (magic bytes, version, exit code) each Pass.

Stacked on feat/falsify-ship-003-partial-discharge.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@noahgift noahgift force-pushed the feat/falsify-ship-004-partial-discharge branch from 95fd17e to 9036ab8 Compare April 23, 2026 17:35
noahgift added a commit that referenced this pull request Apr 23, 2026
Completes the MODEL-1 compute-free PARTIAL coverage at 10/10 touched.
Stacked follow-up to SHIP-003 (#1028) + SHIP-004 (#1029).

Spec changes:
- **Version:** 2.31.0 → 2.32.0
- Date line appended with v2.32.0 entry describing the three pure verdict
  fns in `crates/aprender-core/src/format/ship_001.rs`, the three bound
  constants (AC_SHIP1_001_SAFETENSORS_HEADER_PREFIX_LEN = 8,
  AC_SHIP1_001_SAFETENSORS_JSON_OPEN_BYTE = 0x7B, Result-boundary), the
  triple mutation survey (Result × header-size × open-byte), `cargo test
  -p aprender-core --lib format::ship_001` green (3/3), full-discharge
  blocker (live `realizar::Model::load_safetensors` on RTX 4090 with
  `--features cuda`), MODEL-1 9/10 → 10/10 coverage, and the 16 PARTIAL
  + 3 DISCHARGED aggregate count across both models.
- §4.2 AC-SHIP1-001 row annotated **(PARTIAL_ALGORITHM_LEVEL v2.32.0)**.
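The two numeric constants above correspond to the safetensors layout: an 8-byte little-endian `u64` giving the JSON header length, followed by a JSON object whose first byte is `{` (0x7B). A minimal sketch of a header check built on them follows. The constant types and the function `safetensors_header_looks_valid` are assumptions, not the contents of `ship_001.rs`.

```rust
/// Constant names and values from the spec entry; the types are assumed.
pub const AC_SHIP1_001_SAFETENSORS_HEADER_PREFIX_LEN: usize = 8;
pub const AC_SHIP1_001_SAFETENSORS_JSON_OPEN_BYTE: u8 = 0x7B;

/// Hypothetical check: the length prefix must be present, must describe a
/// non-empty header that fits in the buffer, and the header must open with '{'.
pub fn safetensors_header_looks_valid(bytes: &[u8]) -> bool {
    let prefix_len = AC_SHIP1_001_SAFETENSORS_HEADER_PREFIX_LEN;
    if bytes.len() <= prefix_len {
        return false; // no room for the prefix plus at least one header byte
    }
    let mut prefix = [0u8; 8];
    prefix.copy_from_slice(&bytes[..prefix_len]);
    let header_len = u64::from_le_bytes(prefix);
    let available = (bytes.len() - prefix_len) as u64;
    header_len > 0
        && header_len <= available
        && bytes[prefix_len] == AC_SHIP1_001_SAFETENSORS_JSON_OPEN_BYTE
}
```

This covers only the header-size and open-byte gates; the Result-boundary gate depends on the loader's return type, which is not shown in this thread.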

Completes MODEL-1 to 10/10 touched; only AC-SHIP1-009 (license /
provenance metadata) remains pending in the MODEL-1 table.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@noahgift noahgift merged commit 772ecb7 into main Apr 23, 2026
10 checks passed
@noahgift noahgift deleted the feat/falsify-ship-004-partial-discharge branch April 23, 2026 18:00
noahgift added a commit that referenced this pull request Apr 23, 2026
…TIAL discharge (10/10) (#1030)

* WIP: FALSIFY-SHIP-001 PARTIAL — MODEL-1 reproducible build verdict fns

Stacked atop SHIP-003 (f9c2d47) + SHIP-004 (5f1db6a). Pushed as safety
net before /tmp clears — NOT PR-ready yet.

Contents:
- crates/aprender-core/src/format/ship_001.rs (NEW): 3 pure verdict fns
  + 3/3 tests green locally.
- crates/aprender-core/src/format/mod.rs: adds `pub mod ship_001`.
- contracts/qwen2-e2e-verification-v1.yaml: speculative v1.4.0→v1.5.0 bump.

Known follow-up before opening PR in next session:
- Rebase onto main (now at 651e07b / post-SHIP-010) — main already carries
  publish-manifest-v1 v1.4.0 at SHIP-010, so the qwen2-e2e YAML bump here
  must be renumbered based on the landing order against current main.
- Stack-push sequence per memory `project_ship_two_001_session_wrap_20260423.md`:
  SHIP-003 (task #162) → SHIP-004 (#164) → SHIP-001 (#165).
- Full discharge of SHIP-001 blocks on live 3-run reproducible-build harness
  with sha256 manifest diff on RTX 4090 host.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* spec(falsify-ship-001): bump v2.31→v2.32 for SHIP-001 PARTIAL (10/10)

Completes the MODEL-1 compute-free PARTIAL coverage at 10/10 touched.
Stacked follow-up to SHIP-003 (#1028) + SHIP-004 (#1029).

Spec changes:
- **Version:** 2.31.0 → 2.32.0
- Date line appended with v2.32.0 entry describing the three pure verdict
  fns in `crates/aprender-core/src/format/ship_001.rs`, the three bound
  constants (AC_SHIP1_001_SAFETENSORS_HEADER_PREFIX_LEN = 8,
  AC_SHIP1_001_SAFETENSORS_JSON_OPEN_BYTE = 0x7B, Result-boundary), the
  triple mutation survey (Result × header-size × open-byte), `cargo test
  -p aprender-core --lib format::ship_001` green (3/3), full-discharge
  blocker (live `realizar::Model::load_safetensors` on RTX 4090 with
  `--features cuda`), MODEL-1 9/10 → 10/10 coverage, and the 16 PARTIAL
  + 3 DISCHARGED aggregate count across both models.
- §4.2 AC-SHIP1-001 row annotated **(PARTIAL_ALGORITHM_LEVEL v2.32.0)**.

Completes MODEL-1 to 10/10 touched; only AC-SHIP1-009 (license /
provenance metadata) remains pending in the MODEL-1 table.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>