feat(falsify-ship-003): MODEL-1 apr convert q4_k_m per-layer cos ≥ 0.999 PARTIAL discharge (8/10) by noahgift · Pull Request #1028 · paiml/aprender

noahgift · 2026-04-23T16:59:41Z

Summary

SHIP-TWO-001 spec v2.29.0 → v2.30.0: 8th compute-free MODEL-1 PARTIAL lever binding AC-SHIP1-003 (per-layer cosine similarity after apr convert --quantize q4_k_m) to pure verdict functions at discharge_status: PARTIAL_ALGORITHM_LEVEL.
Contract bump: contracts/qwen2-e2e-verification-v1.yaml v1.3.0 → v1.4.0 ACTIVE with FALSIFY-QW2E-SHIP-003 annotated with 3 evidence_discharged_by pins, full_discharge_blocks_on (real 7B .apr + 28×7=196 projection matrix harness on RTX 4090), and 7 counter-example classes.
New Rust binding: crates/aprender-core/src/format/ship_003.rs
- const AC_SHIP1_003_MIN_COSINE_SIMILARITY: f32 = 0.999
- verdict_from_cosine_similarity(sim, threshold) — f32-threshold with [-1.0, 1.0] range guard + non-finite rejection
- verdict_from_per_layer_cosines(sims, threshold) — aggregate-AND; empty → Fail; short-circuit on first Fail
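The binding listed above can be sketched as follows. This is a minimal sketch, not the contents of `ship_003.rs`: the constant, enum, and function names are taken from this PR's commit message, but the bodies are assumptions, in particular that the range guard also applies to `threshold` and that the boundary comparison is inclusive (`>=`).

```rust
/// Minimum per-layer cosine similarity for AC-SHIP1-003 (value from the PR).
pub const AC_SHIP1_003_MIN_COSINE_SIMILARITY: f32 = 0.999;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Ship003Verdict {
    Pass,
    Fail,
}

/// Single-number threshold verdict.
/// Assumed behaviour: any non-finite or out-of-range input is a Fail.
pub fn verdict_from_cosine_similarity(sim: f32, threshold: f32) -> Ship003Verdict {
    // Non-finite rejection: NaN and +/- infinity on either argument.
    if !sim.is_finite() || !threshold.is_finite() {
        return Ship003Verdict::Fail;
    }
    // Range guard: a cosine similarity must lie in [-1.0, 1.0].
    let in_range = |x: f32| (-1.0..=1.0).contains(&x);
    if !in_range(sim) || !in_range(threshold) {
        return Ship003Verdict::Fail;
    }
    if sim >= threshold {
        Ship003Verdict::Pass
    } else {
        Ship003Verdict::Fail
    }
}

/// Aggregate-AND over a per-layer vector: empty input fails, and the
/// loop returns on the first failing element.
pub fn verdict_from_per_layer_cosines(sims: &[f32], threshold: f32) -> Ship003Verdict {
    if sims.is_empty() {
        return Ship003Verdict::Fail;
    }
    for &sim in sims {
        if verdict_from_cosine_similarity(sim, threshold) == Ship003Verdict::Fail {
            return Ship003Verdict::Fail;
        }
    }
    Ship003Verdict::Pass
}
```

Treating an empty slice as Fail is the conservative choice: a harness that produced no measurements cannot vacuously satisfy the criterion.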
Twin mutation surveys (15 sections total):
1. falsify_ship_003_cosine_similarity_threshold_logic — 8 sections: exact boundary, ULP-below, safe-above/below bands, monotonic sweep, non-finite, out-of-range, provenance pin.
2. falsify_ship_003_per_layer_aggregate_and — 7 sections: all-Pass 196, single-Fail, all-Fail, empty-Fail, single-element, first-layer NaN/OOR short-circuit, last-layer Fail.
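The exact-boundary, ULP-below, and monotonic-sweep sections of the first survey can be illustrated with a short sketch. The helpers `ulp_below`, `passes`, and `sweep_is_monotonic` are hypothetical names introduced here for illustration, and `passes` is a stand-in for the threshold rule rather than the function under test.

```rust
/// Hypothetical helper: the next representable f32 below a positive,
/// finite `x`, obtained by decrementing its IEEE 754 bit pattern.
fn ulp_below(x: f32) -> f32 {
    assert!(x.is_finite() && x > 0.0, "only valid for positive finite x");
    f32::from_bits(x.to_bits() - 1)
}

/// Stand-in threshold rule (an assumption): finite, in [-1, 1], and >= threshold.
fn passes(sim: f32, threshold: f32) -> bool {
    sim.is_finite() && (-1.0..=1.0).contains(&sim) && sim >= threshold
}

/// Sweeps 0.9900..=1.0000 in steps of 1e-4 and checks that the verdict
/// never flips from pass back to fail as the similarity rises.
/// Values are built from integers so the last step is exactly 1.0.
fn sweep_is_monotonic(threshold: f32) -> bool {
    let mut seen_pass = false;
    for step in 9_900u32..=10_000 {
        let sim = step as f32 / 10_000.0;
        let pass = passes(sim, threshold);
        if seen_pass && !pass {
            return false;
        }
        seen_pass |= pass;
    }
    true
}
```

With a threshold of `0.999_f32` (bit pattern `0x3F7FBE77`), `passes(0.999, 0.999)` holds while `passes(ulp_below(0.999), 0.999)` does not, which is the one-ULP separation the boundary sections are designed to pin down.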
Coverage: MODEL-1 7/10 → 8/10; 14 PARTIAL + 3 DISCHARGED across both models.
First MODEL-1 PARTIAL to combine a single-number threshold (SHIP-007/SHIP-020 shape) with an aggregate-AND combinator (SHIP-016 shape) in one discharge.

Test plan

cargo test -p aprender-core --lib format::ship_003 — 2/2 passed
pv validate contracts/qwen2-e2e-verification-v1.yaml — 0 errors, 0 warnings
Cherry-pick conflict resolution layered SHIP-003 (v2.30.0) on top of SHIP-007 (v2.29.0) and SHIP-010 (v2.28.0) without dropping history
Full discharge blocks on live apr convert --quantize q4_k_m paiml/qwen2.5-coder-7b-apache-q4k-v1.safetensors on RTX 4090 + apr diff per-layer cosine harness (separate compute-dispatch task)
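For context on what the blocked harness would have to measure, here is a minimal sketch of cosine similarity between a reference tensor and its quantized-then-dequantized counterpart, both flattened to `&[f32]`. It is not the `apr diff` implementation; the function name, the `Option` return, and the f64 accumulation are assumptions made for illustration.

```rust
/// Cosine similarity of two equal-length vectors.
/// Returns None when the lengths differ, the input is empty, or a norm is
/// zero or non-finite, so callers cannot mistake a degenerate case for a score.
fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() || a.is_empty() {
        return None;
    }
    // Accumulate in f64 to limit rounding error over large tensors.
    let (mut dot, mut norm_a, mut norm_b) = (0.0_f64, 0.0_f64, 0.0_f64);
    for (&x, &y) in a.iter().zip(b) {
        let (x, y) = (f64::from(x), f64::from(y));
        dot += x * y;
        norm_a += x * x;
        norm_b += y * y;
    }
    let denom = norm_a.sqrt() * norm_b.sqrt();
    if !denom.is_finite() || denom == 0.0 {
        return None;
    }
    // Clamp so rounding cannot push the result outside [-1, 1].
    Some((dot / denom).clamp(-1.0, 1.0) as f32)
}
```

One such value per projection matrix would give the 196-element vector (28 layers × 7 projections) that the aggregate verdict consumes.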

🤖 Generated with Claude Code

…999 PARTIAL discharge (7/10)
SHIP-TWO-001 spec v2.27.0 → v2.28.0: 7th compute-free MODEL-1 PARTIAL lever, binding AC-SHIP1-003 (per-layer cosine similarity after `apr convert --quantize q4_k_m`) to pure verdict functions at `discharge_status: PARTIAL_ALGORITHM_LEVEL`.
New: `crates/aprender-core/src/format/ship_003.rs`
- const AC_SHIP1_003_MIN_COSINE_SIMILARITY: f32 = 0.999
- enum Ship003Verdict { Pass, Fail }
- fn verdict_from_cosine_similarity(sim: f32, threshold: f32) -> Ship003Verdict (f32-threshold with range guard + non-finite rejection)
- fn verdict_from_per_layer_cosines(sims: &[f32], threshold: f32) -> Ship003Verdict (aggregate-AND over per-layer vector; empty → Fail; short-circuit on first Fail)
Twin mutation surveys:
1. falsify_ship_003_cosine_similarity_threshold_logic — 8 sections: exact boundary, ULP-below (`f32::from_bits(0x3F7FBE77 - 1)`), safe-above {0.9999, 1.0}, safe-below {0.998, 0.5, 0.0, -1.0}, monotonic sweep [0.990..1.0] step 1e-4, non-finite (NaN/+∞/-∞) on both sim+threshold, out-of-range guards ({-1.5, 1.5, -2.0, 2.0}), provenance pin assert_eq const == 0.999_f32.
2. falsify_ship_003_per_layer_aggregate_and — 7 sections: all-Pass 196 (28 layers × 7 projections), single-Fail at index 100, all-Fail, empty-Fail (conservative), single-element both directions, first-layer NaN/OOR short-circuit, last-layer Fail not short-circuited.
Contract: contracts/qwen2-e2e-verification-v1.yaml v1.2.0 → v1.3.0 ACTIVE. FALSIFY-QW2E-SHIP-003 now annotated with `discharge_status: PARTIAL_ALGORITHM_LEVEL`, 3 `evidence_discharged_by` test pins, `full_discharge_blocks_on` (real 7B .apr + 28×7=196 projection matrix harness on RTX 4090), and 7 counter_example_classes (regressed_quantizer, drifted_floor, relaxed_rule, empty_vector_pass, range_guard_bypass, nan_promoted, sign_flipped_quantizer).
Spec: docs/specifications/aprender-train/ship-two-models-spec.md v2.27.0 → v2.28.0. AC-SHIP1-003 row annotated `FALSIFY-SHIP-003 **(PARTIAL_ALGORITHM_LEVEL v2.28.0)**`.
Changelog documents first MODEL-1 PARTIAL combining single-number threshold shape (mirrors SHIP-007/SHIP-020) with aggregate-AND combinator (mirrors SHIP-016) in one discharge.
Coverage: MODEL-1 6/10 → 7/10; 13 PARTIAL + 3 DISCHARGED across both models.
Verification:
- `cargo test -p aprender-core --lib format::ship_003` → 2 passed / 0 failed
- `cargo fmt -p aprender-core --check` → clean
- `pv validate contracts/qwen2-e2e-verification-v1.yaml` → 0 errors, 0 warnings
Full discharge blocks on: MODEL-2 lambda-labs 7B .apr + real 196-projection cosine-parity harness runner (separate task #126 compute-dispatch).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Completes the MODEL-1 compute-free PARTIAL coverage at 10/10 touched. Stacked follow-up to SHIP-003 (#1028) + SHIP-004 (#1029).
Spec changes:
- **Version:** 2.31.0 → 2.32.0
- Date line appended with v2.32.0 entry describing the three pure verdict fns in `crates/aprender-core/src/format/ship_001.rs`, the three bound constants (AC_SHIP1_001_SAFETENSORS_HEADER_PREFIX_LEN = 8, AC_SHIP1_001_SAFETENSORS_JSON_OPEN_BYTE = 0x7B, Result-boundary), the triple mutation survey (Result × header-size × open-byte), `cargo test -p aprender-core --lib format::ship_001` green (3/3), full-discharge blocker (live `realizar::Model::load_safetensors` on RTX 4090 with `--features cuda`), MODEL-1 9/10 → 10/10 coverage, and the 16 PARTIAL + 3 DISCHARGED aggregate count across both models.
- §4.2 AC-SHIP1-001 row annotated **(PARTIAL_ALGORITHM_LEVEL v2.32.0)**.
Completes MODEL-1 to 10/10 touched; only AC-SHIP1-009 (license / provenance metadata) remains pending in the MODEL-1 table.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…TIAL discharge (10/10) (#1030)
* WIP: FALSIFY-SHIP-001 PARTIAL — MODEL-1 reproducible build verdict fns
Stacked atop SHIP-003 (f9c2d47) + SHIP-004 (5f1db6a). Pushed as safety net before /tmp clears — NOT PR-ready yet.
Contents:
- crates/aprender-core/src/format/ship_001.rs (NEW): 3 pure verdict fns + 3/3 tests green locally.
- crates/aprender-core/src/format/mod.rs: adds `pub mod ship_001`.
- contracts/qwen2-e2e-verification-v1.yaml: speculative v1.4.0→v1.5.0 bump.
Known follow-up before opening PR in next session:
- Rebase onto main (now at 651e07b / post-SHIP-010) — main already carries publish-manifest-v1 v1.4.0 at SHIP-010, so the qwen2-e2e YAML bump here must be renumbered based on the landing order against current main.
- Stack-push sequence per memory `project_ship_two_001_session_wrap_20260423.md`: SHIP-003 (task #162) → SHIP-004 (#164) → SHIP-001 (#165).
- Full discharge of SHIP-001 blocks on live 3-run reproducible-build harness with sha256 manifest diff on RTX 4090 host.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* spec(falsify-ship-001): bump v2.31→v2.32 for SHIP-001 PARTIAL (10/10)
Completes the MODEL-1 compute-free PARTIAL coverage at 10/10 touched. Stacked follow-up to SHIP-003 (#1028) + SHIP-004 (#1029).
Spec changes:
- **Version:** 2.31.0 → 2.32.0
- Date line appended with v2.32.0 entry describing the three pure verdict fns in `crates/aprender-core/src/format/ship_001.rs`, the three bound constants (AC_SHIP1_001_SAFETENSORS_HEADER_PREFIX_LEN = 8, AC_SHIP1_001_SAFETENSORS_JSON_OPEN_BYTE = 0x7B, Result-boundary), the triple mutation survey (Result × header-size × open-byte), `cargo test -p aprender-core --lib format::ship_001` green (3/3), full-discharge blocker (live `realizar::Model::load_safetensors` on RTX 4090 with `--features cuda`), MODEL-1 9/10 → 10/10 coverage, and the 16 PARTIAL + 3 DISCHARGED aggregate count across both models.
- §4.2 AC-SHIP1-001 row annotated **(PARTIAL_ALGORITHM_LEVEL v2.32.0)**.
Completes MODEL-1 to 10/10 touched; only AC-SHIP1-009 (license / provenance metadata) remains pending in the MODEL-1 table.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

noahgift enabled auto-merge (squash) April 23, 2026 17:00

This was referenced Apr 23, 2026

feat(falsify-ship-004): MODEL-1 apr export gguf → llama.cpp PARTIAL discharge (9/10) #1029

Merged

feat(falsify-ship-001): MODEL-1 realizar::Model::load_safetensors PARTIAL discharge (10/10) #1030

Merged

noahgift merged commit ad9ff4d into main Apr 23, 2026
11 checks passed

noahgift deleted the feat/falsify-ship-003-partial-discharge branch April 23, 2026 17:27

noahgift mentioned this pull request Apr 23, 2026

feat(falsify-ship-009): MODEL-1 apr-provenance multi-bind PARTIAL discharge (10/10 — last MODEL-1 row) #1031

Closed

4 tasks

noahgift merged 1 commit into main from feat/falsify-ship-003-partial-discharge
