docs(ship-two-001): §27 - P3 binding criterion DECIDED, SHIP-007 confirmed APR-side (ratio=18.23×) - spec v2.71.0 → v2.72.0 (#1084)
Merged
…irmed APR-side at inference.rs:160-164 - spec v2.71.0 → v2.72.0

Live evidence on noah-Lambda-Vector RTX 4090, 2026-04-27. Built apr from the PR #1083 branch (commits 77c016b + c657968 + f249464 from the PR A+B+C cascade). Ran `apr trace --payload` on the canonical 7B teacher in both formats with an identical prompt and tokenizer.

Result:

| Layer | APR ffn_swigl std | GGUF ffn_swigl std | Ratio |
|------:|------------------:|-------------------:|------:|
| 3 | 1.2216 | 0.0670 | 18.23x |

§26.4 binding criterion threshold: ≥10x → APR-side bug. **Observed 18.23x, which is 8.23x above the threshold (about 1.8 times it): decisive verdict.**

The investigation chain that started in §15.4 (GPU GQA elimination) has reached its conclusion at §27:

§15.4 → §16 → §17 → §23 → §27 (this)
"Whole forward path" → "GPU eliminated" → "(layer=3, FFN sub-block)" → "(layer=3, ffn_swigl)" → "**APR-side at inference.rs:160-164**"

Cascade-damping signature confirmed:
- Layers 0-2: ratio ~1.1x (normal)
- Layer 3: 18.23x (anomaly)
- Layers 4-5: 3.3-4.5x (cascade)
- Layer 6+: ~1x (recovered)

This is consistent with a localized perturbation (off-by-one, buffer aliasing, or an F32-vs-Q4K dequant defect at layer 3 specifically) rather than persistent residual-stream corruption.

Per §17.5, the SHIP-007 fix discharges 5 MODEL-1 PARTIALs at once (SHIP-002/005/006/007/008). §26.5 expected coverage flip: 33+12 → 28+17 when the fix lands. §27 does NOT discharge by itself; it locates the bug for fixing.

Next investigation reads `inference.rs:160-164` and tests 4 hypotheses:
1. Off-by-one slice indexing
2. Buffer aliasing (scratch reuse pattern)
3. F32-vs-Q4K dequant defect at layer-3 input range
4. Activation overflow (SiLU saturation amplifies multiply)

Methodology held throughout: zero eprintln!, zero route-arounds, apr is canonical (§26.8), all instrumentation via `apr trace --payload`. Lambda-labs lane pre-authorized.

Evidence persisted to evidence/ship-007-apr-vs-gguf-2026-04-27/:
- apr-trace.txt (13.5 KB)
- gguf-trace.txt (13.7 KB)
- binding-criterion-summary.json

Note: §27 reproduction requires the PR #1081 + #1082 + #1083 cascade to merge first (the `apr trace --payload <gguf>` wiring is in PR C). Evidence was generated with a local build of the PR #1083 branch.

Spec v2.71.0 → v2.72.0. Coverage flip pending fix.

Spec: SPEC-SHIP-TWO-001 §26.4 P3 verdict

References:
- §15.4 (PR #1062): GPU GQA eliminated
- §16 (PR #1063): APR CPU isolated
- §17 (PR #1064): layer-3 FFN sub-block
- §23 (PR #1075): layer-3 ffn_swigl named
- §26.8 (PR #1079): apr-is-canonical methodology rule
- PR #1081 (P3 PR A scaffold)
- PR #1082 (P3 PR B sub-FFN populate)
- PR #1083 (P3 PR C CLI wiring)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
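The verdict arithmetic above can be re-derived in a few lines. This is a minimal Python sketch, not the project's tooling: the function and constant names are hypothetical, and only the two standard deviations and the ≥10x threshold come from the commit message.

```python
# Figures reported for layer 3 in the commit message.
APR_FFN_SWIGL_STD = 1.2216
GGUF_FFN_SWIGL_STD = 0.0670
# §26.4 binding criterion: a ratio >= 10x is read as an APR-side bug.
APR_SIDE_THRESHOLD = 10.0


def std_ratio(apr_std: float, gguf_std: float) -> float:
    """Return the APR std divided by the GGUF std for one layer."""
    if gguf_std <= 0.0:
        raise ValueError("GGUF std must be positive")
    return apr_std / gguf_std


def is_apr_side(ratio: float, threshold: float = APR_SIDE_THRESHOLD) -> bool:
    """Apply the binding criterion (hypothetical helper name)."""
    return ratio >= threshold


ratio = std_ratio(APR_FFN_SWIGL_STD, GGUF_FFN_SWIGL_STD)
margin = ratio - APR_SIDE_THRESHOLD
print(f"ratio = {ratio:.2f}x, APR-side: {is_apr_side(ratio)}")
print(f"margin above threshold = {margin:.2f}x")
```

The margin is the difference between the observed ratio and the threshold, which is the sense in which the result sits roughly 8x past the 10x line.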
noahgift added a commit that referenced this pull request on Apr 27, 2026
…oard + critical-path map - spec v2.73.0 → v2.74.0 (#1087)

Session-end snapshot consolidating today's 10-PR cascade into a single source of truth for the next session. The goal: ship two models to HF, both built end-to-end on the in-tree Sovereign AI Stack.

Coverage scoreboard EOD 2026-04-27:

| Category | DISCHARGED | PARTIAL | Total | %D |
|-------------|-----------:|--------:|------:|----:|
| MODEL-1 | 5 | 5 | 10 | 50% |
| MODEL-2 | 3 | 9 | 12 | 25% |
| GPUTRAIN | 7 | 0 | 7 | 100% |
| Ship Gates | - | 12 | 12 | 0% |
| Falsifiers | - | 7 | 7 | 0% |
| Sum | 15 | 33 | 48 | 31% |

Critical path, MODEL-1: PR E (replace helpers::f32_matmul with Q4K-fused dispatch) discharges 5 PARTIALs at one fix site. ~150-300 LOC.

Critical path, MODEL-2: P1.1 (apr pull dataset extension) → P1.4 (corpus pull) → P2 (100K-step training) discharges 9 PARTIALs.

10-PR session cascade (6 merged, 4 open + this):
- #1076-#1080: spec + contract foundation (MERGED)
- #1081: P3 PR A scaffold (MERGED)
- #1082-#1083: P3 PR B+C wiring (OPEN, stacked)
- #1084-#1085: §27/§28 binding criterion + root cause (OPEN)
- #1086: PR D forward-parity contract (OPEN)

Falsification chain (complete, root-reached):

§15.4 → §16 → §17 → §23 → §27 → §28 → PR D contract → PR E (next)
"forward path" → ... → "APR F32 vs GGUF Q4K matmul precision" → "binding criterion as durable spec" → "fix at mod_apr_transformer.rs:138-140"

Methodology preserved: zero eprintln!, zero route-arounds, apr canonical, contract-first, lambda-labs pre-authorized, 5-whys reaches root.

Next session: PR E first (5 ACs), then P1.1 + P1.4 + P2 (9 ACs).

Spec v2.73.0 → v2.74.0. No coverage flip at amendment: §29 is a scoreboard, not a discharge.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
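The scoreboard's totals and %D column follow directly from the per-category counts. The Python sketch below recomputes them; the dictionary layout and function names are hypothetical, and treating the "-" entries as zero discharged items is an assumption that happens to match the reported sum of 15.

```python
# (DISCHARGED, PARTIAL) counts copied from the scoreboard.
# Assumption: "-" in the DISCHARGED column means zero.
SCOREBOARD = {
    "MODEL-1": (5, 5),
    "MODEL-2": (3, 9),
    "GPUTRAIN": (7, 0),
    "Ship Gates": (0, 12),
    "Falsifiers": (0, 7),
}


def percent_discharged(discharged: int, partial: int) -> int:
    """Return the discharged share of a category as a whole percent."""
    total = discharged + partial
    return round(100 * discharged / total) if total else 0


def totals(board: dict) -> tuple:
    """Sum the DISCHARGED and PARTIAL columns across all categories."""
    discharged = sum(d for d, _ in board.values())
    partial = sum(p for _, p in board.values())
    return discharged, partial


sum_d, sum_p = totals(SCOREBOARD)
for name, (d, p) in SCOREBOARD.items():
    print(f"{name:<11} {d:>2} {p:>2} {d + p:>2} {percent_discharged(d, p):>3}%")
print(f"{'Sum':<11} {sum_d:>2} {sum_p:>2} {sum_d + sum_p:>2} "
      f"{percent_discharged(sum_d, sum_p):>3}%")
```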
noahgift added a commit that referenced this pull request on Apr 27, 2026
…ct codifying the §28 binding criterion

PR D of the SHIP-TWO-001 §28.8 falsifiable PR sequence. Authors a provable contract that defines the per-layer ffn_swigl parity binding criterion as durable spec. Status PROPOSED until PR E (the actual fix replacing helpers::f32_matmul with Q4K-fused matmul dispatch) lands.

3 equations:
- per_layer_ffn_swigl_parity: r_i = APR.std / GGUF.std ∈ [0.5, 2.0] for all i ∈ [0, 28). Currently FAILS at layer 3 (r_3 = 18.23×).
- divergence_starts_at_gate_matmul: §28 evidence shows divergence originates at the gate-projection matmul (1.36×) and is amplified by silu (4.59×) into the 18.23× ffn_swigl ratio.
- fix_must_match_gguf_kernel_path: per §28.4, the fix replaces f32_matmul with fused_q4k_q8k_parallel_matvec_into when weight.qtype == GGUF_TYPE_Q4_K.

6 falsification tests:
- FALSIFY-APR-GGUF-PARITY-001: per-layer ffn_swigl ratio bounds
- -002: layer 3 specifically
- -003: gate matmul precision is the root cause (Toyota Way enforcement; prevents a route-around fix at silu_g*u)
- -004: pv validate
- -005: F32-native paths unchanged
- -006: apr trace --payload still emits ffn_swigl on GGUF

4 proof obligations + 2 Kani harnesses with bounds.

Validation:

    $ pv validate contracts/apr-vs-gguf-forward-parity-v1.yaml
    0 error(s), 0 warning(s)
    Contract is valid.

    $ pv score contracts/apr-vs-gguf-forward-parity-v1.yaml
    apr-vs-gguf-forward-parity-v1 - 0.71 (Grade C)
    Spec: 0.70 | Falsify: 1.00 | Kani: 0.25 | Lean: 0.50 | Bind: 1.00

Status: PROPOSED. Promotion to ACTIVE requires:
- PR E lands (replaces f32_matmul with Q4K-fused dispatch)
- Live drift-prevention test PASSES on canonical 7B teacher
- All 6 FALSIFY-APR-GGUF-PARITY-* gates pass

On PR E success:
- Coverage flip 33+12 → 28+17 (§26.5 / §28.9)
- Discharges SHIP-002, SHIP-005, SHIP-006, SHIP-007, SHIP-008 (5 MODEL-1 PARTIALs transitively gated on §17.5)

This PR (D) ships the binding criterion as durable spec. PR E ships the fix. §29 records the discharge.

Spec: SPEC-SHIP-TWO-001 §28.8

References:
- §27 (PR #1084): P3 binding criterion verdict (18.23× ratio)
- §28 (PR #1085): root cause refined to F32 vs Q4K matmul
- evidence/ship-007-apr-vs-gguf-2026-04-27/: full sub-FFN bisection
- feedback_fix_root_cause_never_route_around.md
- contracts/qwen2-e2e-verification-v1.yaml (sibling MODEL-1 contract)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
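The per_layer_ffn_swigl_parity equation amounts to a bounds check over 28 per-layer ratios. The Python sketch below shows one way such a check could look. It is not the contract's actual test harness: the function name is hypothetical, and every input value except the layer-3 pair (1.2216 and 0.0670) is a placeholder chosen only to make the example run.

```python
# Bounds and layer count taken from the contract equation:
# r_i = APR.std / GGUF.std must lie in [0.5, 2.0] for all i in [0, 28).
LOWER, UPPER = 0.5, 2.0
NUM_LAYERS = 28


def failing_layers(apr_std, gguf_std, lower=LOWER, upper=UPPER):
    """Return (layer, ratio) pairs whose std ratio falls outside the bounds."""
    if len(apr_std) != len(gguf_std):
        raise ValueError("both traces must cover the same number of layers")
    failures = []
    for layer, (apr, gguf) in enumerate(zip(apr_std, gguf_std)):
        ratio = apr / gguf
        if not (lower <= ratio <= upper):
            failures.append((layer, ratio))
    return failures


# Placeholder traces: every layer matches except layer 3, which uses the
# reported APR and GGUF standard deviations.
apr_trace = [1.0] * NUM_LAYERS
gguf_trace = [1.0] * NUM_LAYERS
apr_trace[3], gguf_trace[3] = 1.2216, 0.0670

failures = failing_layers(apr_trace, gguf_trace)
for layer, ratio in failures:
    print(f"layer {layer}: ratio {ratio:.2f}x outside [{LOWER}, {UPPER}]")
```

Returning the offending layers rather than a single boolean keeps the check useful for both the all-layers gate and the layer-3-specific one.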
Summary
§26.4 P3 binding criterion DECIDED via live evidence on noah-Lambda-Vector RTX 4090, 2026-04-27.
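§26.4 outcome matrix: a ratio of ≥10× indicates an APR-side bug. Observed at layer 3: APR ffn_swigl std 1.2216 vs. GGUF ffn_swigl std 0.0670, a ratio of 18.23×.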
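Verdict: SHIP-007 is an APR-side bug at `crates/aprender-serve/src/apr_transformer/inference.rs:160-164` (the `silu_g * u` element-wise multiply). 18.23× is 8.23× above the 10× threshold (about 1.8 times it), so there is no statistical wiggle room.
Investigation chain (fully closed)
§15.4 → §16 → §17 → §23 → §27 (this PR)
"Whole forward path" → "GPU eliminated" → "(layer=3, FFN sub-block)" → "(layer=3, ffn_swigl)" → "APR-side at inference.rs:160-164"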
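For reference, the `silu_g * u` multiply named in the verdict is the standard SwiGLU gating step: SiLU applied to the gate projection, multiplied element-wise by the up projection. The Python sketch below illustrates that operation and a standard deviation over its output. It is not the code at `inference.rs:160-164`; the function names are hypothetical, and using the population standard deviation is an assumption about how the trace reports `std`.

```python
import math
import statistics


def silu(x: float) -> float:
    """SiLU activation, x * sigmoid(x), written to avoid overflow in exp."""
    if x >= 0.0:
        return x / (1.0 + math.exp(-x))
    e = math.exp(x)
    return x * e / (1.0 + e)


def swiglu(gate, up):
    """Element-wise silu(gate) * up over two equal-length vectors."""
    if len(gate) != len(up):
        raise ValueError("gate and up must have the same length")
    return [silu(g) * u for g, u in zip(gate, up)]


def activation_std(values) -> float:
    """Population standard deviation (assumed definition of the trace std)."""
    return statistics.pstdev(values)


# Arbitrary illustrative inputs, not values from the evidence files.
gate = [-2.0, -0.5, 0.0, 0.5, 2.0]
up = [1.0, -1.0, 3.0, 0.5, 2.0]
out = swiglu(gate, up)
print([round(v, 4) for v in out], round(activation_std(out), 4))
```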
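Cascade-damping signature
- Layers 0-2: ratio ~1.1× (normal)
- Layer 3: 18.23× (anomaly)
- Layers 4-5: 3.3-4.5× (cascade)
- Layer 6+: ~1× (recovered)
Localized perturbation at layer 3, consistent with an off-by-one or buffer-aliasing bug rather than persistent residual-stream corruption.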
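The damping pattern can be read mechanically from a list of per-layer ratios: find the peak, then the first later layer that drops back under a "normal" bound. The Python sketch below shows that reading. The function name and the 2.0 bound are hypothetical, and the ratio list is illustrative: only the layer-3 value of 18.23 is a reported figure, while the other entries are placeholders chosen to fall inside the ranges reported for their layers.

```python
def damping_signature(ratios, normal_below=2.0):
    """Locate the peak layer, the cascade layers after it, and the recovery layer.

    normal_below is a hypothetical bound under which a ratio counts as normal.
    """
    peak = max(range(len(ratios)), key=lambda i: ratios[i])
    recovered_at = next(
        (i for i in range(peak + 1, len(ratios)) if ratios[i] < normal_below),
        None,
    )
    end = recovered_at if recovered_at is not None else len(ratios)
    return {
        "peak": peak,
        "cascade": list(range(peak + 1, end)),
        "recovered_at": recovered_at,
    }


# Illustrative per-layer ratios for layers 0..7 (placeholders except layer 3).
illustrative_ratios = [1.1, 1.1, 1.1, 18.23, 4.5, 3.3, 1.0, 1.0]
signature = damping_signature(illustrative_ratios)
print(signature)
```

A persistent corruption of the residual stream would show no recovery layer at all, which is the case the `None` return value is meant to flag.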
Discharge consequence
Per §17.5, SHIP-007 fix discharges 5 MODEL-1 PARTIALs at once: SHIP-002/005/006/007/008.
§26.5 expected coverage flip: 33+12 → 28+17 when fix lands.
§27 does NOT discharge by itself — it locates the bug for fixing.
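The expected flip is plain bookkeeping: five items move from one column to the other. The Python sketch below reproduces it, reading "33+12" as PARTIAL plus DISCHARGED counts, which is an assumption about the notation; the function name is hypothetical.

```python
# The five MODEL-1 PARTIALs that the SHIP-007 fix is expected to discharge.
SHIP_007_DISCHARGES = ["SHIP-002", "SHIP-005", "SHIP-006", "SHIP-007", "SHIP-008"]


def coverage_flip(partial: int, discharged: int, newly_discharged) -> tuple:
    """Move the newly discharged items from the PARTIAL to the DISCHARGED count."""
    moved = len(newly_discharged)
    if moved > partial:
        raise ValueError("cannot discharge more items than are PARTIAL")
    return partial - moved, discharged + moved


before = (33, 12)
after = coverage_flip(*before, SHIP_007_DISCHARGES)
print(f"{before[0]}+{before[1]} -> {after[0]}+{after[1]}")
```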
Reproduction
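§27 reproduction requires the PR #1081 + #1082 + #1083 cascade to merge first, because the `apr trace --payload <gguf>` wiring is in #1083. The evidence was generated with a local build of the PR #1083 branch (commit f249464) by running `apr trace --payload` on the canonical 7B teacher in both APR and GGUF formats with an identical prompt and tokenizer.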
Methodology
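- Zero `eprintln!` (all instrumentation via `apr trace --payload`)
- `apr` is canonical (§26.8): the trace primitive lives in apr-cli, not in a sidecar tool
Evidence persisted
evidence/ship-007-apr-vs-gguf-2026-04-27/:
- apr-trace.txt (13.5 KB)
- gguf-trace.txt (13.7 KB)
- binding-criterion-summary.json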
Test plan
🤖 Generated with Claude Code