contract(trace-ffn-sub-block-gguf-v1): v1.0.0 → v1.1.0 — §27 evidence integrated, M-FFN-GGUF-3 DISCHARGED by noahgift · Pull Request #1534 · paiml/aprender

noahgift · 2026-05-06T13:54:08Z

Summary

Same-day follow-up to M88 and M89. ship-two-models-spec.md v2.72.0 §27 records that the H1/H2 bisection had already been LIVE-run on noah-Lambda-Vector (RTX 4090) on 2026-04-27. Verdict: H2 CONFIRMED (APR-side bug, ratio 18.23×).

§27 evidence

Metric	Value
APR layer-3 ffn_swigl std	1.2216
GGUF layer-3 ffn_swigl std	0.0670
Ratio	18.23× (far exceeds §26.4 ≥10× threshold)
Verdict	H2 CONFIRMED (APR-side bug)
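
The arithmetic behind the table can be sketched as follows. This is a minimal illustration, not the harness from the spec: the function names are hypothetical, and the use of a population standard deviation is an assumption, since the source does not say which estimator §27 used.

```rust
/// Population standard deviation of one layer's activations.
/// Assumption: the spec does not state population vs. sample std.
fn std_dev(xs: &[f32]) -> f64 {
    if xs.is_empty() {
        return 0.0;
    }
    let n = xs.len() as f64;
    let mean = xs.iter().map(|&x| x as f64).sum::<f64>() / n;
    let var = xs
        .iter()
        .map(|&x| {
            let d = x as f64 - mean;
            d * d
        })
        .sum::<f64>()
        / n;
    var.sqrt()
}

/// Ratio of the APR-side std to the GGUF-side std for the same tensor.
fn std_ratio(apr_std: f64, gguf_std: f64) -> f64 {
    apr_std / gguf_std
}

/// The §26.4 criterion as quoted in this PR: ratio >= threshold (10x).
fn h2_confirmed(ratio: f64, threshold: f64) -> bool {
    ratio >= threshold
}
```

With the §27 values, `std_ratio(1.2216, 0.0670)` is approximately 18.23, so `h2_confirmed(ratio, 10.0)` returns `true`.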

Status promotions

Field	Before (v1.0.0)	After (v1.1.0)
metadata.status	PROPOSED	ACTIVE_ALGORITHM_LEVEL
M-FFN-GGUF-3 stage	ALGORITHM_LEVEL_DISCHARGED	DISCHARGED
FALSIFY-FFN-GGUF-003	PROPOSED	DISCHARGED

Why this needed a separate amendment

Methodology lesson #2 firing in retrospect: had I grep'd the spec for §22 / §27 BEFORE authoring M88's contract scaffold, the M-FFN-GGUF-3 status would have been DISCHARGED at v1.0.0 instead of needing this v1.1.0 follow-up amendment. The M89 harness (PR #1533) adds regression-test coverage but the §27 data remains the canonical operator-dispatched discharge proof.

Remaining work

Only M-FFN-GGUF-4 (SHIP-007 fix PR) remains PENDING — gated on engineering investigation of inference.rs SwiGLU site at :298-302 (line shifted from spec's :160-164 after sub-FFN telemetry was added in PR #1066).

The v1.1.0 amendment records three candidate hypotheses for the layer-3-specific behavior within the SwiGLU block, to guide the M-FFN-GGUF-4 investigation:

H2a: Buffer aliasing / scratch-buffer corruption in APR multi-token forward
H2b: Layer-3-specific upstream divergence (gate or up at L3 only)
H2c: Quantization dequant alignment differs at certain layer configs

Test plan

pv validate 0/0
No production code touched (YAML-only)
§27 evidence cited verbatim from ship-two-models-spec.md v2.72.0

🤖 Generated with Claude Code

… integrated, M-FFN-GGUF-3 DISCHARGED

Same-day post-M88+M89 follow-up: ship-two-models-spec.md v2.72.0 §27 records that the H1/H2 bisection has ALREADY been LIVE-run on noah-Lambda-Vector RTX 4090 on 2026-04-27 (built `apr` from PR #1083 branch + commits 77c016b + c657968 + f249464):

APR layer-3 ffn_swigl std = 1.2216
GGUF layer-3 ffn_swigl std = 0.0670
Ratio = 18.23×
Verdict = H2 CONFIRMED (APR-side bug)

The ratio exceeds the §26.4 ≥10× threshold by about 8× in absolute terms.

Status promotions in v1.1.0:
- M-FFN-GGUF-3 implementation_stage: ALGORITHM_LEVEL_DISCHARGED → DISCHARGED
- FALSIFY-FFN-GGUF-003: PROPOSED → DISCHARGED
- contract metadata.status: PROPOSED → ACTIVE_ALGORITHM_LEVEL

The M89 PR #1533 harness (falsify_ffn_gguf_003_layer_3_swigl_h1_h2_bisection) adds regression-test coverage for any future re-run; the §27 data remains the canonical operator-dispatched discharge proof.

Only M-FFN-GGUF-4 (SHIP-007 fix PR) remains PENDING, gated on engineering investigation of the `inference.rs` SwiGLU site (now at :298-302; the §22 spec was authored against :160-164, before sub-FFN telemetry shifted the lines).

3 candidate hypotheses for the layer-3-specific behavior within the SwiGLU block, authored in the v1.1.0 amendment for the M-FFN-GGUF-4 investigation:
- H2a: Buffer aliasing / scratch-buffer corruption in APR multi-token
- H2b: Layer-3-specific upstream divergence (gate or up at L3 only)
- H2c: Quantization dequant alignment differs at certain layer configs

YAML-only: production hot paths byte-unchanged (this amendment records pre-existing §27 evidence + corrects status drift).

Methodology lesson #2 firing in retrospect: had I grep'd the spec for §22 / §27 BEFORE authoring M88's contract scaffold, the M-FFN-GGUF-3 status would have been DISCHARGED at v1.0.0 instead of needing this v1.1.0 follow-up amendment.

`pv validate` 0/0.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…tion hypothesis FALSIFIED (#1535)

Authors 2 lib-only determinism falsifiers (FALSIFY-FFN-GGUF-005) in apr_transformer::helpers::determinism_tests:
- falsify_ffn_gguf_005_f32_matmul_byte_deterministic_above_parallel_threshold
- falsify_ffn_gguf_005b_f32_matmul_byte_deterministic_below_parallel_threshold

Both tests run `f32_matmul` TWICE with identical synthetic inputs (out_dim above + below F32_PARALLEL_THRESHOLD=256) and assert byte-identical output via f32::to_bits() comparison.

BOTH TESTS PASS on first run. APR's f32_matmul (and the underlying f32_matvec_parallel rayon-parallel kernel) is byte-deterministic across repeated calls. This FALSIFIES the §28 parallel-reduction hypothesis at the kernel level. The §27 layer-3 18.23× drift is NOT caused by APR being non-deterministic with itself.

REFINED HYPOTHESIS (post-§28 falsification): The cumulative APR↔GGUF drift must be a DIFFERENCE between APR's and GGUF's reduction order, not non-determinism within APR. APR uses simd_dot_f32_avx2 (4-wide FMA, 8-element AVX2 chunks); GGUF uses fused_q4k_q8k_parallel_matvec_into (different unroll + block boundaries). F32 sum-of-products is non-associative; different unroll → different bit-level results.

NEXT M-FFN-GGUF-4 INVESTIGATION STEP: Cross-implementation deterministic-difference test: run APR's f32_matmul AND GGUF's fused_q4k_q8k_parallel_matvec_into on byte-identical synthetic inputs and assert whether outputs match. If they differ at the bit level, fix scope = align reduction order.

Contract amendment: trace-ffn-sub-block-gguf-v1 v1.1.0 → v1.2.0.

Status promotions:
- FALSIFY-FFN-GGUF-005: NEW → DISCHARGED (tests pass)
- M-FFN-GGUF-4 step (a): PENDING → SHIPPED

Stages (b) cross-impl diff + (c) fix remain PENDING.

Methodology lesson #2 firing prophylactically: branched off main AFTER #1534 (v1.1.0) merged to avoid cascade-ordering rebase conflict. Verified by `git rebase origin/main` succeeding cleanly with the v1.1.0 amendment intact.

`pv validate` 0/0; 2 lib tests pass; production hot paths byte-unchanged.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
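
The distinction the commit message draws, between a kernel that is deterministic with itself and two kernels that disagree because of reduction order, can be illustrated with a small sketch. The functions below are hypothetical stand-ins, not `f32_matmul`, `simd_dot_f32_avx2`, or `fused_q4k_q8k_parallel_matvec_into`; they only assume that one implementation accumulates strictly left to right while the other keeps several independent partial sums, as unrolled or SIMD kernels typically do.

```rust
/// Hypothetical scalar kernel: strict left-to-right accumulation.
fn dot_sequential(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).fold(0.0f32, |acc, (x, y)| acc + x * y)
}

/// Hypothetical unrolled kernel: LANES independent partial sums that are
/// combined only at the end (LANES must be non-zero).
fn dot_lanes<const LANES: usize>(a: &[f32], b: &[f32]) -> f32 {
    let mut acc = [0.0f32; LANES];
    for (i, (x, y)) in a.iter().zip(b).enumerate() {
        acc[i % LANES] += x * y;
    }
    acc.iter().sum()
}

/// Byte-level equality on the IEEE-754 bit pattern, the same style of
/// comparison the falsifiers describe (f32::to_bits).
fn bits_equal(x: f32, y: f32) -> bool {
    x.to_bits() == y.to_bits()
}
```

For the input `[1.0e8, 1.0, -1.0e8, 1.0]` with all-ones weights, each function is bit-identical across repeated calls, yet `dot_sequential` returns `1.0` while `dot_lanes::<2>` returns `2.0`: the sequential order absorbs the first `1.0` into `1.0e8` before the cancellation, and the two-lane order does not. Whether the real APR and GGUF kernels diverge this way is exactly what the pending stage (b) test is meant to establish.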

noahgift enabled auto-merge (squash) May 6, 2026 13:54

noahgift merged commit ea148db into main May 6, 2026
11 checks passed

noahgift deleted the contract/ffn-gguf-v1.1.0-section-27-evidence branch May 6, 2026 14:17

noahgift mentioned this pull request May 6, 2026

feat(M-FFN-GGUF-4 step a): determinism falsifier — §28 parallel-reduction hypothesis FALSIFIED #1535

Merged

