feat(M-FFN-GGUF-4 step c, H2d.2): Q4K dequant byte-identity test — H2d.2 ALSO FALSIFIED #1537

Merged: noahgift merged 1 commit into main from feat/m-ffn-gguf-4c-dequant-byte-identity on May 6, 2026
noahgift commented May 6, 2026

Summary

THIRD hypothesis falsification in one session. This PR adds falsify_ffn_gguf_007_q4k_scalar_vs_simd_dequant_byte_identity, which tests whether APR's scalar dequantize_q4_k and SIMD dequantize_q4_k_simd produce a byte-identical Vec<f32> for the same Q4K super-block bytes.

Empirical result: both paths produce byte-identical output across all 256 elements (element[0]=10.75=0x412c0000, element[255]=1.25=0x3fa00000). H2d.2 hypothesis FALSIFIED at the APR-internal dequant level.
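A byte-identity check of this kind can be sketched as follows. This is a minimal illustration, not the test from this PR: `first_bit_mismatch` is a hypothetical helper, and the two vectors stand in for the outputs of the scalar and SIMD dequant routines. It also confirms that the hex patterns quoted above are the IEEE 754 encodings of 10.75 and 1.25.

```rust
/// Hypothetical helper: index of the first element whose bit patterns differ.
fn first_bit_mismatch(a: &[f32], b: &[f32]) -> Option<usize> {
    assert_eq!(a.len(), b.len(), "outputs must have the same length");
    a.iter().zip(b).position(|(x, y)| x.to_bits() != y.to_bits())
}

fn main() {
    // Bit patterns quoted in the PR description.
    assert_eq!(10.75f32.to_bits(), 0x412c_0000);
    assert_eq!(1.25f32.to_bits(), 0x3fa0_0000);

    // Stand-ins for the two dequant outputs (illustrative data only).
    let scalar = vec![10.75f32, 0.0, 1.25];
    let simd = scalar.clone();
    assert_eq!(first_bit_mismatch(&scalar, &simd), None);

    // `==` treats 0.0 and -0.0 as equal; the bit comparison does not.
    let tweaked = vec![10.75f32, -0.0, 1.25];
    assert!(scalar == tweaked);
    assert_eq!(first_bit_mismatch(&scalar, &tweaked), Some(1));
    println!("bit-level comparison is stricter than float equality");
}
```

Comparing via `to_bits()` rather than `==` means sign-of-zero and NaN-payload differences are also caught, which is what "byte-identical" requires.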

Three hypothesis falsifications this session

| Hypothesis | Status | Test |
|---|---|---|
| §28 parallel-reduction non-determinism | FALSIFIED | FALSIFY-FFN-GGUF-005 (M91) |
| H2a' SIMD-vs-scalar dot reduction | FALSIFIED | FALSIFY-FFN-GGUF-006 (M92) |
| H2d.2 APR-internal Q4K dequant byte-identity | FALSIFIED | FALSIFY-FFN-GGUF-007 (this PR) |

Remaining viable hypotheses (post-three-falsification)

  • H2d.1: per-block dequant boundaries differ between APR whole-row reduction and GGUF super-block Q4K-byte-by-byte fused reduction
  • H2d.3: Q8K activation quantization in GGUF's path (a step APR doesn't have at all)
  • H2d.4 (NEW): the FUSED matvec's INLINE Q4K dequant may produce different bits than the STANDALONE dequant routines
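Why reduction boundaries (H2d.1) could matter at all: f32 addition is not associative, so summing a row in one pass and summing it per block before combining the partials can yield different bits even when every input is identical. The sketch below is a toy illustration under that assumption only; `whole_row_sum` and `blocked_sum` are hypothetical names, and the block size of 2 stands in for the 256-element super-block.

```rust
/// Sequential left-to-right sum over the whole row.
fn whole_row_sum(row: &[f32]) -> f32 {
    row.iter().fold(0.0f32, |acc, &x| acc + x)
}

/// Sum each block on its own, then add the per-block partial sums.
fn blocked_sum(row: &[f32], block: usize) -> f32 {
    row.chunks(block)
        .map(whole_row_sum)
        .fold(0.0f32, |acc, partial| acc + partial)
}

fn main() {
    // At magnitude 1e8 the f32 spacing is 8, so adding 1.0 is rounded away.
    let row = [1.0e8f32, 1.0, -1.0e8, 1.0];

    let whole = whole_row_sum(&row); // ((1e8 + 1) + -1e8) + 1
    let blocked = blocked_sum(&row, 2); // (1e8 + 1) + (-1e8 + 1)

    println!("whole-row = {whole}, blocked = {blocked}");
    assert_eq!(whole, 1.0);
    assert_eq!(blocked, 0.0);
    assert_ne!(whole.to_bits(), blocked.to_bits());
}
```

The inputs here are deliberately extreme; on real weights the divergence would typically show up only in the low mantissa bits, which is why a bit-level comparison is needed to detect it.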

Contract amendment (v1.3.0 → v1.4.0)

| Field | Before | After |
|---|---|---|
| version | 1.3.0 | 1.4.0 |
| FALSIFY-FFN-GGUF-007 | NEW | DISCHARGED |
| Step (c) hypothesis space | {H2d.1, H2d.2, H2d.3} | {H2d.1, H2d.3, H2d.4} |

Test plan

  • pv validate 0/0
  • Test passes on first run
  • No production code touched (additive test-only file)
  • H2d.2 empirically falsified

🤖 Generated with Claude Code

noahgift enabled auto-merge (squash) May 6, 2026 21:53
…ntity test — H2d.2 ALSO FALSIFIED

THIRD hypothesis falsification in one session:
- §28 parallel-reduction non-determinism (M91): FALSIFIED
- H2a' SIMD-vs-scalar dot reduction (M92): FALSIFIED
- H2d.2 APR-internal Q4K dequant byte-identity (this PR): FALSIFIED

Authored falsify_ffn_gguf_007_q4k_scalar_vs_simd_dequant_byte_identity
in crates/aprender-serve/tests/ffn_gguf_007_q4k_dequant_byte_identity.rs.

Test runs realizar::quantize::dequantize_q4_k (scalar) and
realizar::quantize::dequantize_q4_k_simd (AVX2 if available) on a
synthetic 144-byte Q4K super-block and compares the resulting Vec<f32>
bit-by-bit via f32::to_bits().
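For orientation, a scalar dequant of one 144-byte super-block can be sketched as below. This assumes the layout matches ggml's Q4_K block (f16 `d`, f16 `dmin`, 12 bytes of packed 6-bit scales and mins, 128 bytes of 4-bit quants); it is not the realizar code, and every function name plus the contents of `synthetic_block` are hypothetical, so the values differ from those reported in this PR.

```rust
const QK_K: usize = 256; // weights per super-block
const BLOCK_BYTES: usize = 2 + 2 + 12 + QK_K / 2; // d + dmin + scales + quants = 144

/// Hypothetical helper: IEEE 754 half-precision bits to f32.
fn f16_to_f32(h: u16) -> f32 {
    let sign = ((h >> 15) as u32) << 31;
    let exp = ((h >> 10) & 0x1f) as u32;
    let frac = (h & 0x3ff) as u32;
    match exp {
        0 => {
            // Zero or subnormal: magnitude is frac * 2^-24.
            let mag = frac as f32 / 16_777_216.0;
            if sign != 0 { -mag } else { mag }
        }
        0x1f => f32::from_bits(sign | 0x7f80_0000 | (frac << 13)), // inf or NaN
        _ => f32::from_bits(sign | ((exp + 112) << 23) | (frac << 13)),
    }
}

/// Unpack the 6-bit (scale, min) pair for sub-block `j` (0..8).
fn scale_min(j: usize, s: &[u8]) -> (u8, u8) {
    if j < 4 {
        (s[j] & 63, s[j + 4] & 63)
    } else {
        (
            (s[j + 4] & 0x0f) | ((s[j - 4] >> 6) << 4),
            (s[j + 4] >> 4) | ((s[j] >> 6) << 4),
        )
    }
}

fn dequantize_q4k_block(block: &[u8; BLOCK_BYTES]) -> Vec<f32> {
    let d = f16_to_f32(u16::from_le_bytes([block[0], block[1]]));
    let dmin = f16_to_f32(u16::from_le_bytes([block[2], block[3]]));
    let (scales, qs) = (&block[4..16], &block[16..]);
    let mut out = Vec::with_capacity(QK_K);
    // Each 32-byte chunk holds two 32-weight sub-blocks: low nibbles, then high nibbles.
    for chunk in 0..QK_K / 64 {
        let q = &qs[chunk * 32..chunk * 32 + 32];
        let (sc1, m1) = scale_min(2 * chunk, scales);
        let (sc2, m2) = scale_min(2 * chunk + 1, scales);
        let (d1, min1) = (d * sc1 as f32, dmin * m1 as f32);
        let (d2, min2) = (d * sc2 as f32, dmin * m2 as f32);
        out.extend(q.iter().map(|&b| d1 * (b & 0x0f) as f32 - min1));
        out.extend(q.iter().map(|&b| d2 * (b >> 4) as f32 - min2));
    }
    out
}

/// Illustrative input: d = 1.0, dmin = 0.5, every scale 2, every min 1.
fn synthetic_block() -> [u8; BLOCK_BYTES] {
    let mut block = [0u8; BLOCK_BYTES];
    block[0..2].copy_from_slice(&0x3c00u16.to_le_bytes()); // d = 1.0
    block[2..4].copy_from_slice(&0x3800u16.to_le_bytes()); // dmin = 0.5
    block[4..8].fill(2); // scales, sub-blocks 0..3
    block[8..12].fill(1); // mins, sub-blocks 0..3
    block[12..16].fill(0x12); // sub-blocks 4..7: low nibble scale 2, high nibble min 1
    block[16..].fill(0x53); // low nibble 3, high nibble 5
    block
}

fn main() {
    let out = dequantize_q4k_block(&synthetic_block());
    // Low-nibble sub-blocks: 1.0*2*3 - 0.5*1 = 5.5; high-nibble: 1.0*2*5 - 0.5*1 = 9.5.
    println!("len={} first={} last={}", out.len(), out[0], out[255]);
}
```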

EMPIRICAL RESULT: both paths produce BYTE-IDENTICAL output across all
256 elements. element[0]=10.75 (0x412c0000); element[255]=1.25
(0x3fa00000). Asserted as regression-test invariant.

This FALSIFIES H2d.2 at the APR-internal dequant level. APR's two own
Q4K dequant paths agree byte-for-byte on the same input.

Remaining viable hypotheses (post-three-falsification):
- H2d.1: per-block dequant boundaries differ between APR whole-row
  reduction and GGUF super-block Q4K-byte-by-byte fused reduction
- H2d.3: Q8K activation quantization in GGUF's path (a step APR
  doesn't have at all)
- H2d.4 (NEW): the FUSED matvec's INLINE Q4K dequant may produce
  different bits than the STANDALONE dequant routines
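To see why H2d.3 remains plausible, note that quantizing activations to 8-bit integers is lossy, so a path that includes this step cannot in general reproduce a path that keeps f32 activations. The sketch below is a simplified per-block absmax int8 scheme, not GGUF's actual Q8K routine; `quantize_i8` and `dequantize_i8` are hypothetical names.

```rust
/// Simplified symmetric int8 quantization of one activation block.
/// Returns the per-block scale and the quantized values.
fn quantize_i8(x: &[f32]) -> (f32, Vec<i8>) {
    let amax = x.iter().fold(0.0f32, |m, &v| m.max(v.abs()));
    if amax == 0.0 {
        return (0.0, vec![0; x.len()]);
    }
    let inv_scale = 127.0 / amax;
    let q: Vec<i8> = x
        .iter()
        .map(|&v| (v * inv_scale).round().clamp(-127.0, 127.0) as i8)
        .collect();
    (amax / 127.0, q)
}

/// Map the quantized values back to f32.
fn dequantize_i8(scale: f32, q: &[i8]) -> Vec<f32> {
    q.iter().map(|&v| scale * v as f32).collect()
}

fn main() {
    let x = [1.0f32, -0.3, 0.26, 0.0];
    let (scale, q) = quantize_i8(&x);
    let back = dequantize_i8(scale, &q);
    println!("q = {q:?}, round trip = {back:?}");

    // The round trip does not reproduce the original bits for most inputs.
    assert_eq!(q, vec![127i8, -38, 33, 0]);
    assert_ne!(back[1].to_bits(), x[1].to_bits());
}
```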

Contract amendment: trace-ffn-sub-block-gguf-v1 v1.3.0 → v1.4.0.

Status promotions:
- FALSIFY-FFN-GGUF-007: NEW → DISCHARGED (test passes on first run)
- M-FFN-GGUF-4 step (c) hypothesis space: {H2d.1,2,3} → {H2d.1,3,4}

Production hot paths byte-unchanged.

`pv validate` 0/0; standalone test passes; H2d.2 falsified.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
noahgift force-pushed the feat/m-ffn-gguf-4c-dequant-byte-identity branch from 8f908be to 350cab9 May 6, 2026 21:58
noahgift merged commit ee8c652 into main May 6, 2026
10 checks passed
noahgift deleted the feat/m-ffn-gguf-4c-dequant-byte-identity branch May 6, 2026 22:21
noahgift added a commit that referenced this pull request May 7, 2026
…s decompose §27 1723% within rounding — fix scope EMPIRICALLY VALIDATED — spec v3.03.0 → v3.04.0 (#1546)

Two-day autonomous /loop session shipped 11 lib-test + 1 integration-test
falsifiers (M91-M101, aprender PRs #1535/#1536/#1537/#1538/#1540/#1541/
#1542/#1543/#1544/#1545) decomposing the §27 layer-3 ffn_swigl 18.23×
APR-vs-GGUF std-ratio.

Final empirical decomposition (2026-05-07):

  M94 mechanism × M95 compounding × M99 std-ratio × A5 real-teacher × residual
  = 0.077% × 5.70× × 50× × 5.56× × 14×
  ≈ 1715%   ≈   §27's 1723% (within rounding)

Six synthetic amplifier candidates resolved:
- A1 (RoPE phase, M98)        — FALSIFIED 1.00× UNITARY
- A2 (Softmax saturation, M97) — FALSIFIED 0.01× COMPRESSES
- A3 (Block-scale variance, M96) — FALSIFIED 1.00× SCALE-INVARIANT
- A4 (Multi-token batch, M99) — FALSIFIED 0.26× per-token + 50× std-ratio
- A5 (Real-weight non-uniformity, M100) — PARTIALLY CONFIRMED 5.56× LIVE
- A6 (RMSNorm rsqrt, M101)    — FALSIFIED 1.00× HOMOGENEOUS

14× residual is now attributed entirely to cumulative-layer interaction.

SHIP-007 §22 fix scope EMPIRICALLY VALIDATED as Option-A (PROMOTE
GGUF-PATH semantics into APR forward): switching APR's `f32_matmul`
to Q8K activation quant + fused matvec semantics will recover the
5.56× per-matvec amplification on every matmul, eliminating cumulative
APR-vs-GGUF drift. Estimated fix scope ~250-400 LOC; transitively
discharges 5 MODEL-1 PARTIALs (SHIP-002, SHIP-005, SHIP-006, SHIP-007,
SHIP-008) per §17.5.

Cascade methodology consolidated:
- ~/.claude/projects/-home-noah-src-aprender/memory/feedback_falsifier_cascade_decomposes_magnitude.md
- ~/.claude/projects/-home-noah-src-aprender/memory/feedback_falsifier_chain_assert_difference.md

Companion-spec entries M91-M101 in claude-code-parity-apr/docs/
specifications/claude-code-parity-apr-poc.md provide the full per-PR
narrative. Aprender contract `contracts/trace-ffn-sub-block-gguf-v1.yaml`
v1.0.0 → v1.12.0 across 12 amendments.

MODEL-1 ship %: unchanged at 91% until M-FFN-GGUF-5 (actual fix PR) lands.
MODEL-2 ship %: unchanged at 57% until step 5g.3 produces val_loss < 9.38.

Spec v3.03.0 → v3.04.0. Atomic next action banner only — full §59
narrative deferred to deliberate-session work alongside M-FFN-GGUF-5
fix PR.

Refs PMAT-CCPA, SHIP-007 §22, M91-M101 cascade.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>