feat(M-FFN-GGUF-4 step c, H2d.2): Q4K dequant byte-identity test — H2d.2 ALSO FALSIFIED #1537
Merged
…ntity test — H2d.2 ALSO FALSIFIED
THIRD hypothesis falsification in one session:
- §28 parallel-reduction non-determinism (M91): FALSIFIED
- H2a' SIMD-vs-scalar dot reduction (M92): FALSIFIED
- H2d.2 APR-internal Q4K dequant byte-identity (this PR): FALSIFIED
Authored falsify_ffn_gguf_007_q4k_scalar_vs_simd_dequant_byte_identity
in crates/aprender-serve/tests/ffn_gguf_007_q4k_dequant_byte_identity.rs.
Test runs realizar::quantize::dequantize_q4_k (scalar) and
realizar::quantize::dequantize_q4_k_simd (AVX2 if available) on a
synthetic 144-byte Q4K super-block and compares the resulting Vec<f32>
bit-by-bit via f32::to_bits().
EMPIRICAL RESULT: both paths produce BYTE-IDENTICAL output across all
256 elements. element[0]=10.75 (0x412c0000); element[255]=1.25
(0x3fa00000). Asserted as regression-test invariant.
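The bit-by-bit comparison described above can be sketched as follows. This is a minimal illustration, not the test in the PR: `first_bit_mismatch` is a hypothetical helper, and the vectors in `main` are stand-ins for the outputs of the two dequant paths, whose real signatures are not shown here. Only the two hex patterns quoted in the commit message are taken from the source.

```rust
/// Hypothetical helper: index of the first element whose IEEE-754 bit
/// pattern differs, or `None` when both slices are bit-identical.
/// A length mismatch is reported at the end of the shorter slice.
fn first_bit_mismatch(a: &[f32], b: &[f32]) -> Option<usize> {
    if a.len() != b.len() {
        return Some(a.len().min(b.len()));
    }
    a.iter()
        .zip(b)
        .position(|(x, y)| x.to_bits() != y.to_bits())
}

fn main() {
    // Stand-ins for the scalar and SIMD dequant outputs (assumption:
    // the real test feeds both paths the same 144-byte super-block).
    let scalar = vec![10.75_f32, 0.0, 1.25];
    let simd = scalar.clone();
    assert_eq!(first_bit_mismatch(&scalar, &simd), None);

    // The two bit patterns quoted in the commit message.
    assert_eq!(10.75_f32.to_bits(), 0x412c_0000);
    assert_eq!(1.25_f32.to_bits(), 0x3fa0_0000);

    // Why compare bits rather than use `==`: 0.0 and -0.0 compare equal
    // numerically but have different bit patterns.
    let signed_zero = vec![10.75_f32, -0.0, 1.25];
    assert_eq!(first_bit_mismatch(&scalar, &signed_zero), Some(1));
}
```

Comparing `to_bits()` values is stricter than `==` on `f32`: it distinguishes signed zeros and treats identical NaN payloads as equal, which is what a byte-identity claim requires.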
This FALSIFIES H2d.2 at the APR-internal dequant level. APR's two own
Q4K dequant paths agree byte-for-byte on the same input.
Remaining viable hypotheses (post-three-falsification):
- H2d.1: per-block dequant boundaries differ between APR whole-row
reduction and GGUF super-block Q4K-byte-by-byte fused reduction
- H2d.3: Q8K activation quantization in GGUF's path (a step APR
doesn't have at all)
- H2d.4 (NEW): the FUSED matvec's INLINE Q4K dequant may produce
different bits than the STANDALONE dequant routines
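To illustrate why a hypothesis like H2d.1 remains plausible even when the dequantized values are bit-identical, the sketch below shows that summing the same f32 values with different block boundaries can give different results. It is a generic floating-point example under stated assumptions: `sum_whole_row` and `sum_per_block` are hypothetical names, and neither reproduces the APR or GGUF reduction code.

```rust
/// Hypothetical whole-row reduction: one left-to-right accumulator.
fn sum_whole_row(xs: &[f32]) -> f32 {
    xs.iter().fold(0.0_f32, |acc, &x| acc + x)
}

/// Hypothetical per-block reduction: sum each block separately,
/// then add the partial sums left to right.
fn sum_per_block(xs: &[f32], block: usize) -> f32 {
    xs.chunks(block)
        .map(sum_whole_row)
        .fold(0.0_f32, |acc, partial| acc + partial)
}

fn main() {
    // 1.0e8 is exactly representable in f32, but the spacing between
    // neighbouring f32 values at that magnitude is 8, so adding 1.0
    // to it is lost to rounding.
    let row = [1.0e8_f32, 1.0, -1.0e8, 1.0];

    let whole = sum_whole_row(&row); // ((1e8 + 1) + -1e8) + 1
    let blocked = sum_per_block(&row, 2); // (1e8 + 1) + (-1e8 + 1)

    assert_eq!(whole, 1.0);
    assert_eq!(blocked, 0.0);
    assert_ne!(whole.to_bits(), blocked.to_bits());
}
```

The inputs are identical in both calls; only the grouping of the additions changes. That is the kind of effect a dequant byte-identity test cannot rule out, since it checks the values before any reduction takes place.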
Contract amendment: trace-ffn-sub-block-gguf-v1 v1.3.0 → v1.4.0.
Status promotions:
- FALSIFY-FFN-GGUF-007: NEW → DISCHARGED (test passes on first run)
- M-FFN-GGUF-4 step (c) hypothesis space: {H2d.1,2,3} → {H2d.1,3,4}
Production hot paths byte-unchanged.
`pv validate` 0/0; standalone test passes; H2d.2 falsified.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
8f908be to 350cab9
noahgift added a commit that referenced this pull request on May 7, 2026
…s decompose §27 1723% within rounding — fix scope EMPIRICALLY VALIDATED — spec v3.03.0 → v3.04.0 (#1546)

Two-day autonomous /loop session shipped 11 lib-test + 1 integration-test
falsifiers (M91-M101, aprender PRs #1535/#1536/#1537/#1538/#1540/#1541/
#1542/#1543/#1544/#1545) decomposing the §27 layer-3 ffn_swigl 18.23×
APR-vs-GGUF std-ratio.

Final empirical decomposition (2026-05-07):
M94 mechanism × M95 compounding × M99 std-ratio × A5 real-teacher × residual
= 0.077% × 5.70× × 50× × 5.56× × 14×
≈ 1715% ≈ §27's 1723% (within rounding)

Six synthetic amplifier candidates resolved:
- A1 (RoPE phase, M98) — FALSIFIED 1.00× UNITARY
- A2 (Softmax saturation, M97) — FALSIFIED 0.01× COMPRESSES
- A3 (Block-scale variance, M96) — FALSIFIED 1.00× SCALE-INVARIANT
- A4 (Multi-token batch, M99) — FALSIFIED 0.26× per-token + 50× std-ratio
- A5 (Real-weight non-uniformity, M100) — PARTIALLY CONFIRMED 5.56× LIVE
- A6 (RMSNorm rsqrt, M101) — FALSIFIED 1.00× HOMOGENEOUS

14× residual is now attributed entirely to cumulative-layer interaction.

SHIP-007 §22 fix scope EMPIRICALLY VALIDATED as Option-A (PROMOTE GGUF-PATH
semantics into APR forward): switching APR's `f32_matmul` to Q8K activation
quant + fused matvec semantics will recover the 5.56× per-matvec
amplification on every matmul, eliminating cumulative APR-vs-GGUF drift.
Estimated fix scope ~250-400 LOC; transitively discharges 5 MODEL-1 PARTIALs
(SHIP-002, SHIP-005, SHIP-006, SHIP-007, SHIP-008) per §17.5.

Cascade methodology consolidated:
- ~/.claude/projects/-home-noah-src-aprender/memory/feedback_falsifier_cascade_decomposes_magnitude.md
- ~/.claude/projects/-home-noah-src-aprender/memory/feedback_falsifier_chain_assert_difference.md

Companion-spec entries M91-M101 in
claude-code-parity-apr/docs/specifications/claude-code-parity-apr-poc.md
provide the full per-PR narrative. Aprender contract
`contracts/trace-ffn-sub-block-gguf-v1.yaml` v1.0.0 → v1.12.0 across 12
amendments.

MODEL-1 ship %: unchanged at 91% until M-FFN-GGUF-5 (actual fix PR) lands.
MODEL-2 ship %: unchanged at 57% until step 5g.3 produces val_loss < 9.38.

Spec v3.03.0 → v3.04.0. Atomic next action banner only — full §59 narrative
deferred to deliberate-session work alongside M-FFN-GGUF-5 fix PR.

Refs PMAT-CCPA, SHIP-007 §22, M91-M101 cascade.
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Summary
THIRD hypothesis falsification in one session. Authors `falsify_ffn_gguf_007_q4k_scalar_vs_simd_dequant_byte_identity`, testing that APR's scalar `dequantize_q4_k` and SIMD `dequantize_q4_k_simd` produce byte-identical Vec<f32> for the same Q4K super-block bytes.
Empirical result: both paths produce byte-identical output across all 256 elements (element[0]=10.75=0x412c0000, element[255]=1.25=0x3fa00000). H2d.2 hypothesis FALSIFIED at the APR-internal dequant level.
Three hypothesis falsifications this session
Remaining viable hypotheses (post-three-falsification)
Contract amendment (v1.3.0 → v1.4.0)
Test plan
`pv validate` 0/0
🤖 Generated with Claude Code