contract(tensor-names-v1): v1.0.0 → v1.1.0 — qwen3_moe coverage + F-TNV-002 falsifier by noahgift · Pull Request #1103 · paiml/aprender

noahgift · 2026-04-28T09:31:50Z

Five-whys analysis

Symptom: apr code -p '<prompt>' against Qwen3-Coder-30B-A3B-Instruct-Q4_K_M.gguf (17.3 GB) fails with:

Invalid shape: Tensor 'blk.0.ffn_up.weight' not found

| Why | Answer |
| --- | --- |
| 1 | `tensor_names_fallback.rs:368` hardcodes the dense-FFN GGUF name `blk.{n}.ffn_up.weight` for the FfnUpWeight role, regardless of architecture. |
| 2 | The HuggingFace-side tensor naming branches on architecture, but the GGUF-side `_fallback` is a single architecture-agnostic string per role. |
| 3 | qwen3_moe stores per-expert weights as 3D tensors with different llama.cpp names (`ffn_gate_exps`, `ffn_up_exps`, `ffn_down_exps`) plus a router (`ffn_gate_inp`). None of these are `ffn_up`. |
| 4 | No falsification test asserted "for every architecture A, every required role R has a template that resolves against a representative .gguf for A". `pv validate` was passing a silently-incomplete contract. |
| 5 (root) | The contract treated GGUF tensor naming as a flat fallback, not as an architecture-aware namespace. New architectures landed as code patches without paired contract gates. |
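The difference between a flat fallback and an architecture-aware lookup can be sketched as follows. This is an illustration only: `flat_name`, `arch_aware_name`, and `lookup` are hypothetical helpers, not functions from `tensor_names_fallback.rs`; only the tensor names and the error text come from the analysis above.

```rust
/// Flat fallback: one GGUF template per role, whatever the architecture.
/// (Hypothetical helper mirroring the behaviour described in Why 1.)
pub fn flat_name(layer: usize) -> String {
    "blk.{n}.ffn_up.weight".replace("{n}", &layer.to_string())
}

/// Architecture-aware lookup: the template is chosen per architecture key.
/// (Hypothetical helper; only the tensor names come from the PR text.)
pub fn arch_aware_name(arch: &str, layer: usize) -> String {
    let template = match arch {
        "qwen3_moe" => "blk.{n}.ffn_up_exps.weight",
        _ => "blk.{n}.ffn_up.weight",
    };
    template.replace("{n}", &layer.to_string())
}

/// Looks a tensor name up in an inventory, failing with the same wording
/// as the reported symptom.
pub fn lookup<'a>(inventory: &'a [&'a str], name: &str) -> Result<&'a str, String> {
    inventory
        .iter()
        .copied()
        .find(|t| *t == name)
        .ok_or_else(|| format!("Tensor '{}' not found", name))
}
```

Against an inventory that holds only the MoE names, `lookup(&inv, &flat_name(0))` fails with the reported message, while `lookup(&inv, &arch_aware_name("qwen3_moe", 0))` succeeds.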

Fix (contract-first, per realizar's CLAUDE.md "NEVER write code before writing a provable contract")

contracts/tensor-names-v1.yaml v1.0.0 → v1.1.0

metadata.five_whys_qwen3_moe_gap — full transcript embedded for audit
architecture_map: 6 new entries → qwen3_moe (Qwen3MoeForCausalLM, Qwen3MoEForCausalLM, Qwen3CoderForCausalLM, Qwen3_5MoeForCausalLM, qwen3_moe, qwen3moe)
layer_roles.ffn_gate_weight / ffn_up_weight / ffn_down_weight: added required_per_arch: { qwen3_moe: false } and templates.qwen3_moe: [] — dense-FFN expectations no longer fire on MoE
4 new layer roles for the MoE namespace:
- ffn_gate_inp_weight — router projection (hidden → experts)
- ffn_gate_exps_weight — per-expert gate (3D)
- ffn_up_exps_weight — per-expert up (3D)
- ffn_down_exps_weight — per-expert down (3D)
- Each carries templates.qwen3_moe + _fallback matching llama.cpp's actual GGUF names
New falsification_tests.F-TNV-002 predicting templates[qwen3_moe] resolves byte-for-byte against a real qwen3_moe.gguf header
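A minimal sketch of how `required_per_arch`, `templates`, and `_fallback` could interact, and of the property F-TNV-002 predicts. The `LayerRole` struct, the `ROLES` table, and `unresolved_roles` are assumptions made for illustration; the real contract is YAML and the generated Rust may look quite different. Only two of the roles are modelled here.

```rust
/// Hypothetical in-memory view of one `layer_roles` entry.
pub struct LayerRole {
    pub name: &'static str,
    pub required_default: bool,
    /// Per-architecture overrides, e.g. ("qwen3_moe", false).
    pub required_per_arch: &'static [(&'static str, bool)],
    /// Per-architecture GGUF templates; an empty list means "no tensor expected".
    pub templates: &'static [(&'static str, &'static [&'static str])],
    /// Template used when the architecture has no entry of its own.
    pub fallback: &'static str,
}

impl LayerRole {
    pub fn is_required(&self, arch: &str) -> bool {
        self.required_per_arch
            .iter()
            .find(|(a, _)| *a == arch)
            .map(|(_, required)| *required)
            .unwrap_or(self.required_default)
    }

    pub fn templates_for(&self, arch: &str) -> Vec<&'static str> {
        match self.templates.iter().find(|(a, _)| *a == arch) {
            Some((_, list)) => list.to_vec(),
            None => vec![self.fallback],
        }
    }
}

/// Two illustrative roles: the dense up-projection and its MoE counterpart.
pub const ROLES: &[LayerRole] = &[
    LayerRole {
        name: "ffn_up_weight",
        required_default: true,
        required_per_arch: &[("qwen3_moe", false)],
        templates: &[("qwen3_moe", &[])],
        fallback: "blk.{n}.ffn_up.weight",
    },
    LayerRole {
        name: "ffn_up_exps_weight",
        required_default: false,
        required_per_arch: &[("qwen3_moe", true)],
        templates: &[("qwen3_moe", &["blk.{n}.ffn_up_exps.weight"])],
        fallback: "blk.{n}.ffn_up_exps.weight",
    },
];

/// Names of required roles with no template that resolves in `inventory`.
/// An empty result is what the falsifier expects for a real model file.
pub fn unresolved_roles(
    roles: &[LayerRole],
    arch: &str,
    layer: usize,
    inventory: &[&str],
) -> Vec<&'static str> {
    roles
        .iter()
        .filter(|r| r.is_required(arch))
        .filter(|r| {
            !r.templates_for(arch).iter().any(|t| {
                let name = t.replace("{n}", &layer.to_string());
                inventory.iter().any(|have| *have == name)
            })
        })
        .map(|r| r.name)
        .collect()
}
```

With this shape, a qwen3_moe inventory no longer trips the dense `ffn_up_weight` expectation, and a missing MoE tensor is reported by role name rather than surfacing later as a load error.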

crates/aprender-serve/src/tensor_names_fallback.rs

normalize_architecture extended to cover all 6 new architecture_map keys.
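The mapping this change describes can be sketched roughly as below. The function name `normalize_architecture_sketch` and its signature are assumptions (the real `normalize_architecture` signature is not shown in this PR), and `Qwen3ForCausalLM` is assumed to be the dense Qwen3 class name; the six qwen3_moe keys and the llama fallback are taken from the PR text.

```rust
/// Hypothetical stand-in for `normalize_architecture`: maps an HF class name
/// or a GGUF architecture string to the contract's architecture key.
pub fn normalize_architecture_sketch(raw: &str) -> &'static str {
    match raw {
        // The six architecture_map entries added in v1.1.0.
        "Qwen3MoeForCausalLM"
        | "Qwen3MoEForCausalLM"
        | "Qwen3CoderForCausalLM"
        | "Qwen3_5MoeForCausalLM"
        | "qwen3_moe"
        | "qwen3moe" => "qwen3_moe",
        // Dense Qwen3 must keep its own key (class name is an assumption).
        "Qwen3ForCausalLM" | "qwen3" => "qwen3",
        // Unknown architectures still fall back to llama.
        _ => "llama",
    }
}
```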

crates/aprender-serve/tests/qwen3_moe_tensor_inventory.rs (NEW)

4 F-TNV-002 falsification tests:

(a) qwen3_moe_architecture_keys_normalize_correctly — every HF class name routes to qwen3_moe
(b) dense_qwen3_unchanged_after_v1_1_0 — regression guard
(c) unknown_architecture_still_falls_back_to_llama — invariant from proof_obligations
(d) live_gguf_inventory_check_when_present — opens the real Qwen3-Coder-30B-A3B-Instruct-Q4_K_M.gguf and asserts the 4 load-bearing MoE tensor names appear byte-for-byte in the header. Skipped gracefully when the 17 GB file isn't present (so CI passes); runs locally after apr pull qwen3-coder.
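Test (d) combines two ideas: skip when the model file is absent, and search the header bytes for the tensor names. A minimal sketch under stated assumptions: `check_inventory`, `missing_names`, and the `max_bytes` cut-off are hypothetical, and the layer-0 names are used as representatives; the actual test may parse the GGUF header properly rather than scanning raw bytes.

```rust
use std::fs::File;
use std::io::Read;
use std::path::Path;

/// Layer-0 instances of the four MoE tensor names from the contract.
pub const MOE_TENSORS: [&str; 4] = [
    "blk.0.ffn_gate_inp.weight",
    "blk.0.ffn_gate_exps.weight",
    "blk.0.ffn_up_exps.weight",
    "blk.0.ffn_down_exps.weight",
];

/// Byte-level substring search (an empty needle never matches).
pub fn contains_bytes(haystack: &[u8], needle: &[u8]) -> bool {
    !needle.is_empty() && haystack.windows(needle.len()).any(|w| w == needle)
}

/// Names from `names` that do not occur anywhere in `header`.
pub fn missing_names(header: &[u8], names: &[&str]) -> Vec<String> {
    names
        .iter()
        .filter(|n| !contains_bytes(header, n.as_bytes()))
        .map(|n| n.to_string())
        .collect()
}

/// Returns Ok(None) when the file is absent (the "skip gracefully" case),
/// otherwise the list of MoE names missing from the first `max_bytes` bytes.
pub fn check_inventory(path: &Path, max_bytes: u64) -> std::io::Result<Option<Vec<String>>> {
    if !path.exists() {
        return Ok(None);
    }
    let mut buf = Vec::new();
    File::open(path)?.take(max_bytes).read_to_end(&mut buf)?;
    Ok(Some(missing_names(&buf, &MOE_TENSORS)))
}
```

In a test, `Ok(None)` would translate to an early return so CI stays green without the 17 GB file, and `Ok(Some(missing))` would be asserted empty.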

What this PR does NOT do (intentionally)

Does not implement the MoE forward pass — that's a larger workstream.
Does not regenerate tensor_names_generated.rs from YAML — build.rs does that at compile time.

This PR's job is the contract + falsifier so future MoE-implementation work composes against a declarative spec rather than reverse-engineering llama.cpp.

Verification (all green locally)

$ pv validate contracts/tensor-names-v1.yaml
  0 error(s), 0 warning(s)
  Contract is valid.

$ pv lint contracts/tensor-names-v1.yaml
  Result: PASS

$ cargo test -p aprender-serve --test qwen3_moe_tensor_inventory
  test result: ok. 4 passed; 0 failed
  (incl. live GGUF inventory check against 17.3 GB Qwen3-Coder file)

🤖 Generated with Claude Code

…NV-002 falsifier

Five-whys analysis of the Qwen3-Coder-30B-A3B-Instruct.gguf load failure:

Symptom: `apr code -p '...'` against the 17.3 GB GGUF fails with "Invalid shape: Tensor 'blk.0.ffn_up.weight' not found".

Why 1: tensor_names_fallback.rs:368 hardcodes the dense-FFN GGUF name `blk.{n}.ffn_up.weight` for the FfnUpWeight role, regardless of the model's `general.architecture` metadata.

Why 2: The HuggingFace-side tensor naming branches on architecture (per-arch templates exist for llama, qwen2, qwen3, qwen3_moe…) but the GGUF-side `_fallback` is a single architecture-agnostic string per role.

Why 3: qwen3_moe stores per-expert weights as 3D tensors with different llama.cpp names: `blk.{n}.ffn_gate_exps.weight`, `blk.{n}.ffn_up_exps.weight`, `blk.{n}.ffn_down_exps.weight`, plus a router `blk.{n}.ffn_gate_inp.weight`. None of these are `blk.{n}.ffn_up.weight`, so the lookup fails.

Why 4: No falsification test in the contract framework asserted "for every architecture A in architecture_map, every required role R has at least one template that resolves against a representative .gguf file for A". Without that, `pv validate` passes a contract whose GGUF templates are silently incomplete.

Why 5 (root cause): The contract treated "GGUF tensor naming" as a flat fallback, not as an architecture-aware namespace. Every new architecture lands as a code patch in tensor_names_fallback.rs without a paired contract gate. v1.1.0 adds qwen3_moe as a first-class architecture key with its own GGUF templates AND adds an F-TNV-002 falsification gate against a real qwen3_moe.gguf tensor inventory.

What ships:

contracts/tensor-names-v1.yaml (v1.0.0 → v1.1.0):
- metadata.version 1.0.0 → 1.1.0; added `updated: 2026-04-28`
- metadata.five_whys_qwen3_moe_gap full transcript embedded
- architecture_map: 6 new entries pointing to qwen3_moe key (Qwen3MoeForCausalLM, Qwen3MoEForCausalLM, Qwen3CoderForCausalLM, Qwen3_5MoeForCausalLM, qwen3_moe, qwen3moe)
- layer_roles.ffn_gate_weight / ffn_up_weight / ffn_down_weight: added `required_per_arch: { qwen3_moe: false }` and `templates.qwen3_moe: []` so dense-FFN expectations don't fire on MoE
- 4 NEW layer_roles for the MoE namespace:
    ffn_gate_inp_weight: router projection (hidden → experts)
    ffn_gate_exps_weight: per-expert gate (3D)
    ffn_up_exps_weight: per-expert up (3D)
    ffn_down_exps_weight: per-expert down (3D)
  Each carries arch templates for qwen3_moe + a GGUF _fallback that matches llama.cpp's actual tensor names.
- falsification_tests: new entry F-TNV-002 with the prediction "templates[qwen3_moe] for required MoE roles must resolve against a real qwen3_moe.gguf header byte-for-byte" + the cross-check command + a falsification oracle anchored to the Qwen3-Coder-30B-A3B-Instruct-Q4_K_M.gguf inventory captured by `apr inspect` on 2026-04-28.

crates/aprender-serve/src/tensor_names_fallback.rs:
- normalize_architecture: added cases for the 4 new HF class names that the contract architecture_map declares (Qwen3MoEForCausalLM uppercase MoE, Qwen3CoderForCausalLM, Qwen3_5MoeForCausalLM) plus lowercase canonical keys (qwen3_moe, qwen3moe).

crates/aprender-serve/tests/qwen3_moe_tensor_inventory.rs (NEW, ~150 LOC):
- 4 F-TNV-002 falsification tests:
    a) qwen3_moe_architecture_keys_normalize_correctly: every HF class name routes to "qwen3_moe"
    b) dense_qwen3_unchanged_after_v1_1_0: regression guard, dense Qwen3 still maps to "qwen3"
    c) unknown_architecture_still_falls_back_to_llama: invariant from contract.proof_obligations
    d) live_gguf_inventory_check_when_present: opens the real Qwen3-Coder-30B-A3B-Instruct-Q4_K_M.gguf and asserts the 4 load-bearing MoE tensor names appear at byte level in the header (skipped gracefully when the 17 GB file isn't present in ~/.apr/models/, so CI doesn't fail; runs locally after `apr pull qwen3-coder`)

What this PR does NOT do (intentionally):
- This PR does NOT implement the MoE forward pass. Adding expert routing, per-expert dispatch, and weighted aggregation is a separate workstream. v1.1.0's job is the contract + falsifier so future implementation can compose against a declarative spec rather than reverse-engineering llama.cpp.
- This PR does NOT regenerate `tensor_names_generated.rs` from the YAML. That's done by build.rs at compile time, and the F-TNV-002 falsifier in this PR works against the in-tree tensor_names_fallback.rs, which is the source of truth when the YAML isn't present at build time.

Verification (local, this PR):

$ pv validate contracts/tensor-names-v1.yaml
  0 error(s), 0 warning(s)
  Contract is valid.

$ pv lint contracts/tensor-names-v1.yaml
  Result: PASS

$ cargo test -p aprender-serve --test qwen3_moe_tensor_inventory
  test result: ok. 4 passed; 0 failed
  (incl. live GGUF inventory check)

Refs:
- Five-whys transcript embedded in contract metadata
- tensor-names-v1.yaml § falsification_tests F-TNV-002
- Hugging Face: unsloth/Qwen3-Coder-30B-A3B-Instruct-GGUF (sha256 01b5fec0b9d789c2, 17.3 GB, downloaded via `apr pull qwen3-coder`)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Companion-side bookkeeping for the M29 cross-repo fix. The technical work itself shipped at paiml/aprender#1103: a five-whys analysis of `apr code` against Qwen3-Coder-30B-A3B-Instruct.gguf failing with "Tensor 'blk.0.ffn_up.weight' not found", traced to the contract treating GGUF tensor naming as a flat fallback rather than an architecture-aware namespace.

Fix:

contracts/tensor-names-v1.yaml v1.0.0 → v1.1.0
- 6 new architecture_map entries → qwen3_moe
- dense FFN roles marked required_per_arch.qwen3_moe = false
- 4 new MoE-specific layer roles
- F-TNV-002 falsifier validated against the real 17.3 GB GGUF

crates/aprender-serve/{src,tests}/...:
- normalize_architecture extended for 6 new HF class names
- 4 new falsification tests including a live-GGUF-inventory check

Spec relevance: the M28 ccpa measure → apr code --emit-trace measurement path cannot produce a non-tautological FALSIFY-CCPA-013 discharge against tool-dispatching fixtures until apr-code can actually run a capable model. M28 + M29 are the two cleanly-separable enabling steps. Full MoE forward-pass implementation remains a separate larger workstream.

Contract bump v1.16.0 → v1.17.0 with full five-whys transcript + the cross-repo fix narrative; aprender contract-mirror at byte-identical commit 499f8b978; pin.lock refreshed via the M22 4-step ritual.

Gates (all green locally):
  pv validate / pv lint               PASS
  pmat comply check (is_compliant)    true, 0 Fail, 12 advisory Warn
  cargo test --workspace              all pass (0 new tests companion-side)
  scripts/pin-check.sh                sha256 matches
  scripts/pin-check-roundtrip.sh      byte-identical to aprender@499f8b978

Refs:
  paiml/aprender#1103 (M29 upstream contract PR)
  contracts/claude-code-parity-apr-v1.yaml § status_history (M29)

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

10 files across 4 crates had accumulated rustfmt drift on main that was failing `cargo fmt --all -- --check` in CI for any new PR.

Affected files (none touched in this PR's contract / qwen3_moe work):
  crates/aprender-core/src/format/ship_010.rs
  crates/aprender-core/src/format/v2/stamp.rs
  crates/aprender-gpu/src/kernels/backward/mod.rs
  crates/aprender-serve/src/gguf/inference/forward/traced.rs
  crates/aprender-serve/tests/qwen2_gqa_7_1_attention_parity.rs
  crates/aprender-train/src/autograd/cuda_backward/structured.rs
  crates/aprender-train/src/train/gputrain_006.rs
  crates/aprender-train/src/train/pretrain.rs
  crates/aprender-train/src/train/shard_reader.rs
  crates/aprender-train/tests/ship_two_001_const_pinning.rs

Bundled here as the minimum-friction unblock for the qwen3_moe tensor-names contract PR's CI.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…oseout (#35)

The companion-side spec markdown's milestone table stopped at M27. M28 (apr code --emit-trace + Qwen3-Coder default + qwen3-coder short-name alias) and M29 (five-whys + tensor-names-v1 v1.1.0 contract amendment + F-TNV-002 falsifier) both landed at aprender main but their narrative hadn't reached the spec.

This PR closes that gap:
- status snapshot bumped: M0–M30 all SHIPPED, contract v1.18.0
- new line on the M28+M29 cross-repo enabling chain
- sub-milestones table extended through M30:
    M28: cross-repo apr code --emit-trace + default model
    M29: qwen3_moe contract amendment v1.1.0 + F-TNV-002 falsifier (paiml/aprender#1103 merged at 15d504cfe)
    M30: this spec-table refresh (closeout)
- outstanding next-goal reframed: MoE forward-pass implementation is the only piece remaining for a measured tool-dispatch parity score. That's realizar/aprender-serve engineering, not a CCPA POC scope item. The contract namespace, falsifier, model availability, and emit-trace plumbing are all in place.

State at M30 close:
- Companion-side spec POC: complete (M0–M30 all SHIPPED)
- Aprender-side enabling chain (M28+M29): complete
- Both repos byte-identical at sha256 7b1d79db710a91786033792a68b32a3cc7396472f7f7a61413c3e87728f88752
- 13/13 falsification gates green
- Corpus complete (30/30 fixtures, 15/15 reachable)
- 100% mutation coverage workspace-wide
- Companion ↔ aprender drift guard mechanically enforced
- Contributor onramp documented (CONTRIBUTING.md)
- Cross-repo audit trail intact across status_history

Contract bump v1.17.0 → v1.18.0 with the M30 status_history entry documenting the doc closeout. Aprender mirror pushed in paired commit b7f42619d. pin.lock refreshed via the M22 4-step ritual.

Gates (all green locally):
  pv validate / pv lint               PASS
  pmat comply check (is_compliant)    true, 0 Fail, 12 advisory Warn
  cargo test --workspace              all pass (0 new tests)
  scripts/pin-check.sh                sha256 matches
  scripts/pin-check-roundtrip.sh      byte-identical to aprender@b7f42619d

Refs:
  paiml/aprender#1103 (M29 contract, merged 15d504cfe)
  contracts/claude-code-parity-apr-v1.yaml § status_history (M30)

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

…oE forward gap

M32a: first slice of the MoE forward-pass implementation chain that the companion claude-code-parity-apr POC named as the "Outstanding next-goal (in-scope, M32)" in v1.19.0 (M31 spec).

WHY THIS CONTRACT EXISTS
========================

`apr run <qwen3-coder>.gguf` currently fails with:

  Invalid shape: Tensor 'blk.0.ffn_up.weight' not found

at the FFN load step. The M29 contract amendment (tensor-names-v1 v1.1.0, #1103) declared the qwen3_moe tensor namespace but explicitly deferred the forward-pass implementation. This contract discharges that deferral with a 4-stage staged plan.

WHAT THIS PR SHIPS
==================

A KernelContract `qwen3-moe-forward-v1.yaml` (DRAFT status) that:

* Composes existing kernels: tensor-names-v1 v1.1.0 + moe-router-v1 + moe-expert-dispatch-v1 + qwen3moe-shapes-v1 + swiglu-kernel-v1 + silu-kernel-v1 + rmsnorm-kernel-v1 + rope-kernel-v1
* Names 5 acceptance criteria (AC_QW3_MOE_001 .. _005)
* Names 4 implementation stages (M32a SHIPPED, M32b/c/d PENDING)
* Names 4 falsification tests (F-QW3-MOE-FORWARD-001 REPRODUCED at commit 15d504c = end of M29; the other three are PENDING and each maps to one stage)
* Names the Qwen3-Coder-30B-A3B-Instruct shape algebra explicitly (L=48, d=2048, d_ff=6144, N_experts=128, k=8, n_heads=32, n_kv=4, vocab=151936, RoPE θ=1e7) so the contract is testable on the live cached GGUF (~/.cache/pacha/models/2b88b180a790988f.gguf, 17.3 GB)

WHAT M32b/c/d WILL SHIP (in subsequent PRs)
============================================

M32b: Architecture-aware FFN load. Branch transformer_loader.rs (line ~145) on tensor_names_fallback::normalize_architecture(...). For arch == "qwen3_moe", load the 4 contract-named tensors per layer (ffn_gate_inp/ffn_gate_exps/ffn_up_exps/ffn_down_exps) into a new MoeLayerWeights field. Forward emits structured UnsupportedOperation containing this contract's id.

M32c: Wire CPU MoE forward. The pure-Rust moe_forward_token in gpu/scheduler/moe_dispatch.rs already implements the full router + per-expert SwiGLU + weighted aggregation kernel. Populate MoeExpertWeights from M32b-loaded tensors and call it from the FFN dispatch site. After M32c, `apr run` emits tokens.

M32d: Numerical parity vs llama.cpp Q4_K (primary) + HF FP16 (secondary) per CLAUDE.md ground-truth checklist. Discharges AC_QW3_MOE_001 and AC_QW3_MOE_005. Flips this contract from DRAFT to ACTIVE_RUNTIME and unblocks companion-repo FALSIFY-CCPA-013 measured tool-dispatch parity score.

CROSS-REPO LINKS
================

This contract is the aprender-side spine of:

* paiml/claude-code-parity-apr v1.19.0 (M31 spec, 2026-04-28): "Outstanding next-goal (in-scope, M32)" was created exactly for this 4-stage plan; the user clarified at M31 that aprender and claude-code-parity-apr are the same monorepo, so this work IS in-scope companion-repo work, not "upstream realizar engineering"
* paiml/aprender contracts/tensor-names-v1.yaml v1.1.0 (M29): declared the namespace this contract operates over

VALIDATION
==========

$ pv validate contracts/qwen3-moe-forward-v1.yaml
  0 error(s), 0 warning(s)
  Contract is valid.

NO CODE CHANGE in this PR. M32a is contract-only by design; M32b is where Rust changes start. Authoring contract before code per CLAUDE.md rule 1 (CB-1400).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
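The "router + weighted aggregation" step named for M32c can be illustrated with a small top-k routing sketch. This is not the `moe_forward_token` implementation: `softmax`, `route_top_k`, and `moe_combine` are hypothetical helpers, the experts are passed in as an arbitrary closure rather than SwiGLU blocks, and renormalising the selected probabilities so they sum to 1 is an assumption about the routing convention.

```rust
/// Numerically stable softmax over the router logits.
pub fn softmax(logits: &[f32]) -> Vec<f32> {
    let max = logits.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = logits.iter().map(|x| (x - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.iter().map(|e| e / sum).collect()
}

/// Picks the k highest-probability experts and renormalises their weights
/// (renormalisation is an assumption, see the lead-in).
pub fn route_top_k(logits: &[f32], k: usize) -> Vec<(usize, f32)> {
    let probs = softmax(logits);
    let mut idx: Vec<usize> = (0..probs.len()).collect();
    idx.sort_by(|&a, &b| probs[b].total_cmp(&probs[a]));
    idx.truncate(k);
    let total: f32 = idx.iter().map(|&i| probs[i]).sum();
    idx.into_iter().map(|i| (i, probs[i] / total)).collect()
}

/// Weighted sum of the selected experts' outputs for one token.
/// `expert(e)` stands in for the per-expert FFN of expert `e`.
pub fn moe_combine(logits: &[f32], k: usize, expert: impl Fn(usize) -> Vec<f32>) -> Vec<f32> {
    let mut out: Vec<f32> = Vec::new();
    for (e, w) in route_top_k(logits, k) {
        let y = expert(e);
        if out.is_empty() {
            out = vec![0.0; y.len()];
        }
        for (o, v) in out.iter_mut().zip(y) {
            *o += w * v;
        }
    }
    out
}
```

For the shape algebra quoted above, `logits` would hold 128 router scores per token and `k` would be 8.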

noahgift force-pushed the feat/qwen3-moe-tensor-names-contract branch from 886c5bc to 222b6d6 on April 28, 2026 10:00

noahgift mentioned this pull request Apr 28, 2026

M29: record five-whys + tensor-names-v1 v1.1.0 contract amendment paiml/claude-code-parity-apr#34

Merged

noahgift force-pushed the feat/qwen3-moe-tensor-names-contract branch from 222b6d6 to c96c637 on April 28, 2026 10:05

noahgift force-pushed the feat/qwen3-moe-tensor-names-contract branch from c96c637 to 6239d8b on April 28, 2026 10:15

noahgift force-pushed the feat/qwen3-moe-tensor-names-contract branch from 6239d8b to 7727ad7 on April 28, 2026 10:37

noahgift merged commit 15d504c into main Apr 28, 2026
10 checks passed

noahgift deleted the feat/qwen3-moe-tensor-names-contract branch April 28, 2026 10:54

noahgift merged 2 commits into main from feat/qwen3-moe-tensor-names-contract