contract(qwen3-moe-forward-v1): v1.2.0 → v1.3.0 - M32d.0 parity strategy decision #1128
Merged
…egy decision

## Why

The original FALSIFY-QW3-MOE-FORWARD-004 test description references `llama-cli ... --verbose --logit-output` as the reference path for full-vector logit dump. **That flag does not exist** in stock llama.cpp v7746 (`llama-cli --help` exposes `--logit-bias` but no logit-output flag). M32d cannot proceed without picking a reference path that actually exists. This amendment records the decision so downstream M32d.1-.4 sub-slices have a stable target.

## Decision

PRIMARY (full-vector cosine > 0.99): HF transformers FP16 via uv-run Python: `uv run --with torch --with transformers scripts/generate_qwen3_moe_fp16_logits.py`. Dumps the full 151936-dim logit vector at position 0 to a JSON fixture.

SECONDARY (top-1 argmax sanity): llama.cpp via subprocess. Runs `llama-cli -m <gguf> -p ... -n 1 --top-k 1 --temp 0.0 --seed 0` and parses stdout to extract the single emitted token; asserts apr's argmax == llama.cpp's top-1.

REJECTED: patched llama.cpp full-vector logit dump (out-of-tree patch maintenance burden too high for the discharge value).

## What ships

1. `metadata.amendment_history` v1.3.0 entry recording the three reference paths considered, the decision, the rationale, and the M32d sub-slice plan (.0–.4).
2. Top-level `version` 1.2.0 → 1.3.0; status comment expanded to note the v1.3.0 strategy decision and the missing llama-cli flag.
3. `implementation_stages` M32d entry rewritten to enumerate M32d.0–M32d.4 sub-slices with concrete deliverables per slice.
4. `falsification_tests` FALSIFY-QW3-MOE-FORWARD-004 rewritten:
   - `prediction:` two-axis (cosine vs HF FP16 + argmax vs llama.cpp)
   - `test:` references new fixture path + subprocess invocation
   - `if_fails:` 3-step diagnostic order keyed off which axis fails
5. `cross_repo_references[paiml/claude-code-parity-apr].current_version` bumped 1.19.0 → 1.20.0 (companion repo bumped this morning at PR #38 squash 5aff6d5).

## What's NOT in this PR

- The fixture-generation script (M32d.1, next slice).
- The parity tests themselves (M32d.2, M32d.3).
- The DRAFT → ACTIVE_RUNTIME flip (M32d.4).

## Test plan

- [x] `pv validate contracts/qwen3-moe-forward-v1.yaml`: 0 errors, 0 warnings
- [ ] CI ci/gate green
- [ ] Workspace test green

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
noahgift added a commit that referenced this pull request on May 1, 2026
M33 audit-trail bump on companion side. Records:

* #1127 (M32c.2.2.2.1.4) live regression test on aprender main
* #1128 #1129 #1130 #1131 (M32d.0/.1/.2/.3) parity scaffolding

No code change beyond this contract mirror. M22 4-step ritual: mirror push (this commit) → companion pin.lock refresh → companion spec PR.

Contract sha256 f4ea18b1acaea56ef8ef40fc857e5057e06e0627232be5b248dad6389b68e846 byte-identical with companion side.

Refs: claude-code-parity-apr-v1 § companion_repo.contract_pin
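The byte-identical claim above can be checked mechanically. The following is a minimal sketch, assuming the contract is available as a local file; the helper names `sha256_of_file` and `contract_matches_pin` are hypothetical and not part of either repository.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def contract_matches_pin(path: str, pinned_sha256: str) -> bool:
    """Compare a file's digest with a pinned hex digest (hypothetical helper).

    The pinned value is normalised for case and surrounding whitespace only.
    """
    return sha256_of_file(path) == pinned_sha256.strip().lower()
```

Running `contract_matches_pin` against both the aprender copy and the companion mirror with the digest quoted in the commit message would confirm that the two files are byte-identical.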
Summary
Records the M32d parity strategy decision in `qwen3-moe-forward-v1` amendment history. No code change; contract amendment only.
The original FALSIFY-QW3-MOE-FORWARD-004 referenced `llama-cli ... --verbose --logit-output` as the reference path for full-vector logit dump. That flag does not exist in stock llama.cpp v7746. M32d cannot proceed without picking a reference path that actually exists.
Decision
PRIMARY (full-vector cosine > 0.99 per AC_QW3_MOE_005): HuggingFace transformers FP16 via `uv run --with torch --with transformers` running `scripts/generate_qwen3_moe_fp16_logits.py`. Dumps the full 151936-dim logit vector at position 0 for prompt "What is 2+2?" to a JSON fixture.
SECONDARY (top-1 argmax sanity): `llama-cli -m <gguf> -p "..." -n 1 --top-k 1 --temp 0.0 --seed 0` as subprocess; asserts apr's argmax == llama.cpp's top-1 token.
REJECTED: patched llama.cpp full-vector logit dump (out-of-tree patch maintenance burden too high).
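For illustration, the primary gate described above (full-vector cosine > 0.99 against the HF FP16 fixture) might be sketched as follows. This is not the contract's actual test: the JSON key `"logits"` and all function names are assumptions, since the fixture schema is not specified in this PR.

```python
import json
import math
from typing import Sequence

VOCAB_SIZE = 151936        # logit vector length stated in the decision
COSINE_THRESHOLD = 0.99    # primary gate threshold stated in the decision


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError(f"length mismatch: {len(a)} vs {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def load_reference_logits(path: str, key: str = "logits") -> list:
    """Load the reference vector from a JSON fixture.

    ASSUMPTION: the fixture stores the vector under `key`; the real
    schema is defined by the fixture-generation script, not by this sketch.
    """
    with open(path, "r", encoding="utf-8") as fh:
        logits = json.load(fh)[key]
    if len(logits) != VOCAB_SIZE:
        raise ValueError(f"expected {VOCAB_SIZE} logits, got {len(logits)}")
    return logits


def passes_primary_gate(apr_logits: Sequence[float],
                        ref_logits: Sequence[float],
                        threshold: float = COSINE_THRESHOLD) -> bool:
    """True when the cosine similarity strictly exceeds the threshold."""
    return cosine_similarity(apr_logits, ref_logits) > threshold
```

Cosine similarity ignores overall scale, so this check tolerates a uniform scaling difference between the two logit vectors while still catching directional drift.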
M32d sub-slice plan (each its own PR)
- `scripts/generate_qwen3_moe_fp16_logits.py` + one-time fixture generation on lambda-vector
- `f_qw3_moe_parity_001_cosine_vs_hf_fp16` test (primary gate)
- `f_qw3_moe_parity_002_argmax_vs_llama_cpp` test (secondary sanity)
Test plan
- `pv validate contracts/qwen3-moe-forward-v1.yaml`: 0 errors, 0 warnings
🤖 Generated with Claude Code
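The secondary sanity check from the decision could be wired up roughly as below. This is a hedged sketch: the flags are the ones quoted in the PR, but the function names are hypothetical, and the stdout parsing is left to a caller-supplied `extract` function because the exact output layout of `llama-cli` is not described here.

```python
import subprocess
from typing import Callable, List


def build_llama_cli_cmd(gguf_path: str, prompt: str) -> List[str]:
    """Build the greedy, single-token llama-cli invocation quoted in the PR."""
    return [
        "llama-cli",
        "-m", gguf_path,
        "-p", prompt,
        "-n", "1",          # emit exactly one token
        "--top-k", "1",     # keep only the most likely candidate
        "--temp", "0.0",
        "--seed", "0",
    ]


def llama_top1_text(gguf_path: str,
                    prompt: str,
                    extract: Callable[[str], str],
                    run=subprocess.run) -> str:
    """Run llama-cli and return the emitted token text.

    ASSUMPTION: `extract` knows how to pull the single generated token out
    of stdout; `run` is injectable so the function can be tested without
    a llama.cpp build.
    """
    proc = run(build_llama_cli_cmd(gguf_path, prompt),
               capture_output=True, text=True, check=True)
    return extract(proc.stdout)


def argmax_matches(apr_token_text: str, llama_token_text: str) -> bool:
    """Compare decoded token text, ignoring surrounding whitespace."""
    return apr_token_text.strip() == llama_token_text.strip()
```

Comparing decoded text rather than token ids is a design choice of this sketch; it sidesteps any question of whether the two runtimes expose the same id for the emitted token.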