contract(qwen3-moe-forward-v1): v1.4.0 → v1.5.0 ACTIVE_RUNTIME — F-QW3-MOE-PARITY-001 DISCHARGED (cos 0.995384)#1597

Merged: noahgift merged 1 commit into main from m32d-active-runtime-flip-v1-5-0 on May 9, 2026

@noahgift noahgift commented May 9, 2026

Summary

Promotes qwen3-moe-forward-v1 v1.4.0 ACTIVE_ALGORITHM_LEVEL → v1.5.0 ACTIVE_RUNTIME based on the live discharge of FALSIFY-QW3-MOE-FORWARD-004 axis (a) measured 2026-05-09 on lambda-vector RTX 4090.

Closes #1584.

Live discharge evidence

cargo test --release -p aprender-serve --test qwen3_moe_parity -- \
  --include-ignored f_qw3_moe_parity_001_cosine_vs_hf_fp16

F-QW3-MOE-PARITY-001:
  elapsed       = 555.52868ms
  cos_sim       = 0.995384
  threshold     = 0.99
  apr_argmax    = 3555 (val = 22.3671)
  hf_argmax     = 3555 (" What")

test result: ok. 4 passed; 0 failed; 0 ignored
  • cos_sim 0.995384 ≥ 0.99 by margin 0.0054
  • apr_argmax = hf_argmax = 3555 (" What") — exact agreement (no near-tie hypothesis needed)
  • APR forward elapsed: 555ms on 7-token prompt "What is 2+2?"
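
The two quantities reported above (cosine similarity against the threshold, and argmax agreement) can be sketched as follows. This is a minimal illustration, not the code in the `qwen3_moe_parity` test: the function names are hypothetical, and accumulating in `f64` is an assumption made here for numerical headroom over a 151936-element vector.

```rust
/// Cosine similarity between two equal-length logit vectors, accumulated in f64.
/// Returns None for mismatched lengths, empty input, or a zero-norm vector.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f64> {
    if a.len() != b.len() || a.is_empty() {
        return None;
    }
    let (mut dot, mut norm_a, mut norm_b) = (0.0_f64, 0.0_f64, 0.0_f64);
    for (&x, &y) in a.iter().zip(b) {
        let (x, y) = (f64::from(x), f64::from(y));
        dot += x * y;
        norm_a += x * x;
        norm_b += y * y;
    }
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a.sqrt() * norm_b.sqrt()))
}

/// Index of the largest logit (the first one wins on ties); None for empty input.
pub fn argmax(logits: &[f32]) -> Option<usize> {
    let mut best: Option<(usize, f32)> = None;
    for (i, &v) in logits.iter().enumerate() {
        match best {
            // Keep the current best when the candidate is not strictly larger.
            Some((_, b)) if v <= b => {}
            _ => best = Some((i, v)),
        }
    }
    best.map(|(i, _)| i)
}
```

With the measured values, the margin is 0.995384 − 0.99 = 0.005384, which rounds to the 0.0054 quoted above. Because both argmax indices are 3555, the top-1 token matches exactly and the cosine margin is a secondary signal.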

What lands

| File | Change |
| --- | --- |
| contracts/qwen3-moe-forward-v1.yaml | metadata.version 1.4.0 → 1.5.0; top-level version + status flipped; new amendment_history entry; M32d implementation_stage PENDING → DISCHARGED |
| crates/aprender-serve/tests/fixtures/qwen3_moe_fp16_logits_pos0.json | NEW (2.06 MiB): 151936-dim FP32 logit vector at position 6 for prompt "What is 2+2?". Generated 2026-05-09 in 52s wall on lambda-vector. Committed verbatim per script docstring. |
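
The fixture stores one vocabulary-sized row: the logits at position 6, the last position of the 7-token prompt. A minimal sketch of selecting that row follows. It assumes a forward pass that returns a row-major `[seq_len, vocab_size]` buffer, which the source does not state, and `last_position_logits` is a hypothetical helper name.

```rust
/// Returns the logits for the final prompt position from a row-major
/// `[seq_len, vocab_size]` buffer, or None if the shape does not match.
/// The buffer layout is an assumption of this sketch.
pub fn last_position_logits(flat: &[f32], seq_len: usize, vocab_size: usize) -> Option<&[f32]> {
    // checked_mul guards against overflow on absurd shapes.
    if seq_len == 0 || vocab_size == 0 || flat.len() != seq_len.checked_mul(vocab_size)? {
        return None;
    }
    // Row index seq_len - 1 is position 6 for a 7-token prompt.
    let start = (seq_len - 1) * vocab_size;
    Some(&flat[start..])
}
```

For the prompt in this PR, `seq_len` would be 7 and `vocab_size` 151936, so the returned slice would have 151936 elements.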

Five-whys for "60 GB HF download was a stale claim"

(Full text in commit body)

The "operator-confirm pending ~60 GB HF download" claim in v1.4.0 / R9 / #1584 was stale by ~62 days — the FP16 weights had been on lambda-vector at /mnt/nvme-raid0/models/Qwen3-Coder-30B-A3B-Instruct/ since well before v1.4.0 was authored. Companion-repo M109 (paiml/claude-code-parity-apr#95) discovered this simply via find /mnt -name "*Qwen3-Coder-30B*" at the start of the discharge session.

Future M110-class kaizen opportunity in companion repo: detector for mechanically-checkable "pending" claims (the existing 12-assert detector doesn't query filesystem state).

F-QW3-MOE-PARITY-002 sibling status

Deferred. CPU-only llama-cli on Qwen3-Coder-30B-A3B-Instruct hung at 99.9% of a single CPU for 2 hours without producing output, even with the -ngl 999 GPU-offload flag. The suspected cause is that MoE expert dispatch is CPU-bound in llama.cpp build #7746, which would be an upstream issue. This is not load-bearing for the ACTIVE_RUNTIME flip because axis (a) directly proved apr_argmax = hf_argmax. The test stays #[ignore]-gated for regression coverage.

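Cross-repo refs

  • Companion-repo M109 milestone: paiml/claude-code-parity-apr#95
  • Companion-repo M108 ticketing: paiml/claude-code-parity-apr#94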

Test plan

  • pv validate contracts/qwen3-moe-forward-v1.yaml — 0 errors / 0 warnings
  • F-QW3-MOE-PARITY-001 cosine test — PASS at cos 0.995384
  • HF FP16 fixture committed at canonical path
  • apr forward — 555ms on 7-token prefill (no regression vs v1.4.0)
  • CI green (in flight)

🤖 Generated with Claude Code

…3-MOE-PARITY-001 DISCHARGED

Closes #1584.

## Discharge summary

Live test result on lambda-vector RTX 4090 (2026-05-09):

  cargo test --release -p aprender-serve --test qwen3_moe_parity -- \
    --include-ignored f_qw3_moe_parity_001_cosine_vs_hf_fp16

  F-QW3-MOE-PARITY-001:
    elapsed       = 555.52868ms
    cos_sim       = 0.995384
    threshold     = 0.99
    apr_argmax    = 3555 (val = 22.3671)
    hf_argmax     = 3555 (" What")

  test result: ok. 4 passed; 0 failed; 0 ignored

cos_sim 0.995384 ≥ 0.99 by margin 0.0054. apr_argmax = hf_argmax = 3555
(exact agreement, no near-tie hypothesis needed).

## Five-whys for "60 GB HF download was stale claim"

(1) v1.4.0 (2026-05-02) elevated R9 to "operator-confirm pending ~60 GB
    HF download". Issue #1584 (filed 2026-05-09 from companion repo)
    inherited the stale claim.
(2) Companion M109 (2026-05-09) discovered FP16 weights had been on
    lambda-vector at /mnt/nvme-raid0/models/Qwen3-Coder-30B-A3B-Instruct/
    (57 GB across 16 safetensors shards, Mar 8 timestamps) for ~62 days.
(3) The "operator-confirm pending X" claim aged silently because no
    detector class checks "is X-pending claim still mechanically true?"
    Companion's 12-assert detector covers M-count / gate-count /
    contract-version / fixture-count / status-anchor consistency, but
    not filesystem-state-of-pending-claim.
(4) Future M110-class kaizen on companion side: detector for
    mechanically-checkable "pending" claims.
(5) Root cause: kaizen blind-spot for filesystem-state drift. M109
    leaves M110 as a future detector extension.
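
A detector of the kind described in (3) and (4) could be sketched as below. This is an illustration only, not the companion repo's 12-assert detector or a planned M110 design: `PendingClaim` and `stale_claims` are hypothetical names, and treating "the awaited path exists" as proof of staleness is a simplifying assumption.

```rust
use std::path::PathBuf;

/// Hypothetical record of a "pending X" claim whose truth can be checked on disk.
#[derive(Debug, Clone, PartialEq)]
pub struct PendingClaim {
    /// Claim text as it appears in a contract or issue.
    pub text: String,
    /// Path whose absence the claim implicitly asserts.
    pub awaited_path: PathBuf,
}

/// Returns the claims that are no longer mechanically true: the awaited
/// artifact already exists, so "pending" is stale.
pub fn stale_claims(claims: &[PendingClaim]) -> Vec<&PendingClaim> {
    claims.iter().filter(|c| c.awaited_path.exists()).collect()
}
```

Run against a claim whose `awaited_path` is `/mnt/nvme-raid0/models/Qwen3-Coder-30B-A3B-Instruct`, a check like this would have flagged the "~60 GB HF download" claim on any host where the directory was present. It performs the same check as the `find /mnt -name "*Qwen3-Coder-30B*"` command used in the discharge session, but automatically.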

## Edits

- `contracts/qwen3-moe-forward-v1.yaml`:
  - metadata.version: 1.4.0 → 1.5.0
  - top-level version: "1.4.0" → "1.5.0"
  - status: ACTIVE_ALGORITHM_LEVEL → ACTIVE_RUNTIME (with new comment block
    citing M109 discharge evidence)
  - amendment_history: prepended v1.5.0 entry with full discharge
    evidence + reproducibility recipe + cross-repo refs
  - implementation_stages.M32d: PENDING → DISCHARGED with discharge
    evidence in description

- `crates/aprender-serve/tests/fixtures/qwen3_moe_fp16_logits_pos0.json`:
  NEW (2.06 MiB): 151936-dim FP32 logit vector for prompt "What is 2+2?"
  at position 6 (end-of-prompt). Generated 2026-05-09 in 52s wall via

      uv run --with torch --with transformers --with accelerate \
        scripts/generate_qwen3_moe_fp16_logits.py \
        --model /mnt/nvme-raid0/models/Qwen3-Coder-30B-A3B-Instruct \
        --output crates/aprender-serve/tests/fixtures/...

  Committed verbatim per script docstring "this fixture is captured
  once and committed", which makes the discharge reproducible.
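
The version and status edits listed above amount to a gated transition: the contract moves from ACTIVE_ALGORITHM_LEVEL to ACTIVE_RUNTIME only when the discharge evidence passes. The sketch below models that rule under stated assumptions. All type and function names are hypothetical, the YAML contract is not parsed, and the minor-version bump is inferred from the 1.4.0 → 1.5.0 change rather than from any documented policy.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    ActiveAlgorithmLevel,
    ActiveRuntime,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Contract {
    /// (major, minor, patch)
    pub version: (u32, u32, u32),
    pub status: Status,
}

#[derive(Debug, Clone, Copy)]
pub struct DischargeEvidence {
    pub cos_sim: f64,
    pub threshold: f64,
    pub apr_argmax: usize,
    pub hf_argmax: usize,
}

impl DischargeEvidence {
    /// Both conditions must hold; a NaN cosine fails the comparison.
    pub fn passes(&self) -> bool {
        self.cos_sim >= self.threshold && self.apr_argmax == self.hf_argmax
    }
}

/// Bumps the minor version and flips the status, or returns None when the
/// contract is not at ACTIVE_ALGORITHM_LEVEL or the evidence does not pass.
pub fn promote(contract: Contract, evidence: DischargeEvidence) -> Option<Contract> {
    if contract.status != Status::ActiveAlgorithmLevel || !evidence.passes() {
        return None;
    }
    let (major, minor, _patch) = contract.version;
    Some(Contract {
        version: (major, minor + 1, 0),
        status: Status::ActiveRuntime,
    })
}
```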

## F-QW3-MOE-PARITY-002 sibling status

Deferred — CPU-only `llama-cli` on Qwen3-Coder-30B-A3B-Instruct hung
at 99.9% single-CPU for 2 hrs without producing output even with
`-ngl 999` GPU-offload flag (suspected MoE expert dispatch is
CPU-bound in llama.cpp build #7746 — upstream issue, not aprender-side).

Not load-bearing for ACTIVE_RUNTIME flip: axis (a) directly proved
apr_argmax = hf_argmax = 3555. Test stays #[ignore]-gated for
regression coverage when llama.cpp's MoE-on-CPU performance improves.
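
One way to keep a hang like this from blocking a session for hours is to run the external binary under a deadline. The sketch below uses only the Rust standard library. It is not part of the existing test, `run_with_deadline` and `BoundedRun` are hypothetical names, and the 20 ms polling interval is an arbitrary choice.

```rust
use std::io;
use std::process::{Command, ExitStatus, Stdio};
use std::thread;
use std::time::{Duration, Instant};

/// Outcome of a bounded run: the child either exited on its own or was
/// killed when the deadline passed.
#[derive(Debug, PartialEq)]
pub enum BoundedRun {
    Exited(ExitStatus),
    TimedOut,
}

/// Spawns `cmd` with output discarded and polls it until it exits or
/// `deadline` elapses, killing and reaping the child on timeout.
pub fn run_with_deadline(cmd: &mut Command, deadline: Duration) -> io::Result<BoundedRun> {
    let mut child = cmd.stdout(Stdio::null()).stderr(Stdio::null()).spawn()?;
    let start = Instant::now();
    loop {
        if let Some(status) = child.try_wait()? {
            return Ok(BoundedRun::Exited(status));
        }
        if start.elapsed() >= deadline {
            child.kill()?;
            child.wait()?; // reap the killed process
            return Ok(BoundedRun::TimedOut);
        }
        thread::sleep(Duration::from_millis(20));
    }
}
```

A test that wraps the `llama-cli` invocation this way could report `TimedOut` as an explicit, bounded failure while remaining `#[ignore]`-gated.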

## Cross-repo refs

- Companion-repo M109 milestone: paiml/claude-code-parity-apr#95
  squash 9c2833334 (2026-05-09T15:02:33Z)
- Companion-repo M108 ticketing: paiml/claude-code-parity-apr#94
  (filed #1584 with stale "60 GB pending" claim — corrected by M109
  same day)

## Verification

- pv validate contracts/qwen3-moe-forward-v1.yaml: 0 errors / 0 warnings
- F-QW3-MOE-PARITY-001 cosine test: PASS at cos 0.995384
- HF FP16 fixture: 2.06 MiB committed at canonical path
- apr forward: 555ms on 7-token prefill (no regression vs v1.4.0)

Refs PMAT-CODE-QWEN3-MOE-PARITY-FLIP-001.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@noahgift noahgift enabled auto-merge (squash) May 9, 2026 20:51
@noahgift noahgift merged commit 3fb04ef into main May 9, 2026
11 checks passed
@noahgift noahgift deleted the m32d-active-runtime-flip-v1-5-0 branch May 9, 2026 21:19
noahgift added a commit that referenced this pull request May 10, 2026
… main + M109 discharge integrated (#1613)

## Summary

Authors `contracts/claude-code-parity-apr-v1.yaml` v1.24.0 directly on aprender main as the FIRST canonical landing of this contract. Replaces the closed PR #1078 (M0 mirror, closed 2026-05-10 due to a workspace-test failure on the rebased branch unrelated to contract content).

v1.24.0 amendments to the v1.23.0 baseline:

1. Status-prose at line 67: "Cosine vs HF FP16 ... operator-confirm pending ~60 GB HF download" → "DISCHARGED 2026-05-09 at companion-repo M109 (cos_sim 0.995384, lambda-vector RTX 4090)".
2. "What is NOT in this discharge" list item at line 808: cosine measurement now DISCHARGED, with cross-references to aprender PR #1597 squash 3fb04ef (v1.4.0 → v1.5.0 ACTIVE_RUNTIME flip).
3. Inline narrative at line 888: "~60 GB HF download" claim annotated as stale; the FP16 weights had already been on lambda-vector under /mnt/nvme-raid0/models/ for ~62 days.
4. New v1.23.0 → v1.24.0 status_history entry recording the discharge evidence.

## Why this lived as "PR-pinned canonical" until v1.24.0

The v1.23.0 contract was authored on aprender PR #1078 (M0 mirror PR, never merged to main). Companion-repo M130 identified that the contract did NOT exist on aprender main — only on PR #1078's feature branch. PR #1078 closed 2026-05-10 (companion-repo M131) due to a workspace-test failure unrelated to contract content (`agent::auto_memory::tests::root_uses_config_dir_when_env_unset` — pre-existing aprender-side flake on the rebased state).

v1.24.0 is authored fresh from aprender main, removing the "PR-pinned canonical" anomaly.

## Companion-repo follow-up

After this PR merges, the companion repo will refresh `contracts/pin.lock` with the squash commit hash + content sha256 and execute the M22 5-step ritual (4 cross-reference surface bumps + new M-row).

## Verification

- `pv validate contracts/claude-code-parity-apr-v1.yaml` → 0 errors, 0 warnings
- Contract is byte-identical to companion's v1.23.0 except for the v1.24.0 amendments listed above

No falsification gates added or modified. 13/13 gates remain green; 30/30 fixtures remain at aggregate parity 1.0000.

Refs PMAT-037, paiml/claude-code-parity-apr#117, #1597 (M109 discharge).

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Development

Successfully merging this pull request may close these issues.

qwen3-moe-forward-v1 ACTIVE_RUNTIME flip — operator-confirm cosine ≥ 0.99 vs HF FP16 reference (~60 GB download)
