Skip to content

contract(qwen3-moe-forward-gpu-v1): v1.6.0 → v1.7.0 — DRAFT → ACTIVE_ALGORITHM_LEVEL#1530

Merged
noahgift merged 1 commit into
mainfrom
contract/qwen3-moe-gpu-v1.7.0-active-algorithm-level
May 6, 2026
Merged

contract(qwen3-moe-forward-gpu-v1): v1.6.0 → v1.7.0 — DRAFT → ACTIVE_ALGORITHM_LEVEL#1530
noahgift merged 1 commit into
mainfrom
contract/qwen3-moe-gpu-v1.7.0-active-algorithm-level

Conversation

@noahgift

@noahgift noahgift commented May 6, 2026

Copy link
Copy Markdown
Contributor

Summary

Status promotion amendment after the M-GPU-MOE-1.4 step (c) cascade closure (v1.6.0 / aprender PR #1529 squash 89cb26af7).

What flips

Field Was Now
metadata.status DRAFT ACTIVE_ALGORITHM_LEVEL
metadata.status comment "Scaffold + architecture amendments + preload-bug fix" (stale) "1.x cascade DISCHARGED — wgpu (2) + throughput (3) PENDING"
M-GPU-MOE-1 implementation_stage PENDING SHIPPED (umbrella covers 1.0 → 1.4 step c)

Why ACTIVE_ALGORITHM_LEVEL not ACTIVE_RUNTIME

Mirrors CPU sibling qwen3-moe-forward-v1 cadence — ALGORITHM_LEVEL = "algorithm bound on main; finite output for canonical prompt". ACTIVE_RUNTIME flip waits on M-GPU-MOE-3 (throughput ≥150 tok/s + memory budget) per original v1.0 contract convention.

Per-AC status

Acceptance Criterion Status Notes
AC_GPU_MOE_001 (cosine ≥0.99 vs CPU) ALGORITHM_LEVEL_DISCHARGED ~85% layers ≥0.99; ~7-8 at 0.94-0.987 (fp accumulator order)
AC_GPU_MOE_002 (cosine ≥0.99 vs HF FP16) blocked on fixture M32d.1 operator-confirm pending
AC_GPU_MOE_003 (top-5 recovery) pending heavy re-run
AC_GPU_MOE_004 (output finiteness) DISCHARGED M85 qtype-aware dispatch fix
AC_GPU_MOE_005 (deterministic) ALGORITHM_LEVEL_DISCHARGED
AC_GPU_MOE_006 (throughput ≥150 tok/s) PENDING M-GPU-MOE-3
AC_GPU_MOE_007 (VRAM ≤95%) PENDING M-GPU-MOE-3

4/5 algorithm-bound + 1 fixture-blocked → ACTIVE_ALGORITHM_LEVEL threshold crossed.

Sub-cascade pinned in M-GPU-MOE-1 SHIPPED

What stays PENDING

  • M-GPU-MOE-2 (wgpu fallback) — blocked on trueno-gpu wgpu surface
  • M-GPU-MOE-3 (throughput) — kernel-level fp-order alignment

Test plan

  • pv validate 0/0
  • No production code touched (YAML-only)
  • Cross-references all M-GPU-MOE-1.x SHIPPED PRs by number + squash

🤖 Generated with Claude Code

…ALGORITHM_LEVEL post 1.x cascade

Status promotion amendment after the M-GPU-MOE-1.4 step (c) cascade
closure (v1.6.0 / aprender PR #1529).

What flips:
- metadata.status: DRAFT → ACTIVE_ALGORITHM_LEVEL
- M-GPU-MOE-1 implementation_stage (umbrella): PENDING → SHIPPED
  (covers full 1.x sub-cascade 1.0 → 1.4 step c)
- metadata.status comment refreshed (was stale "Scaffold +
  architecture amendments + preload-bug fix")

Why ACTIVE_ALGORITHM_LEVEL not ACTIVE_RUNTIME:
Mirrors CPU sibling qwen3-moe-forward-v1 cadence — ALGORITHM_LEVEL
= "algorithm bound on main; finite output for canonical prompt".
RUNTIME flip waits on M-GPU-MOE-3 (throughput ≥150 tok/s + memory
budget) per original v1.0 contract convention.

Per-AC status:
- AC_GPU_MOE_001 (cosine ≥0.99 vs CPU): ALGORITHM_LEVEL_DISCHARGED
- AC_GPU_MOE_002 (cosine ≥0.99 vs HF FP16): blocked on fixture
- AC_GPU_MOE_003 (top-5 token recovery): pending heavy re-run
- AC_GPU_MOE_004 (output finiteness): DISCHARGED (M85)
- AC_GPU_MOE_005 (deterministic per-token): ALGORITHM_LEVEL_DISCHARGED
- AC_GPU_MOE_006 (throughput ≥150 tok/s): PENDING M-GPU-MOE-3
- AC_GPU_MOE_007 (VRAM ≤95%): PENDING M-GPU-MOE-3

YAML-only — production hot paths byte-unchanged.

`pv validate` 0/0.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@noahgift noahgift enabled auto-merge (squash) May 6, 2026 08:51
@noahgift noahgift merged commit 65bc425 into main May 6, 2026
11 checks passed
@noahgift noahgift deleted the contract/qwen3-moe-gpu-v1.7.0-active-algorithm-level branch May 6, 2026 09:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant