M-GPU-MOE-2.x — wgpu helpers + integration + parity test for qwen3-moe-forward-gpu-v1 #1582

@noahgift

Context

After the M-GPU-MOE-1.x cascade closure (M51-M85, see paiml/claude-code-parity-apr companion-repo spec § Sub-extension 2 status as of M85-M87), the wgpu sibling path remains incomplete.

A stub was merged at M54 (#1485, squash 5a27bb892): a 3-commit bundle that includes OwnedQuantizedModelWgpu at crates/aprender-serve/src/gguf/wgpu_backend/mod.rs.

Open work

  • M-GPU-MOE-2.1: expert_swiglu_wgpu + moe_ffn_forward_layer_wgpu helpers — blocked on trueno-gpu wgpu surface authoring (QuantizeKernel + GemmKernel compute pipelines).
  • M-GPU-MOE-2.2: full forward integration mirroring forward_qwen3_moe_cuda (M51 feat(aprender-serve): forward_qwen3_moe_cuda full integration — M-GPU-MOE-1.1.2 #1477 squash dc6f94d3b).
  • M-GPU-MOE-2.3: heavy parity test (run with --include-ignored) asserting cosine similarity ≥ 0.99 against the CPU LAZY-FUSED-MATVEC path (the wgpu sibling of FALSIFY-QW3-MOE-GPU-PARITY-001).
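
For reference, the cosine ≥ 0.99 gate in M-GPU-MOE-2.3 could be checked along the lines below. This is a minimal sketch, not the authored scaffold: the names cosine_similarity, passes_parity, and PARITY_THRESHOLD are hypothetical, and it assumes both forward paths yield a flat f32 vector (for example, final logits) of equal length.

```rust
/// Hypothetical threshold constant; the value 0.99 comes from M-GPU-MOE-2.3.
pub const PARITY_THRESHOLD: f64 = 0.99;

/// Cosine similarity between two equal-length vectors, accumulated in f64
/// to limit rounding error on long logit vectors.
/// Returns None if the lengths differ, the inputs are empty,
/// or either vector has zero norm.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f64> {
    if a.len() != b.len() || a.is_empty() {
        return None;
    }
    let (mut dot, mut norm_a, mut norm_b) = (0.0f64, 0.0f64, 0.0f64);
    for (&x, &y) in a.iter().zip(b) {
        let (x, y) = (f64::from(x), f64::from(y));
        dot += x * y;
        norm_a += x * x;
        norm_b += y * y;
    }
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a.sqrt() * norm_b.sqrt()))
}

/// True only when the similarity is defined, finite, and at or above the threshold.
/// NaN or undefined similarity counts as a parity failure.
pub fn passes_parity(cpu: &[f32], gpu: &[f32]) -> bool {
    matches!(
        cosine_similarity(cpu, gpu),
        Some(c) if c.is_finite() && c >= PARITY_THRESHOLD
    )
}
```

Treating a mismatched length, a zero-norm output, or a NaN as a failure rather than a skip keeps a silently broken GPU path from passing the gate.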

Test scaffold already authored

Acceptance

Heavy --include-ignored runs PASS on:

  • Apple Silicon Metal
  • AMD Vulkan
  • Intel ARC

Cross-refs
