contract(apr-cpu-vs-gpu-output-parity-v1): v1.2.0 — FALSIFY-CPU-GPU-005 wgpu visibility + parity-gate entry by noahgift · Pull Request #1430 · paiml/aprender

noahgift · 2026-05-03T15:37:36Z

Summary

Adds FALSIFY-CPU-GPU-005 to apr-cpu-vs-gpu-output-parity-v1 at PARTIAL_ALGORITHM_LEVEL: wgpu fallback must (a) log lifecycle visibly without --verbose AND (b) run a parity gate analogous to CUDA's.
Lands (a) wgpu visibility fix immediately, symmetric to the CUDA fix in #1428 (fix(apr-cpu-vs-gpu-output-parity-v1): make CUDA fallback decision visible without --verbose (v1.0→v1.1, PROPOSED→ACTIVE)). Users will now see Backend: wgpu (Vulkan) and [GH-559] wgpu init failed: ... on stderr without --verbose.
Defers (b) wgpu cosine-similarity gate to follow-up PR (~100-150 LOC, needs a small refactor to expose wgpu single-step decode).
Bumps contract v1.1.0 → v1.2.0 (additive; no existing falsifier semantics changed).

Why this PR

Per Toyota Way + the v1.1.0 contract follow-up I flagged: when the CUDA path is rejected, fall-through hits wgpu, which today produces independent gibberish on the canonical 7B teacher. PR #1428 made CUDA's rejection visible but wgpu still serves silent garbage. This PR closes the visibility half of the loophole + binds the parity-gate half so it doesn't get lost.

Test plan

pv validate contracts/apr-cpu-vs-gpu-output-parity-v1.yaml → 0 errors
cargo build -p aprender-serve --features cuda --release → clean
cargo test -p aprender-serve --features cuda --lib cuda_fallback_log_prefix_is_contract_tagged → 1 passed (drift-prevention test from #1429, test(apr-cpu-vs-gpu-output-parity-v1): drift-prevention for CUDA fallback log tag, still green)
Live smoke after release rebuild on lambda-labs canonical 7B teacher: stderr should now include [apr-cpu-vs-gpu-output-parity-v1] CUDA path rejected, ... AND Backend: wgpu (Vulkan) without -v

Out of scope (follow-up tracked in v1.2.0 algorithm_evidence)

Implement the wgpu cosine gate at try_apr_wgpu_inference's init point — extract per-token decode body into a callable, run one CPU+wgpu BOS forward, cosine-compare logits, return None on < 0.99.
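The gate described above could look roughly like the following. This is a minimal sketch, not the planned implementation: `cosine_similarity`, `wgpu_parity_gate`, and `PARITY_COSINE_THRESHOLD` are hypothetical names, and the two logit slices stand in for the output of one CPU and one wgpu BOS forward pass, which the follow-up PR would still need to expose. Only the 0.99 threshold and the "return None on failure" behavior come from the description.

```rust
/// Threshold taken from the PR text: reject wgpu when cosine < 0.99.
const PARITY_COSINE_THRESHOLD: f32 = 0.99;

/// Cosine similarity of two logit vectors, accumulated in f64.
/// Returns None when the vectors are empty, differ in length,
/// or either one has zero norm (similarity is undefined there).
fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.is_empty() || a.len() != b.len() {
        return None;
    }
    let (mut dot, mut norm_a, mut norm_b) = (0.0f64, 0.0f64, 0.0f64);
    for (&x, &y) in a.iter().zip(b.iter()) {
        let (x, y) = (f64::from(x), f64::from(y));
        dot += x * y;
        norm_a += x * x;
        norm_b += y * y;
    }
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some((dot / (norm_a.sqrt() * norm_b.sqrt())) as f32)
}

/// Hypothetical gate: Some(cosine) when the wgpu logits agree with the
/// CPU logits, None when the wgpu path should be rejected.
/// NaN or infinite logits yield a NaN cosine and are rejected too.
fn wgpu_parity_gate(cpu_logits: &[f32], wgpu_logits: &[f32]) -> Option<f32> {
    let cos = cosine_similarity(cpu_logits, wgpu_logits)?;
    if cos.is_nan() || cos < PARITY_COSINE_THRESHOLD {
        None
    } else {
        Some(cos)
    }
}
```

Returning `Option` lets the caller treat every failure mode (shape mismatch, degenerate logits, low similarity) as the same "fall through to the next backend" signal.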

🤖 Generated with Claude Code

…CPU-GPU-005 wgpu visibility + parity-gate entry

Closes the second half of the silent-fallback loophole. After PR #1428 made CUDA rejection visible (FALSIFY-CPU-GPU-003), the wgpu fallback path still ships gibberish silently for the canonical 7B teacher because (a) its init and "Backend: wgpu (Vulkan)" logs are verbose-gated and (b) it has no parity_gate analog to CUDA's.

This PR lands the (a) visibility fix immediately and binds the (b) parity-gate at PARTIAL_ALGORITHM_LEVEL pending a follow-up implementation (~100-150 LOC, requires extracting the per-token wgpu decode loop body into a callable single-step function).

Five Whys:
1. Why ship visibility before the parity gate? Visibility is one-line, low-risk, and immediately useful: users now see "Backend: wgpu (Vulkan)" on stderr without --verbose, so they know which backend is serving their tokens after CUDA falls through.
2. Why not full gate now? wgpu's existing API doesn't expose a single-step forward; adding one means refactoring the autoregressive loop body. Doable but a bigger PR, so keep this one bounded.
3. Why bump v1.1.0 → v1.2.0 not v2.0.0? FALSIFY-CPU-GPU-005 is additive; no existing falsifier semantics changed. Minor bump per semver.

Code: gguf_gpu_generate.rs:23-32 (try_wgpu_generate) and 311-326 (try_apr_wgpu_inference) drop `if verbose { ... }` from the wgpu init/Backend log lines.

Verification:
- pv validate contracts/apr-cpu-vs-gpu-output-parity-v1.yaml → 0 errors
- cargo build -p aprender-serve --features cuda --release → clean
- cargo test -p aprender-serve --features cuda --lib cuda_fallback_log_prefix_is_contract_tagged → 1 passed (existing drift-prevention from PR #1429 still green)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
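The shape of the visibility change can be sketched as follows. This is an illustration only, not the code in gguf_gpu_generate.rs: `log_wgpu_lifecycle` is a hypothetical helper, and the `Result<&str, &str>` argument stands in for whatever the real wgpu init returns. The two message formats are the ones quoted in the PR description. The helper takes no `verbose` flag, so the lines are always written.

```rust
use std::io::{self, Write};

/// Hypothetical helper: writes the wgpu lifecycle line unconditionally.
/// `init` is Ok(graphics API name) on success or Err(reason) on failure.
fn log_wgpu_lifecycle<W: Write>(out: &mut W, init: Result<&str, &str>) -> io::Result<()> {
    match init {
        // Success: tell the user which backend is serving tokens.
        Ok(api) => writeln!(out, "Backend: wgpu ({})", api),
        // Failure: keep the [GH-559] runbook tag so the line stays greppable.
        Err(reason) => writeln!(out, "[GH-559] wgpu init failed: {}", reason),
    }
}
```

Writing to a generic `Write` rather than calling `eprintln!` directly keeps the sketch testable; a caller would pass `&mut io::stderr()` in practice.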

…vs-gpu-output-parity-v1` chain (PRs #1427-#1430) (#1431)

Canonical record of today's 4-PR session that lands a 3-layer jidoka armor at the GPU-CPU dispatch boundary, closing §40's silent-gibberish loophole *as a regression class* (separate from the underlying GPU kernel fix).

Net effect: default `apr run <model.apr>` on a SHIP-007-broken GPU build now emits the full backend-fallback chain on stderr without `--verbose`:

    [apr-cpu-vs-gpu-output-parity-v1] CUDA path rejected, attempting fallback: ...
    Backend: wgpu (Vulkan)
    ...

so users always know which backend is actually serving their tokens; the `--no-gpu` documented workaround is now self-evidently the correct path on this build.

MODEL-1 ship % nudges 80% → 87% because shipping `apr run` users with the documented workaround is now jidoka-safe.

This section explicitly does NOT claim the SHIP-007 kernel bug is fixed; that remains an open track per §40. What §41 codifies is that the failure mode is now LOUD instead of SILENT.

Five Whys + table of all 4 PRs + next-session pickup list (FALSIFY-CPU-GPU-005 part b OR MODEL-2 distill-train scaffolding) included.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

… drift-prevention tests (#1435)

Closes the contract drift between v1.2.0's prediction and the actual code: contract `apr-cpu-vs-gpu-output-parity-v1` v1.2.0 (PR #1430) said the wgpu rejection log should emit `[apr-cpu-vs-gpu-output-parity-v1] wgpu path rejected, attempting fallback: ...` symmetric to the CUDA tag, but #1430 only made the existing `[GH-559]`/`Backend:` logs unconditional; the contract-tagged wgpu rejection log itself was missing.

Mirror of the FALSIFY-CPU-GPU-003 chain (#1428 visibility + #1429 drift test) for FALSIFY-CPU-GPU-005:
- Adds `pub(crate) const WGPU_FALLBACK_LOG_PREFIX = "[apr-cpu-vs-gpu-output-parity-v1] wgpu path rejected"`
- Updates `try_apr_wgpu_inference` to emit it on `GpuDevice::new()` failure (alongside the existing `[GH-559]` runbook tag; both, not either)
- 3 new unit tests:
  * `wgpu_fallback_log_prefix_is_contract_tagged` (symmetric to the CUDA test)
  * `cuda_and_wgpu_fallback_log_prefixes_share_contract_tag` (symmetry guard: both prefixes must start with the same contract ID and end with "path rejected" so grep recipes work uniformly across backends)

Five Whys:
1. Why was this gap in v1.2.0? PR #1430 conflated "make existing wgpu log visible" (done) with "add contract-tagged rejection log" (deferred). The contract's prediction text wrote the second; the code shipped only the first.
2. Why catch it now? `--features hub` build healthy across PRs #1432-#1434 means the test surface is reliable; this is the natural follow-up.
3. Why the symmetry test? `cuda_and_wgpu_fallback_log_prefixes_share_contract_tag` locks in that BOTH backends use the same `[CONTRACT_ID] <backend> path rejected` shape. Without it a future PR could drift one but not the other and grep recipes would silently skip backends.
4. Why keep `[GH-559]` alongside the new contract tag? Runbook continuity: humans tracking that issue tag in logs over time shouldn't lose it.
5. Why no contract version bump? v1.2.0 already specifies this tag in FALSIFY-CPU-GPU-005's prediction; this PR closes the implementation gap. Bumping again would imply a contract semantic change, which isn't happening; only the code catches up to the contract.

Verified locally:
- `cargo test -p aprender-serve --features cuda --lib --release fallback_log_prefix` → 3/3 pass (cuda + wgpu + symmetry)
- `cargo fmt --all -- --check` → no diff in touched file

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
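The symmetry guard described above can be sketched as below. The `WGPU_FALLBACK_LOG_PREFIX` value is quoted from the commit message (the source declares it `pub(crate)`). Everything else is an assumption made for illustration: the constant name `CUDA_FALLBACK_LOG_PREFIX` and its value are inferred from the CUDA log line and the test name, and `CONTRACT_TAG`, `has_contract_shape`, and `wgpu_rejection_line` are hypothetical helpers, not functions known to exist in aprender-serve.

```rust
/// Contract ID tag shared by both backends (hypothetical helper constant).
const CONTRACT_TAG: &str = "[apr-cpu-vs-gpu-output-parity-v1]";

/// Assumed name and value, inferred from the CUDA rejection log line.
const CUDA_FALLBACK_LOG_PREFIX: &str =
    "[apr-cpu-vs-gpu-output-parity-v1] CUDA path rejected";

/// Value quoted from the commit message for #1435.
const WGPU_FALLBACK_LOG_PREFIX: &str =
    "[apr-cpu-vs-gpu-output-parity-v1] wgpu path rejected";

/// True when a prefix follows the `[CONTRACT_ID] <backend> path rejected`
/// shape, so one grep recipe matches every backend.
fn has_contract_shape(prefix: &str) -> bool {
    prefix.starts_with(CONTRACT_TAG) && prefix.ends_with("path rejected")
}

/// Hypothetical formatter for the full wgpu rejection line.
fn wgpu_rejection_line(reason: &str) -> String {
    format!("{}, attempting fallback: {}", WGPU_FALLBACK_LOG_PREFIX, reason)
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn cuda_and_wgpu_prefixes_share_contract_shape() {
        assert!(has_contract_shape(CUDA_FALLBACK_LOG_PREFIX));
        assert!(has_contract_shape(WGPU_FALLBACK_LOG_PREFIX));
    }
}
```

Checking both ends of the prefix is what makes the guard useful: a future edit that renames the tag or rewords "path rejected" for only one backend fails the test instead of silently breaking log searches.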

noahgift merged commit 0181abe into main May 3, 2026
11 checks passed

noahgift deleted the feat/falsify-cpu-gpu-005-wgpu-parity-gate-contract branch May 3, 2026 16:03

noahgift mentioned this pull request May 3, 2026

spec(ship-two-models-spec): v2.86.0 — §41 records apr-cpu-vs-gpu-output-parity-v1 chain (PRs #1427-#1430) #1431

Merged

3 tasks

noahgift mentioned this pull request May 3, 2026

test(apr-cpu-vs-gpu-output-parity-v1): add WGPU_FALLBACK_LOG_PREFIX + drift-prevention tests #1435

Merged

3 tasks

noahgift merged 1 commit into main from feat/falsify-cpu-gpu-005-wgpu-parity-gate-contract
