Skip to content

contract(apr-cpu-vs-gpu-output-parity-v1): v1.2.0 — FALSIFY-CPU-GPU-005 wgpu visibility + parity-gate entry#1430

Merged
noahgift merged 1 commit into
mainfrom
feat/falsify-cpu-gpu-005-wgpu-parity-gate-contract
May 3, 2026
Merged

contract(apr-cpu-vs-gpu-output-parity-v1): v1.2.0 — FALSIFY-CPU-GPU-005 wgpu visibility + parity-gate entry#1430
noahgift merged 1 commit into
mainfrom
feat/falsify-cpu-gpu-005-wgpu-parity-gate-contract

Conversation

@noahgift

@noahgift noahgift commented May 3, 2026

Copy link
Copy Markdown
Contributor

Summary

  • Adds FALSIFY-CPU-GPU-005 to apr-cpu-vs-gpu-output-parity-v1 at PARTIAL_ALGORITHM_LEVEL: wgpu fallback must (a) log lifecycle visibly without --verbose AND (b) run a parity gate analogous to CUDA's.
  • Lands (a) wgpu visibility fix immediately — symmetric to fix(apr-cpu-vs-gpu-output-parity-v1): make CUDA fallback decision visible without --verbose (v1.0→v1.1, PROPOSED→ACTIVE) #1428's CUDA fix. Users will now see Backend: wgpu (Vulkan) and [GH-559] wgpu init failed: ... on stderr without --verbose.
  • Defers (b) wgpu cosine-similarity gate to follow-up PR (~100-150 LOC, needs a small refactor to expose wgpu single-step decode).
  • Bumps contract v1.1.0 → v1.2.0 (additive; no existing falsifier semantics changed).

Why this PR

Per Toyota Way + the v1.1.0 contract follow-up I flagged: when the CUDA path is rejected, fall-through hits wgpu, which today produces independent gibberish on the canonical 7B teacher. PR #1428 made CUDA's rejection visible but wgpu still serves silent garbage. This PR closes the visibility half of the loophole + binds the parity-gate half so it doesn't get lost.

Test plan

  • pv validate contracts/apr-cpu-vs-gpu-output-parity-v1.yaml — 0 errors
  • cargo build -p aprender-serve --features cuda --release — clean
  • cargo test -p aprender-serve --features cuda --lib cuda_fallback_log_prefix_is_contract_tagged — 1 passed (drift-prevention from test(apr-cpu-vs-gpu-output-parity-v1): drift-prevention for CUDA fallback log tag #1429 still green)
  • Live smoke after release rebuild on lambda-labs canonical 7B teacher: stderr should now include [apr-cpu-vs-gpu-output-parity-v1] CUDA path rejected, ... AND Backend: wgpu (Vulkan) without -v

Out of scope (follow-up tracked in v1.2.0 algorithm_evidence)

  • Implement the wgpu cosine gate at try_apr_wgpu_inference's init point — extract per-token decode body into a callable, run one CPU+wgpu BOS forward, cosine-compare logits, return None on < 0.99.

🤖 Generated with Claude Code

…CPU-GPU-005 wgpu visibility + parity-gate entry

Closes the second half of the silent-fallback loophole. After PR #1428 made
CUDA rejection visible (FALSIFY-CPU-GPU-003), the wgpu fallback path still
ships gibberish silently for the canonical 7B teacher because (a) its init
and "Backend: wgpu (Vulkan)" logs are verbose-gated and (b) it has no
parity_gate analog to CUDA's.

This PR lands the (a) visibility fix immediately and binds the (b) parity-
gate at PARTIAL_ALGORITHM_LEVEL pending a follow-up implementation
(~100-150 LOC, requires extracting the per-token wgpu decode loop body
into a callable single-step function).

Five Whys:
1. Why ship visibility before the parity gate? Visibility is one-line, low-
   risk, and immediately useful — users now see "Backend: wgpu (Vulkan)"
   on stderr without --verbose, so they know which backend is serving
   their tokens after CUDA falls through.
2. Why not full gate now? wgpu's existing API doesn't expose a single-step
   forward; adding one means refactoring the autoregressive loop body.
   Doable but bigger PR — keep this one bounded.
3. Why bump v1.1.0 → v1.2.0 not v2.0.0? FALSIFY-CPU-GPU-005 is additive;
   no existing falsifier semantics changed. Minor bump per semver.

Code: gguf_gpu_generate.rs:23-32 (try_wgpu_generate) and 311-326
(try_apr_wgpu_inference) drop `if verbose { ... }` from the wgpu init/
Backend log lines.

Verification:
- pv validate contracts/apr-cpu-vs-gpu-output-parity-v1.yaml → 0 errors
- cargo build -p aprender-serve --features cuda --release → clean
- cargo test -p aprender-serve --features cuda --lib
    cuda_fallback_log_prefix_is_contract_tagged → 1 passed (existing
    drift-prevention from PR #1429 still green)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@noahgift noahgift merged commit 0181abe into main May 3, 2026
11 checks passed
@noahgift noahgift deleted the feat/falsify-cpu-gpu-005-wgpu-parity-gate-contract branch May 3, 2026 16:03
noahgift added a commit that referenced this pull request May 3, 2026
…vs-gpu-output-parity-v1` chain (PRs #1427-#1430) (#1431)

Canonical record of today's 4-PR session that lands a 3-layer jidoka armor
at the GPU-CPU dispatch boundary, closing §40's silent-gibberish loophole
*as a regression class* (separate from the underlying GPU kernel fix).

Net effect: default `apr run <model.apr>` on a SHIP-007-broken GPU build
now emits the full backend-fallback chain on stderr without `--verbose`:

  [apr-cpu-vs-gpu-output-parity-v1] CUDA path rejected, attempting fallback: ...
  Backend: wgpu (Vulkan)
  ...

so users always know which backend is actually serving their tokens — the
`--no-gpu` documented workaround is now self-evidently the correct path on
this build. MODEL-1 ship % nudges 80% → 87% because shipping `apr run` users
with the documented workaround is now jidoka-safe.

This section explicitly does NOT claim the SHIP-007 kernel bug is fixed —
that remains an open track per §40. What §41 codifies is that the failure
mode is now LOUD instead of SILENT.

Five Whys + table of all 4 PRs + next-session pickup list (FALSIFY-CPU-GPU-005
part b OR MODEL-2 distill-train scaffolding) included.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
noahgift added a commit that referenced this pull request May 3, 2026
… drift-prevention tests (#1435)

Closes the contract drift between v1.2.0's prediction and the actual code:
contract `apr-cpu-vs-gpu-output-parity-v1` v1.2.0 (PR #1430) said the wgpu
rejection log should emit `[apr-cpu-vs-gpu-output-parity-v1] wgpu path
rejected, attempting fallback: ...` symmetric to the CUDA tag, but #1430
only made the existing `[GH-559]`/`Backend:` logs unconditional — the
contract-tagged wgpu rejection log itself was missing.

Mirror of the FALSIFY-CPU-GPU-003 chain (#1428 visibility + #1429 drift
test) for FALSIFY-CPU-GPU-005:
- Adds `pub(crate) const WGPU_FALLBACK_LOG_PREFIX = "[apr-cpu-vs-gpu-output-parity-v1] wgpu path rejected"`
- Updates `try_apr_wgpu_inference` to emit it on `GpuDevice::new()` failure
  (alongside the existing `[GH-559]` runbook tag — both, not either)
- 3 new unit tests:
  * `wgpu_fallback_log_prefix_is_contract_tagged` (symmetric to the CUDA test)
  * `cuda_and_wgpu_fallback_log_prefixes_share_contract_tag` (symmetry guard:
    both prefixes must start with the same contract ID and end with
    "path rejected" so grep recipes work uniformly across backends)

Five Whys:
1. Why was this gap in v1.2.0? PR #1430 conflated "make existing wgpu log
   visible" (done) with "add contract-tagged rejection log" (deferred). The
   contract's prediction text wrote the second; the code shipped only the
   first.
2. Why catch it now? `--features hub` build healthy across PRs #1432-#1434
   means the test surface is reliable; this is the natural follow-up.
3. Why the symmetry test? `cuda_and_wgpu_fallback_log_prefixes_share_contract_tag`
   locks in that BOTH backends use the same `[CONTRACT_ID] <backend> path
   rejected` shape. Without it a future PR could drift one but not the
   other and grep recipes would silently skip backends.
4. Why keep `[GH-559]` alongside the new contract tag? Runbook continuity
   — humans tracking that issue tag in logs over time shouldn't lose it.
5. Why no contract version bump? v1.2.0 already specifies this tag in
   FALSIFY-CPU-GPU-005's prediction; this PR closes the implementation
   gap. Bumping again would imply a contract semantic change, which
   isn't happening — only the code catches up to the contract.

Verified locally:
- `cargo test -p aprender-serve --features cuda --lib --release fallback_log_prefix`
    → 3/3 pass (cuda + wgpu + symmetry)
- `cargo fmt --all -- --check` → no diff in touched file

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant