test(apr-cpu-vs-gpu-output-parity-v1): drift-prevention for CUDA fallback log tag#1429
Merged
Merged
Conversation
…back log tag Promotes the FALSIFY-CPU-GPU-003 jidoka eprintln tag to a `pub(crate) const` and adds a unit test asserting the prefix shape. Locks in PR #1428 against two regression classes: 1. Renaming the contract tag without bumping `apr-cpu-vs-gpu-output-parity-v1` in lockstep 2. Re-wrapping the eprintln in `if verbose { ... }` (which would re-introduce the silent-gibberish behaviour v6 fixed). Five Whys: 1. Why this test? Because the v1.1.0 contract requires "stderr grep tag visibility" but a future refactor could quietly delete or rename the tag. 2. Why a string-literal const + assert vs a full integration test? The full test would need a real CUDA GPU + a model that fails parity, which can't run in CI deterministically. A const-shape test catches the regression class at compile/test time without GPU. 3. Why two assertions (starts_with + contains)? `starts_with` locks the contract ID prefix for greppability; `contains("CUDA path rejected")` locks the human-readable backend name so users still understand the message even if the contract ID changes between major versions. Verified locally: - cargo build -p aprender-serve --features cuda --release → clean - cargo test -p aprender-serve --features cuda --lib cuda_fallback_log_prefix_is_contract_tagged → 1 passed Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
4 tasks
noahgift
added a commit
that referenced
this pull request
May 3, 2026
…CPU-GPU-005 wgpu visibility + parity-gate entry (#1430) Closes the second half of the silent-fallback loophole. After PR #1428 made CUDA rejection visible (FALSIFY-CPU-GPU-003), the wgpu fallback path still ships gibberish silently for the canonical 7B teacher because (a) its init and "Backend: wgpu (Vulkan)" logs are verbose-gated and (b) it has no parity_gate analog to CUDA's. This PR lands the (a) visibility fix immediately and binds the (b) parity- gate at PARTIAL_ALGORITHM_LEVEL pending a follow-up implementation (~100-150 LOC, requires extracting the per-token wgpu decode loop body into a callable single-step function). Five Whys: 1. Why ship visibility before the parity gate? Visibility is one-line, low- risk, and immediately useful — users now see "Backend: wgpu (Vulkan)" on stderr without --verbose, so they know which backend is serving their tokens after CUDA falls through. 2. Why not full gate now? wgpu's existing API doesn't expose a single-step forward; adding one means refactoring the autoregressive loop body. Doable but bigger PR — keep this one bounded. 3. Why bump v1.1.0 → v1.2.0 not v2.0.0? FALSIFY-CPU-GPU-005 is additive; no existing falsifier semantics changed. Minor bump per semver. Code: gguf_gpu_generate.rs:23-32 (try_wgpu_generate) and 311-326 (try_apr_wgpu_inference) drop `if verbose { ... }` from the wgpu init/ Backend log lines. Verification: - pv validate contracts/apr-cpu-vs-gpu-output-parity-v1.yaml → 0 errors - cargo build -p aprender-serve --features cuda --release → clean - cargo test -p aprender-serve --features cuda --lib cuda_fallback_log_prefix_is_contract_tagged → 1 passed (existing drift-prevention from PR #1429 still green) Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
3 tasks
noahgift
added a commit
that referenced
this pull request
May 3, 2026
… drift-prevention tests (#1435) Closes the contract drift between v1.2.0's prediction and the actual code: contract `apr-cpu-vs-gpu-output-parity-v1` v1.2.0 (PR #1430) said the wgpu rejection log should emit `[apr-cpu-vs-gpu-output-parity-v1] wgpu path rejected, attempting fallback: ...` symmetric to the CUDA tag, but #1430 only made the existing `[GH-559]`/`Backend:` logs unconditional — the contract-tagged wgpu rejection log itself was missing. Mirror of the FALSIFY-CPU-GPU-003 chain (#1428 visibility + #1429 drift test) for FALSIFY-CPU-GPU-005: - Adds `pub(crate) const WGPU_FALLBACK_LOG_PREFIX = "[apr-cpu-vs-gpu-output-parity-v1] wgpu path rejected"` - Updates `try_apr_wgpu_inference` to emit it on `GpuDevice::new()` failure (alongside the existing `[GH-559]` runbook tag — both, not either) - 3 new unit tests: * `wgpu_fallback_log_prefix_is_contract_tagged` (symmetric to the CUDA test) * `cuda_and_wgpu_fallback_log_prefixes_share_contract_tag` (symmetry guard: both prefixes must start with the same contract ID and end with "path rejected" so grep recipes work uniformly across backends) Five Whys: 1. Why was this gap in v1.2.0? PR #1430 conflated "make existing wgpu log visible" (done) with "add contract-tagged rejection log" (deferred). The contract's prediction text wrote the second; the code shipped only the first. 2. Why catch it now? `--features hub` build healthy across PRs #1432-#1434 means the test surface is reliable; this is the natural follow-up. 3. Why the symmetry test? `cuda_and_wgpu_fallback_log_prefixes_share_contract_tag` locks in that BOTH backends use the same `[CONTRACT_ID] <backend> path rejected` shape. Without it a future PR could drift one but not the other and grep recipes would silently skip backends. 4. Why keep `[GH-559]` alongside the new contract tag? Runbook continuity — humans tracking that issue tag in logs over time shouldn't lose it. 5. Why no contract version bump? v1.2.0 already specifies this tag in FALSIFY-CPU-GPU-005's prediction; this PR closes the implementation gap. Bumping again would imply a contract semantic change, which isn't happening — only the code catches up to the contract. Verified locally: - `cargo test -p aprender-serve --features cuda --lib --release fallback_log_prefix` → 3/3 pass (cuda + wgpu + symmetry) - `cargo fmt --all -- --check` → no diff in touched file Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
if verbose.pub(crate) const CUDA_FALLBACK_LOG_PREFIXingguf_gpu_generate.rs.cuda_fallback_log_prefix_is_contract_tagged(no GPU required).Why this PR
Per the v1.1.0 contract
apr-cpu-vs-gpu-output-parity-v1: drift-prevention test should grep stderr for the contract tag. A full integration test would need a real CUDA GPU + a deliberately broken model — not viable in CI. A const-shape unit test catches the regression class at compile/test time.Test plan
cargo build -p aprender-serve --features cuda --release— cleancargo test -p aprender-serve --features cuda --lib cuda_fallback_log_prefix_is_contract_tagged— 1 passed🤖 Generated with Claude Code