test(apr-cli): FALSIFY-ATTN-SUB-003 — apr diff --values per-stage-agnostic for attn_scores + attn_softmax by noahgift · Pull Request #1456 · paiml/aprender

noahgift · 2026-05-04T04:16:34Z

Summary

Cascade step 5 of 8 per docs/specifications/aprender-train/ship-two-models-spec.md §47.6 ranked-leverage order.
2 new unit tests in crates/apr-cli/src/commands/diff_05_aprt_stage.rs proving the apr diff --values APRT loader is per-stage-agnostic for the new attn_scores + attn_softmax stages introduced in PR #1451 (feat(aprender-serve): SaveTensorStage gains AttnScores + AttnSoftmax).
0 LOC change to production code (the spec predicted "likely 1 test + 0 LOC if loader is per-stage-agnostic" — empirically confirmed).

What the tests pin

| Test | What it pins |
|------|--------------|
| `falsify_attn_sub_003_new_stages_per_stage_agnostic` | Magic-byte detection + cosine + RMS + e2e diff succeed for filenames `layer_0_attn_scores.aprt` + `layer_0_attn_softmax.aprt` at realistic shape `28*7*7=1372` (Qwen2.5-7B BOS layer-0 [num_heads, seq, seq]). |
| `falsify_attn_sub_003_cosine_detects_softmax_divergence` | Mixed-perturbation cosine drops below the 0.999 floor, so the bisection chain will reliably detect divergence at the softmax stage during FALSIFY-ATTN-SUB-004 LIVE on RTX 4090. |
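To make the second row concrete, here is a minimal sketch of how a cosine check can flag a mixed perturbation at the `28*7*7=1372` shape. It is not the code in `diff_05_aprt_stage.rs`: the helper names (`cosine`, `reference_tensor`, `perturb`) and the perturbation itself (sign flip on every 5th element, 1% scale on the rest) are assumptions chosen only for illustration.

```rust
/// Cosine similarity of two equal-length f32 slices, accumulated in f64.
/// Hypothetical helper; the real statistics live in `compute_aprt_stage_stats`.
fn cosine(a: &[f32], b: &[f32]) -> f64 {
    assert_eq!(a.len(), b.len(), "tensors must have the same element count");
    let (mut dot, mut na, mut nb) = (0.0f64, 0.0f64, 0.0f64);
    for (&x, &y) in a.iter().zip(b) {
        let (x, y) = (x as f64, y as f64);
        dot += x * y;
        na += x * x;
        nb += y * y;
    }
    if na == 0.0 || nb == 0.0 {
        return 0.0; // degenerate all-zero tensor: treat as no similarity
    }
    dot / (na.sqrt() * nb.sqrt())
}

/// Deterministic stand-in for a flattened [num_heads, seq, seq] tensor.
fn reference_tensor(n: usize) -> Vec<f32> {
    (0..n).map(|i| ((i % 7) as f32 + 1.0) / 28.0).collect()
}

/// Assumed "mixed" perturbation: flip the sign of every 5th element,
/// scale all other elements by 1%.
fn perturb(x: &[f32]) -> Vec<f32> {
    x.iter()
        .enumerate()
        .map(|(i, &v)| if i % 5 == 0 { -v } else { v * 1.01 })
        .collect()
}

fn main() {
    let n = 28 * 7 * 7; // 1372 elements, the shape quoted in the table
    let a = reference_tensor(n);
    let b = perturb(&a);
    println!("self cosine      = {:.6}", cosine(&a, &a));
    println!("perturbed cosine = {:.6}", cosine(&a, &b));
    println!("below 0.999 floor: {}", cosine(&a, &b) < 0.999);
}
```

A pure 1% rescale alone would leave the cosine at 1.0, which is why the sketch mixes in sign flips: cosine is scale-invariant, so only a change in direction pushes it under the floor.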

Why this is drift-prevention

contracts/trace-attn-sub-stages-v1.yaml v1.1.0 SUB-003 invariant:

Existing APRT recognition path generalizes to the 2 new stage IDs without per-stage hardcoding.

If anyone ever introduces a per-stage match stage { … } inside is_aprt_stage_file, compute_aprt_stage_stats, or run_aprt_stage_diff, these assertions fail — forcing a contract bump.
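The commit message for this PR gives the APRT layout as `b"APRT" + layer_u32_le + dim_u32_le + f32_le_body`, with the stage name carried only by the filename. The following sketch, under that stated layout, shows why a reader of this format has nothing stage-specific to branch on. The functions `encode_aprt`, `has_aprt_magic`, and `decode_aprt` are hypothetical names, not the functions in `diff_05_aprt_stage.rs`.

```rust
/// Serialize a tensor in the layout described in the commit message:
/// 4-byte magic, u32 LE layer index, u32 LE element count, f32 LE body.
fn encode_aprt(layer: u32, values: &[f32]) -> Vec<u8> {
    let mut out = Vec::with_capacity(12 + 4 * values.len());
    out.extend_from_slice(b"APRT");
    out.extend_from_slice(&layer.to_le_bytes());
    out.extend_from_slice(&(values.len() as u32).to_le_bytes());
    for v in values {
        out.extend_from_slice(&v.to_le_bytes());
    }
    out
}

/// Magic-byte check only; no stage name is consulted.
fn has_aprt_magic(bytes: &[u8]) -> bool {
    bytes.starts_with(b"APRT")
}

fn read_u32_le(bytes: &[u8], at: usize) -> u32 {
    u32::from_le_bytes([bytes[at], bytes[at + 1], bytes[at + 2], bytes[at + 3]])
}

/// Parse header + body. Returns None on bad magic or a length mismatch.
fn decode_aprt(bytes: &[u8]) -> Option<(u32, Vec<f32>)> {
    if !has_aprt_magic(bytes) || bytes.len() < 12 {
        return None;
    }
    let layer = read_u32_le(bytes, 4);
    let dim = read_u32_le(bytes, 8) as usize;
    let body = &bytes[12..];
    if body.len() != dim.checked_mul(4)? {
        return None;
    }
    let values = body
        .chunks_exact(4)
        .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
        .collect();
    Some((layer, values))
}

fn main() {
    let values = vec![0.25f32, -1.5, 3.0];
    // The same bytes would be written to layer_0_attn_scores.aprt or
    // layer_0_attn_softmax.aprt; the stage appears only in the filename.
    let bytes = encode_aprt(0, &values);
    println!("magic ok: {}", has_aprt_magic(&bytes));
    println!("decoded:  {:?}", decode_aprt(&bytes));
}
```

Because the stage never enters the byte stream, a per-stage `match` inside the reader would have to be keyed on the filename, which is exactly the kind of hardcoding the SUB-003 invariant rules out.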

Cascade context

| # | PR | Status |
|---|----|--------|
| 1 | #1450 contract `trace-attn-sub-stages-v1` v1.1.0 | OPEN, auto-merge armed |
| 2 | #1451 enum extension AttnScores+AttnSoftmax | MERGED |
| 3 | #1452 research evidence | OPEN, auto-merge armed |
| 4 | #1455 4-stage wire impl | MERGED |
| 5 | THIS PR drift-prevention test | new |
| 6 | (next) HF FP16 oracle extension | scoped, research note ready |
| 7 | (next) FALSIFY-ATTN-SUB-004 LIVE on RTX 4090 | gated on #6 |
| 8 | (next) SHIP-007 root-cause fix | gated on #7 |

Test plan

cargo test -p apr-cli --lib aprt_stage_diff_tests — 13/13 PASS (11 prior + 2 new)
No clippy regressions on apr-cli
CI green
Auto-merge

Five whys

Why a test PR rather than the live RTX 4090 bisection? §47.6 ranked-leverage list orders the cascade — step 5 (this PR) ships before step 6 (HF FP16 oracle extension) before step 7 (live bisection). Skipping the drift-prevention layer means future regressions at is_aprt_stage_file would only be caught during a live run — wasteful.
Why 2 tests? Spec said "1 test + 0 LOC". 2nd test is bonus FALSIFY-ATTN-SUB-004 coverage: pins cosine sensitivity for the load-bearing predicate of the live bisection.
Why update Rust rather than the contract YAML? The contract YAML lives in PR #1450 (still open); modifying it from a separate branch would conflict on merge. The test exercises the real wired functions, providing the algorithm-level evidence the contract requires.
Why not bump SUB-003 status from PARTIAL_ALGORITHM_LEVEL to FUNCTIONAL? FUNCTIONAL requires LIVE evidence (HF FP16 oracle + APR teacher tensors). That's cascade step 7. This PR is algorithm-level only.
Why not amend PR #1450? Single-piece flow. PR #1450 is auto-merge armed and CI-green; pushing more commits would restart CI (~10 min). This PR is independent enough to land separately.
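The "load-bearing predicate" mentioned in the second why can be sketched as a scan over per-stage cosines for the first value under the floor. This is an illustration only: `first_divergent_stage` is a hypothetical name, the stage list is a subset of the names that appear in this thread, and every cosine value below is invented rather than measured.

```rust
/// Return the first stage whose cosine against the oracle falls below `floor`.
/// Hypothetical helper, not part of apr-cli.
fn first_divergent_stage<'a>(chain: &[(&'a str, f64)], floor: f64) -> Option<&'a str> {
    chain
        .iter()
        .find(|(_, cos)| *cos < floor)
        .map(|(name, _)| *name)
}

fn main() {
    // Invented values for illustration; a live run would fill these in
    // from `apr diff --values` on real saved tensors.
    let chain = [
        ("qkv_matmul", 0.999_99),
        ("qkv_bias", 0.999_98),
        ("q_post_rope", 0.999_97),
        ("k_post_rope", 0.999_97),
        ("attn_scores", 0.999_95),
        ("attn_softmax", 0.62),
        ("attention", 0.58),
    ];
    match first_divergent_stage(&chain, 0.999) {
        Some(stage) => println!("first divergence at: {stage}"),
        None => println!("no stage below the floor"),
    }
}
```

The scan only localizes a defect if the metric actually drops at the faulty stage, which is what the second test pins for the softmax case.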

Plain ship % (unchanged this cycle)

MODEL-1: 91% (cascade scaffold; ship % moves at SUB-004 LIVE DISCHARGE)
MODEL-2: 57% (capacity ceiling at val_loss=9.38)

🤖 Generated with Claude Code

…ostic for attn_scores + attn_softmax

Per `docs/specifications/aprender-train/ship-two-models-spec.md` §47.6 step 5 of the SHIP-007 layer-0 attention bisection cascade. Spec said this was "likely 1 test + 0 LOC change if the loader is per-stage-agnostic", and empirical inspection of `crates/apr-cli/src/commands/diff_05_aprt_stage.rs` confirmed it: stage names are encoded only in OUTPUT FILENAMES, never in the APRT binary content (`b"APRT" + layer_u32_le + dim_u32_le + f32_le_body`). This drift-prevention test PR locks that contract.

## Why

`contracts/trace-attn-sub-stages-v1.yaml` v1.1.0 SUB-003 invariant:

> Existing APRT recognition path generalizes to the 2 new stage IDs
> (attn_scores + attn_softmax) without per-stage hardcoding.

If anyone ever adds a per-stage `match stage { AttnScores => …, AttnSoftmax => …, _ => existing }` inside `is_aprt_stage_file`, `compute_aprt_stage_stats`, or `run_aprt_stage_diff`, this test fails, forcing a contract bump in `apr-cli-trace-save-tensor-v1.yaml` AND `trace-attn-sub-stages-v1.yaml`.

## What landed

Two new unit tests in `crates/apr-cli/src/commands/diff_05_aprt_stage.rs`:

| Test | What it pins |
|------|--------------|
| `falsify_attn_sub_003_new_stages_per_stage_agnostic` | Magic-byte detection + cosine + RMS + e2e diff succeed for filenames `layer_0_attn_scores.aprt` + `layer_0_attn_softmax.aprt` at realistic shape `28*7*7=1372` (Qwen2.5-7B BOS layer-0). |
| `falsify_attn_sub_003_cosine_detects_softmax_divergence` | Mixed-perturbation cosine drops below 0.999 floor → bisection chain will reliably detect divergence at the softmax stage during FALSIFY-ATTN-SUB-004 LIVE on RTX 4090. |

## How to apply

`cargo test -p apr-cli --lib aprt_stage_diff_tests` shows 13/13 PASS (11 prior + 2 new). No production-code change.

## Five whys (why now, why this scope)

1. **Why a test PR rather than the live RTX 4090 bisection?** §47.6 ranked-leverage list orders the cascade: step 5 (this PR) ships before step 6 (HF FP16 oracle extension) before step 7 (live bisection). Skipping the drift-prevention layer means a future regression at the `is_aprt_stage_file` level would only be caught during the live run, which is wasteful.
2. **Why 2 tests instead of 1?** The spec said "1 test + 0 LOC". The 2nd test (`falsify_attn_sub_003_cosine_detects_softmax_divergence`) is bonus coverage for FALSIFY-ATTN-SUB-004: it pins that the cosine metric is sensitive enough to detect mixed-perturbation divergence, which is the load-bearing predicate for the live bisection step.
3. **Why update `crates/apr-cli/src/commands/diff_05_aprt_stage.rs` rather than the contract YAML?** The contract YAML lives in PR #1450 (still open). Modifying it from a separate branch would conflict on merge. The test exercises the real wired functions (`is_aprt_stage_file`, `compute_aprt_stage_stats`, `run_aprt_stage_diff`), which is the algorithm-level evidence the contract requires anyway.
4. **Why not bump SUB-003 status from PARTIAL_ALGORITHM_LEVEL to FUNCTIONAL?** FUNCTIONAL discharge requires LIVE evidence: running `apr diff` on actual saved tensors from a real model forward (HF FP16 oracle + APR teacher). That comes in cascade step 7. This PR is algorithm-level only (drift-prevention).
5. **Why not amend PR #1450 with this test?** Single-piece flow. PR #1450 is auto-merge armed, BEHIND main, all CI green. Pushing more commits restarts CI ~10 min. This test PR is independent enough to land separately without growing the merge train.

## Net effects

- **Coverage**: 11 → 11 + 2 = 13 tests in `aprt_stage_diff_tests` mod.
- **Falsifier**: FALSIFY-ATTN-SUB-003 algorithm-level evidence pinned via test.
- **MODEL-1 ship %**: unchanged at 91% (scaffold; ship % moves at SUB-004 LIVE).
- **MODEL-2 ship %**: unchanged at 57%.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…tion cascade ALGORITHM-LEVEL COMPLETE (#1458)

After §47 recorded the cascade-started milestone (PRs #1450 + #1451 + #1452 scaffolding), the same-day continuation cycle closed §47.1 cascade roadmap steps 4-6 at the algorithm level via PRs #1455, #1456, #1457.

## What landed (§47.1 cascade roadmap)

| Step | PR | Discharge |
|------|----|-----------|
| 4 | #1455 | FALSIFY-ATTN-SUB-002 PARTIAL_ALGORITHM_LEVEL: wires `QPostRope`+`KPostRope`+`AttnScores`+`AttnSoftmax` in `forward_traced_with_plan`; closes §47.4 parent-contract drift as side effect |
| 5 | #1456 | FALSIFY-ATTN-SUB-003 algorithm-level pinned via 2 drift-prevention tests; 0 LOC production change (loader is genuinely per-stage-agnostic, as spec predicted) |
| 6 | #1457 | FALSIFY-ATTN-SUB-004 BLOCKER_FIXTURE_ABSENT → PARTIAL_ALGORITHM_LEVEL on merge; extends `scripts/generate_qwen25_coder_fp16_stages.py` with `--with-attn-substages` (default ON) installing per-instance `Qwen2Attention.forward` monkeypatch under `attn_implementation="eager"` |

## Toyota Way correction during research (PR #1457)

The pre-impl research note estimated **7 missing stages, ~140 LOC**. Live source inspection during PR #1457 found **3 already captured** via existing forward hooks (`make_qkv_hook` derives qkv_matmul/qkv_bias from q_proj/k_proj/v_proj outputs via bias subtraction; `hook_o_proj_pre` captures `attention` as input to o_proj). Net: **4 stages, ~80 LOC monkeypatch**. Per `feedback_no_guessing.md`. Cost-of-defect paid at the implementation layer (cheapest place once the research note had been authored from outdated docstring lines).

## Steps 7-8 require operator action

| Step | Blocker | Workaround |
|------|---------|-----------|
| 7 LIVE | (a) canonical `apr` binary built pre-#1451, so it rejects the `attn_scores` stage. (b) PyTorch/CUDA driver mismatch on host. | (a) `cargo build --release --features cuda --bin apr`. (b) operator updates driver OR `--device cpu` (multi-min). |
| 8 fix | Gated on step 7 bisection finding. | n/a (discovery-driven scope). |

## Net effects

- Spec v2.92.0 → **v2.93.0**.
- §47.1 cascade roadmap: **6/8 steps algorithm-level COMPLETE**; steps 7-8 LIVE/operator-gated.
- Coverage tally: 20+32 → **20+36** (+4 PARTIAL_ALGORITHM_LEVEL from `trace-attn-sub-stages-v1` v1.1.0 falsifiers landing on main when #1450 merged: SUB-001/002/003/005). SUB-004 stays BLOCKER until #1457 ships.
- **MODEL-1 ship %**: unchanged at **91%** (cascade is scaffold; ship % moves at SUB-004 LIVE DISCHARGE in step 7).
- **MODEL-2 ship %**: unchanged at **57%**.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

…PARTIAL_ALGORITHM_LEVEL — fixture is now on main

Bundles the SUB-004 status promotion into the v1.2.0 PR alongside the SUB-003 function-name drift fix already authored. Both changes ship as one v1.2.0 unit because they are the two contract-level updates that follow the §47.1 cascade roadmap closing at the algorithm level.

## Why now

PR #1457 (HF FP16 oracle script extension) merged on main. The fixture previously claimed "absent" is now generated by:

```
uv run --with torch --with transformers --with safetensors --with accelerate \
  scripts/generate_qwen25_coder_fp16_stages.py \
  --output /tmp/qwen25-coder-7b-hf-fp16-stages \
  --layers 0 --with-attn-substages
```

Per `feedback_no_guessing.md`: SUB-004's status is now provable from main. Promote.

## What landed

Updated SUB-004 algorithm_evidence:

- `status`: BLOCKER_FIXTURE_ABSENT → PARTIAL_ALGORITHM_LEVEL
- `file_paths`: added the actual script + APR-side wire files
- `function_names`: replaced placeholder `run_hf_fp16_reference` with the 6 real symbols (`install_attn_substages_patch`, `traced_forward`, plus 4 SaveTensorStage variants)
- `invariants_enforced`: 1 line → 4 lines explicitly naming what each PR pinned
- `notes`: documents the FUNCTIONAL discharge prerequisites (binary rebuild + driver/CPU)

Updated metadata.description v1.2.0 changelog to bundle (1) SUB-003 drift fix + (2) SUB-004 promotion as a coherent unit.

## Five whys

1. **Why combine SUB-003 drift fix + SUB-004 promotion in v1.2.0?** Both contract-level changes follow from the same upstream cause (PRs #1455 + #1456 + #1457 landed). Splitting into v1.2.0 + v1.3.0 would force a follow-up rebase + double-review with no audit benefit.
2. **Why PARTIAL_ALGORITHM_LEVEL not FUNCTIONAL?** FUNCTIONAL requires LIVE evidence. The 9-element cosine sequence has not been produced on actual hardware yet. Promoting to FUNCTIONAL without LIVE evidence would claim more than is true.
3. **Why isn't the LIVE run inside this PR?** Per `feedback_compute_pre_authorized.md`, named GPU lanes are pre-authorized but SHIP-007 LIVE bisection is borderline (binary rebuild needed + host driver mismatch). Operator-triggered keeps the audit clean.
4. **Why list SaveTensorStage variants as "function_names"?** They're enum variants, not functions strictly speaking, but they are the symbolic identities that the algorithm-level evidence binds to. The contract validator accepts them.
5. **Why explicit prerequisites in `notes`?** Future readers who see "PARTIAL_ALGORITHM_LEVEL" need to know WHY it's not yet FUNCTIONAL. The notes are the operator-handoff document inside the contract itself.

## Net effects

- Contract `trace-attn-sub-stages-v1.yaml` v1.1.0 → v1.2.0 PROPOSED.
- SUB-003: drift fix (3 real wired functions, 2 explicit drift-prevention test pins).
- SUB-004: BLOCKER_FIXTURE_ABSENT → PARTIAL_ALGORITHM_LEVEL with 4-line invariants + explicit FUNCTIONAL prereqs.
- **MODEL-1 ship %**: unchanged at **91%** (FUNCTIONAL discharge gates ship %, not PARTIAL).
- **MODEL-2 ship %**: unchanged at **57%**.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…x + SUB-004 BLOCKER → PARTIAL_ALGORITHM_LEVEL (#1459)

* contract(trace-attn-sub-stages-v1): v1.1.0 → v1.2.0 — function-name drift fix in SUB-003 algorithm_evidence

## Why

Contract drift discovered after PR #1456 (FALSIFY-ATTN-SUB-003 drift-prevention test) merged on main. The algorithm_evidence block named:

```yaml
function_names:
  - load_tensor_apr_aprt
```

But this function does not exist anywhere in the codebase. The actual functions wired in `crates/apr-cli/src/commands/diff_05_aprt_stage.rs` and exercised by PR #1456's tests are:

- `is_aprt_stage_file` (magic-byte detection)
- `compute_aprt_stage_stats` (cosine + RMS + top-K)
- `run_aprt_stage_diff` (e2e reader + emitter)

Per `feedback_no_guessing.md`. Contract author defect that pre-existed PR #1450's merge, likely speculation from the parent contract's `apr_diff_values_compat` invariant naming convention. Caught here at the cheapest layer (contract YAML, no implementation rolled back).

## What landed

- Bumped `metadata.version` 1.1.0 → 1.2.0 with v1.2.0 changelog block describing the fix.
- Replaced `load_tensor_apr_aprt` with the 3 real wired functions in `algorithm_evidence.function_names`.
- Added `crates/apr-cli/src/commands/diff_05_aprt_stage.rs` to `algorithm_evidence.file_paths` (the actual location of the wired functions).
- Added 2 new `invariants_enforced` lines naming the 2 specific drift-prevention tests from PR #1456.
- Expanded `notes` field to make the algorithm-level evidence trail explicit (which tests, what shapes, why per-stage-agnostic by construction).

## Test plan

- [x] `pv validate contracts/trace-attn-sub-stages-v1.yaml` reports `0 error(s), 0 warning(s) — Contract is valid.`
- [ ] CI green
- [ ] Auto-merge

## Five whys

1. **Why now and not in §47/§48?** The drift was discovered while authoring PR #1456 but not fixed there because PR #1456 modified Rust code, not contract YAML; single-piece flow says don't mix. Now that #1456 is merged on main, the contract drift can be addressed cleanly without conflict against an in-flight PR.
2. **Why a separate PR rather than in PR #1457?** PR #1457 is the HF FP16 oracle script extension (Python-only). Modifying the contract there would couple two independent fixes. This PR is contract-only YAML and lands independently.
3. **Why bump to v1.2.0 rather than v1.1.1?** Convention in this contract family treats `algorithm_evidence` corrections as MINOR bumps (v1.0.0 → v1.1.0 for the Toyota Way scope correction, also algorithm_evidence-level). v1.1.1 would suggest "PATCH = no semantic change", but renaming functions in the evidence block is a semantic improvement (readers can now find the real code).
4. **Why not also bump SUB-004 from BLOCKER_FIXTURE_ABSENT to PARTIAL_ALGORITHM_LEVEL here?** SUB-004's algorithm-bind requires PR #1457 (HF FP16 oracle ext) to be on main, because the script is the fixture. PR #1457 is in flight. Bumping SUB-004 status here would claim more than the codebase can prove. Keeping single-piece flow: this PR ships the SUB-003 drift fix only.
5. **Why is the loader genuinely per-stage-agnostic?** `is_aprt_stage_file` checks the 4-byte magic `b"APRT"` only; `compute_aprt_stage_stats` operates on `&[f32]` slices; `run_aprt_stage_diff` reads APRT header (4-byte magic + u32 layer + u32 dim_product) + f32 LE body. Stage names are encoded only in the OUTPUT FILENAME (e.g., `layer_0_attn_scores.aprt`), never in the binary content. So the loader is shape/value-agnostic by construction, which is why FALSIFY-ATTN-SUB-003's drift-prevention tests need 0 LOC production change.

## Net effects

- Contract `trace-attn-sub-stages-v1.yaml` v1.1.0 → v1.2.0 PROPOSED.
- SUB-003 algorithm_evidence now correctly names the wired functions.
- **MODEL-1 ship %**: unchanged at **91%** (drift fix; ship % moves at SUB-004 LIVE DISCHARGE).
- **MODEL-2 ship %**: unchanged at **57%**.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* contract(trace-attn-sub-stages-v1): SUB-004 BLOCKER_FIXTURE_ABSENT → PARTIAL_ALGORITHM_LEVEL — fixture is now on main

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

noahgift enabled auto-merge (squash) May 4, 2026 04:16

noahgift mentioned this pull request May 4, 2026

feat(scripts): HF FP16 oracle extension — capture 4 attention sub-stages (q/k_post_rope, attn_scores, attn_softmax) #1457

Merged

6 tasks

Merge branch 'main' into feat/attn-sub-stages-003-test

e5c7fe8

noahgift merged commit 2124609 into main May 4, 2026
10 checks passed

noahgift deleted the feat/attn-sub-stages-003-test branch May 4, 2026 05:25

noahgift mentioned this pull request May 4, 2026

spec(ship-two-models): v2.93.0 — §48 SHIP-007 cascade ALGORITHM-LEVEL COMPLETE #1458

Merged

4 tasks

noahgift mentioned this pull request May 4, 2026

contract(trace-attn-sub-stages-v1): v1.2.0 — SUB-003 fn-name drift fix + SUB-004 BLOCKER → PARTIAL_ALGORITHM_LEVEL #1459

Merged

3 tasks

noahgift merged 2 commits into main from feat/attn-sub-stages-003-test
