falsify(apr-cli-distill-train-v1): TRAIN-005 PARTIAL_ALGORITHM_LEVEL — precompute byte-determinism by noahgift · Pull Request #1438 · paiml/aprender

noahgift · 2026-05-03T19:43:48Z

Summary

Closes contract drift between task #195 (which claimed FALSIFY-APR-DISTILL-TRAIN-005 PARTIAL_ALGORITHM_LEVEL on 2026-04-30) and the YAML, which had no algorithm_evidence block. Adds 2 unit tests and a contract amendment.

Contract: apr-cli-distill-train-v1.yaml → TRAIN-005 gains algorithm_evidence (status PARTIAL_ALGORITHM_LEVEL, last_verified 2026-05-03).

Falsifier tests (both pass)

falsify_apr_distill_train_005_precompute_is_byte_deterministic — local-teacher branch: 2 precompute runs against an identical fake teacher dir produce byte-identical manifest.json.
falsify_apr_distill_train_005_precompute_remote_teacher_stub_is_deterministic — remote-stub branch: 2 runs against an unresolved HF model_id produce byte-identical pending_download manifest.
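The shape of such a byte-determinism check can be sketched as follows. This is a minimal illustration, not the code in distill_include_01.rs: `write_manifest` and `precompute_is_byte_deterministic` are hypothetical names, the manifest layout is invented, and the real `run_config_precompute` may record different fields.

```rust
use std::collections::BTreeMap;
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical stand-in for the precompute step: records each regular file
/// in the teacher directory with its size, then writes `manifest.json`.
/// A BTreeMap keeps keys sorted, so the output does not depend on the order
/// in which the OS returns directory entries.
fn write_manifest(teacher_dir: &Path, out_dir: &Path) -> io::Result<()> {
    let mut files: BTreeMap<String, u64> = BTreeMap::new();
    for entry in fs::read_dir(teacher_dir)? {
        let entry = entry?;
        if entry.file_type()?.is_file() {
            let name = entry.file_name().to_string_lossy().into_owned();
            files.insert(name, entry.metadata()?.len());
        }
    }
    // Hand-rolled JSON with no timestamps, absolute paths, or random ids.
    // Debug formatting serves as a rough string escape (plain ASCII names only).
    let body: Vec<String> = files
        .iter()
        .map(|(name, size)| format!("    {:?}: {}", name, size))
        .collect();
    let json = format!("{{\n  \"files\": {{\n{}\n  }}\n}}\n", body.join(",\n"));
    fs::create_dir_all(out_dir)?;
    fs::write(out_dir.join("manifest.json"), json)
}

/// Runs the step twice into separate output directories and reports whether
/// the two manifests are byte-identical.
fn precompute_is_byte_deterministic(teacher_dir: &Path, scratch: &Path) -> io::Result<bool> {
    let (run_a, run_b) = (scratch.join("run_a"), scratch.join("run_b"));
    write_manifest(teacher_dir, &run_a)?;
    write_manifest(teacher_dir, &run_b)?;
    let a = fs::read(run_a.join("manifest.json"))?;
    let b = fs::read(run_b.join("manifest.json"))?;
    Ok(a == b)
}
```

Comparing raw bytes rather than parsed JSON is deliberate in this sketch: it also catches differences in key order or whitespace that a structural comparison would hide.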

Five Whys

Why bind TRAIN-005 now? Per spec §42.7 next-session pickup (b); task #195 claimed PARTIAL but the YAML had no algorithm_evidence, which is contract drift to close.
Why algorithm-bind, not full discharge? run_config_precompute is a stub today (no real teacher forward). DISCHARGED requires the missing real-training implementation per §35; separate larger PR per Toyota Way.
Why two tests, not one? Local-teacher and remote-stub take different code paths (inspect_dir_files vs pending_download). Both must be deterministic.
Why test the manifest bytes, not logits? Current impl emits NO logits. The manifest IS the only deterministic output today. When real-forward is added, the test extends to assert byte-identical logits files (DISCHARGED gate).
Why bounded? ~70 LOC test scaffolding + 13 LOC contract amendment. No production code change. Coverage uplift only.
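The two code paths named in the third "why" can be pictured with a small sketch. Everything here is an assumption made for illustration: the `TeacherSource` enum, the "existing directory means local" rule, and the "inspected" status string are hypothetical; only the pending_download status comes from the description above.

```rust
use std::path::{Path, PathBuf};

/// Where the teacher comes from. Variant names are illustrative only.
enum TeacherSource {
    LocalDir(PathBuf),
    RemoteId(String),
}

/// Assumed rule: an existing directory is a local teacher; anything else is
/// treated as an unresolved remote model id.
fn classify_teacher(teacher: &str) -> TeacherSource {
    let path = Path::new(teacher);
    if path.is_dir() {
        TeacherSource::LocalDir(path.to_path_buf())
    } else {
        TeacherSource::RemoteId(teacher.to_string())
    }
}

/// Builds the stub manifest text for either branch. Neither branch embeds a
/// timestamp or other run-specific value, so repeated calls with the same
/// input yield the same bytes.
fn stub_manifest(teacher: &str) -> String {
    match classify_teacher(teacher) {
        TeacherSource::LocalDir(dir) => format!(
            "{{\"teacher\": {:?}, \"status\": \"inspected\"}}\n",
            dir.display().to_string()
        ),
        TeacherSource::RemoteId(id) => format!(
            "{{\"teacher\": {:?}, \"status\": \"pending_download\"}}\n",
            id
        ),
    }
}
```

Because the branches share no serialization code in this sketch, a determinism test on one says nothing about the other, which is the reason given for keeping two tests.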

Net effect on shipping

Coverage tally: 15+33 → 15+34 (+1 PARTIAL_ALGORITHM_LEVEL closed)
MODEL-2 ship %: 54% → 55%
Contract drift: closed between task list and YAML
pv validate: exits 0

Test plan

cargo test -p apr-cli --lib falsify_apr_distill_train_005 (2 pass)
pv validate contracts/apr-cli-distill-train-v1.yaml exit 0
CI green on required gates

🤖 Generated with Claude Code

… precompute byte-determinism

Adds 2 unit tests in distill_include_01.rs::tests that algorithm-bind FALSIFY-APR-DISTILL-TRAIN-005 (precompute is byte-deterministic):

- falsify_apr_distill_train_005_precompute_is_byte_deterministic: local-teacher branch, two precompute runs over identical fake teacher dir produce byte-identical manifest.json.
- falsify_apr_distill_train_005_precompute_remote_teacher_stub_is_deterministic: remote-stub branch, two runs against unresolved HF model_id produce byte-identical pending_download manifest.json.

Contract apr-cli-distill-train-v1.yaml: TRAIN-005 gains algorithm_evidence (status: PARTIAL_ALGORITHM_LEVEL, last_verified 2026-05-03, two test_locations + notes documenting that DISCHARGED requires real teacher forward with logits-on-disk).

Five Whys

1. Why bind TRAIN-005 now? Per spec §42.7 next-session pickup (b); TRAIN-005 was claimed PARTIAL via task #195 but the YAML had no algorithm_evidence; this is contract drift to close.
2. Why algorithm-bind, not full discharge? run_config_precompute is currently a stub (writes a manifest, no real teacher forward). Discharging requires the missing real-training implementation per §35; that's a separate, larger PR. Per Toyota Way, focused PRs.
3. Why two tests, not one? The local-teacher and remote-stub branches take different code paths (inspect_dir_files vs pending_download stub). Both must be deterministic.
4. Why test the manifest bytes, not just diff some logits? The current impl emits NO logits; it's a stub. The manifest IS the only deterministic output today. When real-forward is added, the test extends to assert byte-identical logits files alongside the manifest (DISCHARGED gate).
5. Why bounded? ~70 LOC test scaffolding + 13 LOC contract amendment. No production code change. Coverage uplift only.

Net effect

- Coverage tally: 15+33 → 15+34 (+1 PARTIAL_ALGORITHM_LEVEL).
- MODEL-2 ship %: 54% → 55% (one more falsifier locked in).
- Contract drift between task list (#195) and YAML closed.
- pv validate exits 0.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>


… train cache-resume idempotency (#1439)

Adds 2 unit tests in distill_include_01.rs::tests that algorithm-bind FALSIFY-APR-DISTILL-TRAIN-006 (stage train can resume from precompute cache):

- falsify_apr_distill_train_006_train_errors_without_precompute_cache: negative half. Stage train MUST error when manifest.json is absent; asserts CliError::ValidationFailed with "Precompute" in message.
- falsify_apr_distill_train_006_train_does_not_error_when_cache_present: positive half. After precompute drops manifest.json, stage train MUST NOT error with the cache-missing message (proves the manifest is actually consulted, not just stat-checked).

Contract apr-cli-distill-train-v1.yaml: TRAIN-006 gains algorithm_evidence (status: PARTIAL_ALGORITHM_LEVEL, last_verified 2026-05-03, two test_locations + notes documenting that DISCHARGED requires real teacher forward + real student forward that actually loads logits-on-disk and compares to a baseline that re-ran precompute proving no recomputation happened).

Five Whys

1. Why bind TRAIN-006 now? Spec §42.7 (b) MODEL-2 distill-train track; task #196 claimed PARTIAL on 2026-04-30 but the YAML had no algorithm_evidence, the same pattern of contract drift as TRAIN-005 (PR #1438).
2. Why two halves, not one? The cache-resume invariant has two failure modes: (a) train silently skips manifest check and runs anyway, (b) train ignores manifest after seeing it. Both must be tested for the gate to be meaningful.
3. Why test the error message specifically? Per feedback_apr_trace_not_eprintln, surfacing the "Precompute" keyword in the error message is the user-facing contract: if it regresses to "missing file" with no remediation hint, users won't know what to run next.
4. Why not test logits content equivalence? Real logits-on-disk require real teacher forward, which is the §35 missing real-training implementation. Algorithm-level discharge holds until that lands.
5. Why bounded? ~80 LOC test scaffolding + 14 LOC contract amendment. No production code change. Coverage uplift only.

Net effect

- Coverage tally: 15+34 → 15+35 (+1 PARTIAL_ALGORITHM_LEVEL).
- MODEL-2 ship %: 55% → 56% (cache-resume idempotency locked in).
- Stacks on PR #1438 (TRAIN-005); no merge conflict expected.
- pv validate exits 0.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
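The two halves of the TRAIN-006 cache-resume check could look roughly like this. It is a sketch under stated assumptions: the `CliError` enum below is a local stand-in that models only the ValidationFailed variant named in the commit message (its String payload is assumed), and `run_train_stage` is a hypothetical name, not the apr-cli entry point.

```rust
use std::fs;
use std::path::Path;

/// Local stand-in for the crate's error type; only the variant named in the
/// commit message is modelled, and its payload shape is an assumption.
#[derive(Debug)]
enum CliError {
    ValidationFailed(String),
}

/// Hypothetical train-stage entry point. It refuses to run without the
/// precompute manifest and returns the manifest length in bytes when the
/// cache is present.
fn run_train_stage(cache_dir: &Path) -> Result<usize, CliError> {
    let manifest = cache_dir.join("manifest.json");
    // Reading the file (rather than only checking that it exists) is what a
    // "manifest is actually consulted" test relies on.
    let text = fs::read_to_string(&manifest).map_err(|_| {
        CliError::ValidationFailed(format!(
            "Precompute cache not found at {}; run the precompute stage first",
            manifest.display()
        ))
    })?;
    if text.trim().is_empty() {
        return Err(CliError::ValidationFailed(
            "Precompute manifest is empty".to_string(),
        ));
    }
    Ok(text.len())
}
```

A negative test would call this on an empty directory and assert that the message contains "Precompute"; a positive test would write a manifest first and assert that the call no longer fails with that message.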


…osine helper for FALSIFY-CPU-GPU-005 part b (#1441)

Canonical record of today's split-track cycle (PRs #1438-#1440). Maintains the §41/§42 amendment cadence: each /loop iteration that lands ≥3 PRs gets a single audit story.

Chain landed:

- #1438: FALSIFY-APR-DISTILL-TRAIN-005 PARTIAL_ALGORITHM_LEVEL (precompute byte-determinism, 2 unit tests, local + remote-stub branches)
- #1439: FALSIFY-APR-DISTILL-TRAIN-006 PARTIAL_ALGORITHM_LEVEL (train cache-resume idempotency, 2 unit tests, negative + positive halves)
- #1440: cpu_vs_gpu_cosine_similarity helper at module scope + 3 tests (parallel=1, orthogonal=0, fail-closed; cosine math now callable without --features cuda for the future part b wgpu cosine gate)

§43 documents: what landed (table), coverage flips (TRAIN-005, TRAIN-006 unbound → PARTIAL_ALGORITHM_LEVEL), why for MODEL-1+MODEL-2 (parallel contract drift closure + part b infrastructure), Five Whys, ship % effects (MODEL-1 87→88, MODEL-2 54→56), and next-session pickup options (CPU-GPU-005 part b OR distill-train real implementation).

Coverage tally: 15+33 → 15+35 (+2 PARTIAL_ALGORITHM_LEVEL closed).

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
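For readers unfamiliar with the three cosine cases listed for #1440 (parallel=1, orthogonal=0, fail-closed), here is a minimal sketch of the math. It is not the cpu_vs_gpu_cosine_similarity helper itself: the function name, the f32 inputs, and the reading of "fail-closed" as "return None on any invalid input" are all assumptions.

```rust
/// Cosine similarity of two vectors, accumulated in f64.
/// Assumed fail-closed behaviour: mismatched lengths, empty input, a zero
/// norm, or any non-finite intermediate yields `None` instead of a score.
fn cosine_similarity_sketch(a: &[f32], b: &[f32]) -> Option<f64> {
    if a.len() != b.len() || a.is_empty() {
        return None;
    }
    let (mut dot, mut norm_a, mut norm_b) = (0.0f64, 0.0f64, 0.0f64);
    for (&x, &y) in a.iter().zip(b.iter()) {
        let (x, y) = (f64::from(x), f64::from(y));
        dot += x * y;
        norm_a += x * x;
        norm_b += y * y;
    }
    let denom = norm_a.sqrt() * norm_b.sqrt();
    if !dot.is_finite() || !denom.is_finite() || denom == 0.0 {
        return None;
    }
    Some(dot / denom)
}
```

Returning `Option` rather than a sentinel such as 0.0 keeps a broken comparison from being mistaken for a legitimately orthogonal pair, which is one plausible reading of "fail-closed" for a parity gate.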

…VEL + TRAIN-009 BLOCKER_FIXTURE_ABSENT (#1443)

Closes the last three contract drifts in apr-cli-distill-train-v1 (tasks #218 + #247 claimed PARTIAL but YAML had no algorithm_evidence blocks). Same fix-pattern as TRAIN-005/006 (PRs #1438, #1439).

TRAIN-007 (pv validate exits 0): PARTIAL_ALGORITHM_LEVEL

Live verification 2026-05-04 on this branch: pv validate exits 0 with "0 error(s), 0 warning(s)". Meta-discharge via the pre-commit hook + manual operator runs that have validated every amendment since v1.0.0 PROPOSED.

TRAIN-008 (3-surface drift cli + registry + test): PARTIAL_ALGORITHM_LEVEL

Live verification 2026-05-04: cargo test -p apr-cli --test cli_commands registered_commands → "1 passed; 0 failed". The test_no_unregistered_commands integration test walks the live clap parser and enforces every Subcommand variant matches apr-cli-commands-v1.yaml, binding the invariant from feedback_cli_subcommand_three_surface_drift.

TRAIN-009 (end-to-end smoke beats from-scratch baseline): BLOCKER_FIXTURE_ABSENT

Honest blocker note: discharge requires the missing real-training implementation per §35 (apr distill --stage train is currently a stub). Without gradient descent there is no val_loss to compare. Path to DISCHARGED documented in the algorithm_evidence notes.

Five Whys

1. Why bind these now? Tasks #218/#247 claimed PARTIAL on 2026-04-30 but the YAML had no algorithm_evidence. Same drift pattern as TRAIN-005/006 (PRs #1438, #1439); closing it gives the contract a complete provability surface.
2. Why mark TRAIN-009 BLOCKER_FIXTURE_ABSENT instead of unbound? It has a clear test design (tests/distill_smoke.rs) and a clear blocker (real training implementation per §35). Marking it as a blocker rather than leaving it untyped makes the dependency explicit so a future PR cannot accidentally promote it without the real-training prerequisite.
3. Why two PARTIAL + one BLOCKER, not three PARTIAL? PARTIAL implies an existing test exercises the invariant. TRAIN-009 has no test today (no `tests/distill_smoke.rs`) and cannot have one until §35 lands. Honest classification beats false PARTIAL claims.
4. Why all three in one PR? They're the last three falsifiers in this contract; bundling them produces a single audit story (9/9 falsifiers now have status). Per Toyota Way each falsifier is a distinct binding decision but they share the same review surface.
5. Why bounded? ~45 LOC of YAML, no production code change, no new tests (uses existing cargo test + pv validate). pv validate exits 0 verified locally.

Net effect

- All 9 TRAIN-* falsifiers in apr-cli-distill-train-v1 now have algorithm_evidence blocks (8× PARTIAL_ALGORITHM_LEVEL + 1× BLOCKER_FIXTURE_ABSENT).
- Contract drift between task list (#218/#247) and YAML closed.
- Coverage tally: 15+35 → 15+37 (+2 PARTIAL_ALGORITHM_LEVEL closed, TRAIN-009 explicitly blocked not counted).
- MODEL-2 ship %: 56% → 57% (last falsifier-binding gap closed for the distill contract; real-training implementation per §35 is the only remaining MODEL-2 lever).
- pv validate exits 0.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

noahgift enabled auto-merge (squash) May 3, 2026 19:43

noahgift mentioned this pull request May 3, 2026

falsify(apr-cli-distill-train-v1): TRAIN-006 PARTIAL_ALGORITHM_LEVEL — train cache-resume idempotency #1439

Merged

3 tasks

noahgift added 2 commits May 3, 2026 22:13

Merge branch 'main' into falsify/apr-distill-train-005-precompute-determinism

f612446

Merge branch 'main' into falsify/apr-distill-train-005-precompute-determinism

ec758c3

noahgift mentioned this pull request May 3, 2026

spec(ship-two-models): v2.88.0 — §43 distill-train algorithm-bind + cosine helper for FALSIFY-CPU-GPU-005 part b #1441

Merged

1 task

noahgift added 2 commits May 3, 2026 23:19

Merge origin/main: combine TRAIN-005 + TRAIN-006 falsifier tests

e7b87d3

Merge branch 'main' into falsify/apr-distill-train-005-precompute-determinism

88d3864

noahgift merged commit afb1d25 into main May 3, 2026
10 checks passed

noahgift deleted the falsify/apr-distill-train-005-precompute-determinism branch May 3, 2026 22:05

noahgift mentioned this pull request May 3, 2026

falsify(apr-cli-distill-train-v1): TRAIN-007/008 PARTIAL_ALGORITHM_LEVEL + TRAIN-009 BLOCKER_FIXTURE_ABSENT #1443

Merged

3 tasks

noahgift merged 5 commits into main from falsify/apr-distill-train-005-precompute-determinism
