spec(ship-two-models): v2.95.0 → v2.96.0 — §51 §50.4 cascade snapshot (7/8 falsifiers bound) by noahgift · Pull Request #1480 · paiml/aprender

noahgift · 2026-05-04T16:51:41Z

Same-day continuation cycle landed 8 PRs across the §50.4 architecture-polymorphic infrastructure track. §51 records the cascade-complete state and pinpoints the remaining MODEL-2 ship-% gate (step 5g LIVE). 7/8 falsifiers in apr-pretrain-arch-polymorphic-v1 are now PARTIAL_ALGORITHM_LEVEL or MERGED. Remaining: step 5f.2 (APR weight load + tensor materialization, ~80 LOC, deliberately deferred to let cascade settle), step 5g (LIVE 500-step fine-tune, operator-runnable, the load-bearing test that moves MODEL-2 ship-%), step 5h (stamp + publish). Per §47-§48 lesson: infrastructure ship != ship-% movement. Plain ship-%: MODEL-1=91%, MODEL-2=57% (unchanged; gated on step 5g). Refs: PRs #1472/#1476/#1478 MERGED, #1473/#1474/#1475/#1479 in flight.

… (7/8 falsifiers bound) Same-day continuation cycle landed 8 PRs across the §50.4 architecture- polymorphic infrastructure track. §51 records the cascade-complete state and pinpoints the remaining MODEL-2 ship-% gate (step 5g LIVE). Falsifier-discharge scoreboard for `apr-pretrain-arch-polymorphic-v1`: | ID | What it pins | PR | Status | |----|---------------------------------------|-------|--------| | 001 | qwen2_0_5b matches HF + tie fix | #1474 | PARTIAL | | 002 | init=None preserves Llama370M | #1475 | PARTIAL | | 003 | init=Some pass-through | #1475 | PARTIAL | | 004 | GQA-7:1 forward smoke | #1478 | MERGED | | 005 | Qwen tokenizer + Qwen target = pass | #1476 | MERGED | | 006 | Qwen tokenizer + Llama target = fail | #1476 | MERGED | | 007 | encoder/decoder family mismatch | #1479 | PARTIAL | | 008 | pv validate | #1473 | PARTIAL | 7 of 8 falsifiers PARTIAL_ALGORITHM_LEVEL or MERGED. Remaining work: - 5f.2 — wire APR file open + tensor materialization (~80 LOC) DELIBERATELY DEFERRED this cycle; doing 5f.2 now means rebasing onto 4 in-flight PRs as they land - 5g — LIVE 500-step smoke fine-tune (operator dispatch) THE LOAD-BEARING TEST that moves MODEL-2 ship-% - 5h — stamp + publish Per §47-§48 lesson: "infrastructure shipped ≠ ship-% movement." Cascade-complete state means the polymorphic foundation is in place; ship-% movement still requires the LIVE empirical check. Five Whys: 1. Why a snapshot now? Multiple PRs in cascade auto-merge create cognitive load. A spec snapshot captures both the achievement (7 falsifiers bound) and the remaining gate (step 5g LIVE). Without it, future operators waste cycles re-deriving the state. 2. Why focus on falsifier scoreboard rather than total LOC? Falsifier discharge is the actual contract obligation. 7/8 invariants pinned means CI now catches regressions in the polymorphic-init path. 3. Why mention 5f.2 explicitly as deliberately deferred? Naming the deferral makes it not a punt. Step 5f.2 has a clear "when": after the 4 in-flight PRs cascade-merge, then 5f.2 lands clean. 4. Why call out infrastructure ≠ ship-%? The §47-§48 cascade taught the same lesson — "11 SHIP-007 cascade PRs landed but no ship-% movement." Operator-facing ship-% is the LIVE check. 5. Why is FALSIFY-006 LIVE the load-bearing claim? init_loss(step=0) ≤ 6.0 vs from_scratch_loss(step=0) ≥ 9.5 proves end-to-end correctness in one number. No other falsifier can substitute. Plain ship-% update: - MODEL-1: unchanged at 91% (SHIP-007 cascade infrastructure track) - MODEL-2: unchanged at 57% — first ship-% movement gated on §50.4 step 5g (LIVE 500-step fine-tune producing val_loss < 9.38) Spec amendment cadence: §41 → §42 → §43 → §44 → §45 → §46 → §47 → §48 → §49 → §50 → §51. Eleven amendments since 2026-05-03. Same-day spec hygiene rather than letting the cascade-complete state remain implicit. Refs: - SPEC-SHIP-TWO-001 §50 — architecture-coupling finding (PR #1472, MERGED) - PR #1473 — apr-pretrain-arch-polymorphic-v1 contract (in flight) - PR #1474 — qwen2_0_5b tie_word_embeddings fix (in flight) - PR #1475 — build_transformer_config polymorphic dispatch (in flight) - PR #1476 — preflight_tokenizer_vocab_matches_target (MERGED) - PR #1478 — GQA-7:1 forward-pass smoke test (MERGED) - PR #1479 — validate_pretrain_init_arch_compatible (in flight) - feedback_no_guessing.md Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…1481) Adds the read-half of `apr pretrain --init` weight load: a thin wrapper over `aprender::format::converter::load_model_tensors` that returns a `BTreeMap<String, (Vec<f32>, Vec<usize>)>` of tensor blobs keyed by HF naming convention. Per `apr-pretrain-arch-polymorphic-v1` §init_load_semantics (PR #1473): "Loader is REUSED, not reimplemented." This function does not duplicate APR parsing — it forwards to the same machinery `apr export` and `apr inspect` use. Discharges from `apr-pretrain-arch-polymorphic-v1`: - §init_load_semantics invariant (loader reuse): satisfied - FALSIFY-006 (init_loss < 6.0) at READ-COMPILE-BIND level Step 5f decomposition: - 5f.1 (PR #1479): encoder/decoder family validator (~30 LOC) - 5f.2 (this PR): APR file open + tensor read (~30 LOC + 2 tests) - 5f.3 (next): populate trainer parameters from BTreeMap (~50 LOC) - 5g (operator): LIVE 500-step fine-tune → DISCHARGES MODEL-2 ship-% Step 5f.2 is intentionally narrow — it only does the READ. Population into trainer parameter slots (5f.3) reconciles HF naming convention (e.g., `model.embed_tokens.weight`) against the trainer's internal parameter naming. That's a separate concern with its own falsifier. What this PR adds: 1. `pub fn load_init_tensors_from_apr(path) -> Result<BTreeMap<...>>` at pretrain_real.rs:35 (~25 LOC including doc comment) 2. 2 unit tests in `pretrain_real::tests`: - load_init_tensors_missing_file_errors_with_falsifier_id (FALSIFY-006 fail-fast path; asserts error message contains falsifier id + offending path for operator-experience) - load_init_tensors_signature_compile_bind (drift-prevention: catches a future signature change that would break step 5f.3's BTreeMap consumer) Test results (cargo test -p aprender-train --lib train::pretrain_real::tests::load_init_tensors): 2 passed; 0 failed; 0 ignored Five Whys: 1. Why decompose step 5f.2 to JUST the read? Single-piece flow. Read → Validate → Populate are three distinct concerns. Step 5f.1 did validation (#1479); 5f.2 does read; 5f.3 will do populate. Each PR has one falsifier discharge story. 2. Why use load_model_tensors and not write a new parser? The contract pins "Loader is reused, not reimplemented." Writing a new parser would create a parallel format-decoder that drifts from the canonical one. The same lesson as the LAYOUT-001/002 hits — parallel format code paths produce silent format-drift bugs. 3. Why return BTreeMap<String, (Vec<f32>, Vec<usize>)> rather than a trainer-parameter-shaped struct? Decoupling: the read shouldn't know about TransformerTrainer's internal parameter names. Step 5f.3's job is to map HF names → trainer slots; if 5f.2 baked that mapping in, every change to TransformerTrainer would break the read. 4. Why include the signature-compile-bind test? It's a compile-time check that drives step 5f.3's expectations. If a future refactor changes the return type (e.g., from BTreeMap to HashMap, or from Vec<usize> to Box<[usize]>), step 5f.3's consumer code stops compiling — caught here, not at the integration point. 5. Why is FALSIFY-006 NOT yet at PARTIAL_ALGORITHM_LEVEL after this PR? Because step 5f.2 only does the read; FALSIFY-006 requires the LIVE init_loss < 6.0 check, which needs steps 5f.3 + 5g. This PR moves FALSIFY-006 from UNBOUND → READ-COMPILE-BIND, a sub-level of PARTIAL_ALGORITHM_LEVEL. Full PARTIAL discharge happens at 5f.3 when the populate step exists. Plain ship-% update: - MODEL-1: unchanged at 91% (SHIP-007 cascade infrastructure track) - MODEL-2: unchanged at 57% — first ship-% movement gated on §50.4 step 5g (LIVE 500-step fine-tune producing val_loss < 9.38) Refs: - SPEC-SHIP-TWO-001 §50, §51 — MODEL-2 architecture-coupling + cascade snapshot (PR #1472, #1480 MERGED) - PR #1473 — apr-pretrain-arch-polymorphic-v1 contract (in flight) - PR #1474 — qwen2_0_5b tie_word_embeddings fix (MERGED) - PR #1475 — build_transformer_config polymorphic dispatch (in flight) - PR #1476 — preflight_tokenizer_vocab_matches_target (MERGED) - PR #1478 — GQA-7:1 forward-pass smoke test (MERGED) - PR #1479 — validate_pretrain_init_arch_compatible (in flight) - feedback_no_guessing.md - feedback_falsifier_first_cascade_pattern.md (this turn's pattern) Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

noahgift enabled auto-merge (squash) May 4, 2026 16:51

noahgift merged commit 27e530d into main May 4, 2026
11 checks passed

noahgift deleted the spec/ship-two-models-v2-96-cascade-complete branch May 4, 2026 17:17

noahgift mentioned this pull request May 4, 2026

feat(aprender-train): load_init_tensors_from_apr — §50.4 step 5f.2 #1481

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

spec(ship-two-models): v2.95.0 → v2.96.0 — §51 §50.4 cascade snapshot (7/8 falsifiers bound)#1480

spec(ship-two-models): v2.95.0 → v2.96.0 — §51 §50.4 cascade snapshot (7/8 falsifiers bound)#1480
noahgift merged 1 commit into
mainfrom
spec/ship-two-models-v2-96-cascade-complete

noahgift commented May 4, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

noahgift commented May 4, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant