docs(spec): §86 — apr pretrain --init mismatch defect + #1757 salvage workflow by noahgift · Pull Request #1758 · paiml/aprender

noahgift · 2026-05-17T14:45:29Z

Summary

Documents the §86 finding surfaced by P2-G v1's failed dispatch: pre-P0-K APR checkpoints are silently non-resumable via apr pretrain --init because the §82 P0-H "LlamaForCausalLM" fallback stamp collides with Qwen2 tensor names. The symptom is val_loss = 8.60 at init eval instead of the expected ≈ 4.62, i.e. a starting loss 1.86× too high.
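The symptom is cheap to check mechanically at step 0. A minimal sketch, assuming the operator knows the validation loss recorded for the checkpoint being resumed; `init_eval_matches_checkpoint` and the 5% tolerance are hypothetical and not part of apr:

```rust
/// Hypothetical helper (not part of apr): true when the step-0 validation
/// loss is within `rel_tol` (relative) of the loss recorded for the
/// checkpoint that `--init` was supposed to load.
fn init_eval_matches_checkpoint(observed: f64, recorded: f64, rel_tol: f64) -> bool {
    ((observed - recorded) / recorded).abs() <= rel_tol
}

fn main() {
    let recorded = 4.62; // P2-E ep49 val_loss, from this PR
    let observed = 8.60; // P2-G v1 init eval at step 0, from this PR
    println!("ratio = {:.2}", observed / recorded);
    if !init_eval_matches_checkpoint(observed, recorded, 0.05) {
        eprintln!("init eval does not match checkpoint: weights were probably not loaded");
    }
}
```

A check like this only detects the problem after one eval pass; it does not explain the cause.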

What §86 covers

1. Root cause walk-through: read_apr_architecture → wrong family discriminator → populate_trainer_from_init_tensors parameter-name mismatch → silent random-init fallback
2. Implications: all ~125 GB of pre-#1742 trained checkpoints (50 P2-E epochs) are non-resumable
3. Three workarounds in priority order:
   - Re-import (blocked on HF safetensors locally)
   - Restamp in-place ✅ SHIPPED via PR #1757
   - Treat as final (what P2-G v2 takes; currently in flight)
4. Operator recipe: 3-line apr stamp invocation that brings hf_identity sub-score from 0/20 → 20/20
5. Failure-mode classification: Class 4 Silent Incorrect Behavior, detection latency 1 epoch
6. Recommended follow-up: INV-INIT-ARCH-MATCH-001 invariant on apr-pretrain-from-init-v1 contract (defer to follow-up PR)

Plus the evidence/p2g-2026-05-17/section-86-draft.md source — the raw analysis that this spec section formalizes.
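The root-cause chain listed above ends in a by-name lookup whose misses go unreported. The following is a minimal sketch of that failure shape, not the real `populate_trainer_from_init_tensors`: the function names, the `HashMap` representation, and the `family_a`/`family_b` tensor names are all placeholders chosen for illustration.

```rust
use std::collections::HashMap;

/// Copies weights by parameter name and returns the number of hits.
/// Names missing from `checkpoint` keep their existing (random-init) values.
fn populate_by_name(
    params: &mut HashMap<String, Vec<f32>>,
    checkpoint: &HashMap<String, Vec<f32>>,
) -> usize {
    let mut loaded = 0;
    for (name, value) in params.iter_mut() {
        if let Some(src) = checkpoint.get(name) {
            *value = src.clone();
            loaded += 1;
        }
        // A miss is not reported here: this is the silent fallback.
    }
    loaded
}

/// Strict wrapper: any miss becomes an error instead of a silent fallback.
fn populate_strict(
    params: &mut HashMap<String, Vec<f32>>,
    checkpoint: &HashMap<String, Vec<f32>>,
) -> Result<(), String> {
    let total = params.len();
    let loaded = populate_by_name(params, checkpoint);
    if loaded == total {
        Ok(())
    } else {
        Err(format!("only {}/{} parameters found in checkpoint", loaded, total))
    }
}

fn main() {
    // Placeholder names only; the real Llama-style and Qwen2-style names differ.
    let mut params = HashMap::from([("family_a.layer0.weight".to_string(), vec![0.1_f32])]);
    let checkpoint = HashMap::from([("family_b.layer0.weight".to_string(), vec![0.9_f32])]);
    assert_eq!(populate_by_name(&mut params, &checkpoint), 0); // nothing loaded, no error
    assert!(populate_strict(&mut params, &checkpoint).is_err());
}
```

Counting hits makes a total miss visible at load time rather than one epoch later at init eval.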

Stacked on #1754

Base: feat/spec-85-p2e-findings (PR #1754 — SPEC §85 P2-E findings). §86 depends on §85's P2-E context.

Will auto-rebase to main once #1754 lands.

Test plan

- `grep -n "^## §86" docs/specifications/aprender-train/ship-model-2-spec.md`: §86 section present at line 2310
- `wc -l evidence/p2g-2026-05-17/section-86-draft.md`: 58 lines, full draft preserved
- Cross-references to PR #1742, #1750, #1754, #1757 all valid

Refs

PR #1742 (PMAT-690 P0-K base)
PR #1750 (P3-A apr inspect --quality — the diagnostic that surfaces §86)
PR #1754 (SPEC §85 P2-E findings — context)
PR #1757 (apr stamp HF identity extension; the second workaround above)
evidence/p2g-2026-05-17/section-86-draft.md
memory/feedback_upstream_metadata_masquerade.md (methodology #33)

🤖 Generated with Claude Code

…matched APRs; PR #1757 ships in-place stamp salvage

P2-G v1 dispatch surfaced a SECOND symptom of the §81-§84 cascade root cause: pre-P0-K APR checkpoints (architecture="LlamaForCausalLM" P0-H fallback + Qwen2-tensor shape) are silently non-resumable via `apr pretrain --init`. The init eval at step 0 produced val_loss=8.60 instead of P2-E ep49's recorded 4.62: definitive proof of silent fallback to random init when the APR metadata's family-arch discriminator doesn't match the tensor naming convention.

## What §86 covers

1. Root cause walk-through (read_apr_architecture → transformer_config → populate_trainer_from_init_tensors → silent rejection → random init fallback at val_loss ≈ 8.60).
2. Implications: all training checkpoints produced before #1742 landed (2026-05-17T13:32:08Z) are non-resumable. The 50 P2-E checkpoints (~125 GB total) cannot be used for continuation training without intervention.
3. Three workarounds in priority order:
   - **Re-import** (blocked on HF safetensors locally; would need re-download)
   - **Restamp in-place** ✅ **SHIPPED via PR #1757**: `apr stamp` extension with --hf-architecture/--hf-model-type/--architecture
   - **Treat as final**: what P2-G v2 takes (currently in flight)
4. Operator recipe for the §86 salvage (3-line shell example).
5. Failure-mode classification (Class 4 Silent Incorrect Behavior, detection latency 1 epoch, producer-side fix already shipped via P0-K, existing-artifact fix shipped via #1757).
6. Recommended follow-up: INV-INIT-ARCH-MATCH-001 invariant on apr-pretrain-from-init-v1 contract, which would catch the §86 case at the gate instead of at init-eval surface. Defer to follow-up PR.

## Stacked on PR #1754 (SPEC §85)

Base: `feat/spec-85-p2e-findings`. The §86 amendment depends on §85 context (the P2-E run that surfaced §86). Will auto-rebase to main after #1754 lands.

## Refs

- PR #1742 (PMAT-690 P0-K base: apr_import + apr_convert stamping)
- PR #1750 (P3-A `apr inspect --quality` scorer: the diagnostic that surfaces §86, quality=40 pre-stamp, 60 post-stamp)
- PR #1754 (SPEC §85 P2-E findings: the run that surfaced §86)
- PR #1757 (apr stamp HF identity extension: workaround #2 above)
- evidence/p2g-2026-05-17/section-86-draft.md
- memory/feedback_upstream_metadata_masquerade.md (methodology #33)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…H-MATCH-001 (SPEC §86.6 closure) (#1761)

Codifies the INV-INIT-ARCH-MATCH-001 invariant authored as runtime code in PR #1760 (`validate_init_arch_matches_tensor_evidence` in aprender-train::train::pretrain_real).

Adds:

- FALSIFY-INIT-ARCH-MATCH-001: integration falsifier bound to the unit-test family `cargo test -p aprender-train --lib inv_init_arch_match_001` (7 tests covering: canonical §86 reject, inverse reject, matching qwen2 accept, matching llama accept, None metadata skip, unmappable metadata skip, GGUF-unknown tensor skip).
- INV-INIT-ARCH-MATCH-001 proof_obligation: safety invariant. When both metadata.architecture and tensor-name-inferred family resolve to concrete distinct slugs, gate MUST fail-fast before any training step. No false-positive when either side returns "unknown".

## Salvage path

The error message includes an inline `apr stamp` recipe (PR #1757):

```
apr stamp <pre-p0k.apr> --architecture qwen2 --hf-architecture Qwen2ForCausalLM \
    -o <stamped.apr>
apr pretrain --init <stamped.apr> ...
```

## Refs

- PR #1742 (PMAT-690 P0-K base: producer-side stamping)
- PR #1750 (P3-A `apr inspect --quality`: surfaces hf_identity=0/20 pre-stamp)
- PR #1754 (SPEC §85 P2-E findings: context)
- PR #1757 (apr stamp HF identity extension: salvage path)
- PR #1758 (SPEC §86 amendment: defect specification this contract closes)
- PR #1760 (INV-INIT-ARCH-MATCH-001 runtime implementation)
- memory/feedback_upstream_metadata_masquerade.md (methodology #33)

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
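The proof obligation above reduces to a small decision rule over two already-resolved family slugs. A minimal sketch, assuming the metadata claim arrives as `Option<&str>` (with `None` covering both absent and unmappable claims) and the tensor-side inference as a slug that may be "unknown"; `check_arch_match` and the exact error wording are hypothetical, and only the falsifier ID and the `apr stamp --architecture ... -o ...` flags come from the commit message:

```rust
/// Hypothetical decision rule: fail only when both sides are concrete and differ.
fn check_arch_match(claimed: Option<&str>, inferred: &str) -> Result<(), String> {
    match claimed {
        None => Ok(()),                             // no claim: nothing to contradict
        Some(_) if inferred == "unknown" => Ok(()), // tensors inconclusive: trust metadata
        Some(c) if c == inferred => Ok(()),         // both sides agree
        Some(c) => Err(format!(
            "FALSIFY-INIT-ARCH-MATCH-001: metadata claims '{}' but tensor names imply '{}'. \
             Salvage: apr stamp <pre-p0k.apr> --architecture {} -o <stamped.apr>",
            c, inferred, inferred
        )),
    }
}

fn main() {
    assert!(check_arch_match(Some("llama"), "qwen2").is_err()); // canonical §86 case
    assert!(check_arch_match(Some("qwen2"), "llama").is_err()); // inverse case
    assert!(check_arch_match(Some("qwen2"), "qwen2").is_ok());
    assert!(check_arch_match(None, "qwen2").is_ok());
    assert!(check_arch_match(Some("llama"), "unknown").is_ok());
}
```

Ordering the skip arms before the comparison is what keeps the gate free of false positives when either side is inconclusive.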

…d --init (SPEC §86.6) (#1760)

Catches the §86 silent-failure pattern at the gate: when an APR's metadata `architecture` claim contradicts what its tensor names imply, `apr pretrain --init` exits non-zero with a clear naming-both-claims error and an inline `apr stamp` salvage recipe.

## Background: the §86 case this catches

P2-G v1 was dispatched to resume P2-E ep49 for 10,000 more steps. The init eval at step 0 produced val_loss = 8.60, which is 1.86× P2-E ep49's recorded 4.62. Silent failure: `--init` loaded random weights instead of the trained checkpoint.

Root cause walk-through:

1. `read_apr_architecture` parses `metadata.architecture = "LlamaForCausalLM"` (the §82 P0-H fallback when init_arch.hf_architecture is None).
2. `transformer_config_from_apr_metadata` builds a Llama-family TransformerConfig (dimensions correct, family discriminator wrong).
3. `populate_trainer_from_init_tensors` walks `trainer.named_parameters()`, which produces Llama-style names, and looks them up in the APR tensor map, which has Qwen2-style names. Mismatch → silent random-init fallback.
4. Training begins at random-init magnitude (val_loss ≈ 8.60).

This invariant catches step 1's wrong claim BEFORE step 3 silently falls through.

## What this adds

Three new public functions in `aprender-train::train::pretrain_real`:

- `family_from_tensor_names(names: impl IntoIterator<Item=&str>)` → `&'static str`: lightweight tensor-name-only family inference (no data needed). Returns one of qwen3 / qwen2 / llama / mamba / rwkv / gpt-neox / opt / bert / gpt2 / unknown. Mirrors the heavyweight `infer_architecture_from_names` in aprender-core::format::converter::tokenizer_loader.
- `normalize_metadata_arch_family(arch: &str)` → `Option<&'static str>`: maps all three forms of the metadata `architecture` field to a canonical family slug: HF class names ("Qwen2ForCausalLM"), family slugs ("qwen2"), and capitalised legacy ("Qwen2"). Returns None for "unknown" / unmappable strings; caller treats as "no claim".
- `validate_init_arch_matches_tensor_evidence(metadata_arch, &tensors)` → `Result<(), String>`: the actual invariant gate. Errors with `FALSIFY-INIT-ARCH-MATCH-001` naming both the claimed and inferred families, plus an inline `apr stamp` recipe (PR #1757) for §86 salvage.

Wired into `build_shared_trainer_with_init` between `load_init_tensors_from_apr` and `populate_trainer_from_init_tensors`. Read the raw metadata `architecture` string via a new small helper (the `TransformerConfig`'s `hf_architecture` field is None for pre-P0-K APRs, which is the §86 case, so the cross-check needs the raw string field).

## Three skip-the-check fallback cases (no false-positives)

1. **No metadata claim** (metadata.architecture absent): nothing to contradict, allow.
2. **Unmappable claim** (e.g. "WeirdNovelArch"): novel arch is not §86, allow.
3. **Tensor inference returns "unknown"** (GGUF blk.* names can't disambiguate): trust the metadata, allow.

Only fail when BOTH inferences produce concrete family slugs AND they differ.

## Tests

- 7 new INV-INIT-ARCH-MATCH-001 tests in `pretrain_real::tests`:
  - `inv_init_arch_match_001_rejects_llama_stamped_qwen2_tensors`: canonical §86 case, must fail with falsifier ID + salvage recipe
  - `inv_init_arch_match_001_rejects_qwen2_stamped_llama_tensors`: inverse §86 case, must fail
  - `inv_init_arch_match_001_accepts_matching_qwen2/llama`: no false-positive on correctly-stamped APRs
  - `inv_init_arch_match_001_skips_when_metadata_absent`: None metadata
  - `inv_init_arch_match_001_skips_unmappable_metadata`: novel arch
  - `inv_init_arch_match_001_trusts_metadata_when_tensors_unknown`: GGUF blk.* case
- 1 helper test: `family_from_tensor_names_distinguishes_qwen2_from_llama`
- 1 normalizer test: `normalize_metadata_arch_family_handles_three_forms`

All 9 new tests pass. 7,595 existing aprender-train lib tests still pass (the 3 pre-existing prune::snapshot_tests failures are insta-snapshot drift in main, unrelated to this PR).

## Discharges

- §86.6 SPEC follow-up (forthcoming via #1758 stack)
- INV-INIT-ARCH-MATCH-001 invariant for `contracts/apr-pretrain-from-init-v1.yaml` (contract amendment is a separate small follow-up PR)

## Refs

- PR #1742 (PMAT-690 P0-K base: apr_convert + apr_import stamping)
- PR #1757 (apr stamp HF identity extension: the salvage path this invariant points operators to)
- PR #1758 (SPEC §86 amendment: context this invariant operationalizes)
- evidence/p2g-2026-05-17/section-86-draft.md
- memory/feedback_upstream_metadata_masquerade.md (methodology #33)

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

noahgift merged commit 7fcd4aa into feat/spec-85-p2e-findings May 17, 2026
1 check passed

noahgift deleted the feat/spec-86-init-mismatch branch May 17, 2026 14:45

This was referenced May 17, 2026

feat(pretrain): INV-INIT-ARCH-MATCH-001 — fail-fast on arch-mismatched --init (§86.6) #1760

Merged

contracts(apr-pretrain-from-init): v1.2.0 → v1.3.0 — FALSIFY-INIT-ARCH-MATCH-001 (§86.6 closure) #1761

Merged
