feat(pretrain): INV-INIT-ARCH-MATCH-001 — fail-fast on arch-mismatched --init (§86.6) #1760

Merged
noahgift merged 10 commits into main from feat/init-arch-match-invariant on May 18, 2026
@noahgift

Summary

Closes the SPEC §86.6 follow-up: catches the §86 silent-failure pattern at the gate before training silently falls back to random init.

The §86 case this catches

P2-G v1 was dispatched to resume P2-E ep49 for 10,000 more steps. Init eval at step 0 produced val_loss = 8.60 — 1.86× P2-E ep49's recorded 4.62. Silent failure: --init loaded random weights. Root cause: metadata.architecture = "LlamaForCausalLM" (the §82 P0-H fallback) drove the trainer to build Llama-family parameter names; the APR's Qwen2-style tensor names didn't match; populate fell back to random init. Detection latency before this PR: 1 epoch (~55s on RTX 4090), and only if the operator notices that the init eval val_loss disagrees with the init checkpoint's recorded val_loss.

After this PR: detection latency = 0 (gated at load time, before training starts).
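
The manual check that caught the failure before this PR amounts to comparing the init eval loss against the loss recorded for the checkpoint. A minimal sketch of that comparison follows; the function names and the `max_ratio` threshold are hypothetical and not part of aprender-train:

```rust
/// Ratio of the loss measured at step 0 to the loss recorded for the
/// checkpoint passed via --init. A faithful resume should sit near 1.0.
fn init_loss_ratio(init_val_loss: f64, recorded_val_loss: f64) -> f64 {
    init_val_loss / recorded_val_loss
}

/// Hypothetical operator-side heuristic: flag the run when the ratio
/// exceeds `max_ratio` (the threshold is an assumption, not a spec value).
fn init_eval_is_suspicious(init_val_loss: f64, recorded_val_loss: f64, max_ratio: f64) -> bool {
    init_loss_ratio(init_val_loss, recorded_val_loss) > max_ratio
}
```

With the §86 numbers, `init_loss_ratio(8.60, 4.62)` is roughly 1.86, which any reasonable threshold would flag. The gate in this PR removes the need for this after-the-fact check in the mismatched-architecture case.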

What this adds

Three public functions in aprender-train::train::pretrain_real:

  • family_from_tensor_names(names) → &'static str — lightweight name-only family inference (qwen3 / qwen2 / llama / mamba / rwkv / gpt-neox / opt / bert / gpt2 / unknown). Mirrors the heavyweight infer_architecture_from_names in aprender-core::format::converter::tokenizer_loader.
  • normalize_metadata_arch_family(arch) → Option<&'static str> — maps all three forms of the metadata architecture field (HF class names like Qwen2ForCausalLM, family slugs like qwen2, capitalised legacy like Qwen2) to a canonical family slug.
  • validate_init_arch_matches_tensor_evidence(metadata_arch, &tensors) — the actual gate. Errors with FALSIFY-INIT-ARCH-MATCH-001 naming both the claimed and inferred families plus an inline apr stamp salvage recipe (PR #1757).

Wired into build_shared_trainer_with_init between load_init_tensors_from_apr and populate_trainer_from_init_tensors. Reads the raw metadata.architecture string via a small helper — TransformerConfig::hf_architecture is None for pre-P0-K APRs (the §86 case), so the cross-check needs the raw field.
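
To make the two helpers concrete, here is a simplified sketch covering only three families. The function names are hypothetical stand-ins, and the matching rules are assumptions chosen for illustration (HF-style Qwen2 checkpoints carry q/k/v projection biases, Qwen3 adds q_norm/k_norm); the actual rules in aprender-train may differ:

```rust
/// Hypothetical, simplified name-only family inference (subset of families).
fn sketch_family_from_names<'a>(names: impl IntoIterator<Item = &'a str>) -> &'static str {
    let mut has_hf_layers = false;
    let mut has_qkv_bias = false;
    let mut has_qk_norm = false;
    for name in names {
        if name.starts_with("model.layers.") {
            has_hf_layers = true;
        }
        if name.ends_with("self_attn.q_proj.bias") {
            has_qkv_bias = true;
        }
        if name.ends_with("self_attn.q_norm.weight") {
            has_qk_norm = true;
        }
    }
    match (has_hf_layers, has_qk_norm, has_qkv_bias) {
        // No recognisable HF-style layer names (e.g. GGUF blk.*): no verdict.
        (false, _, _) => "unknown",
        (true, true, _) => "qwen3",
        (true, false, true) => "qwen2",
        (true, false, false) => "llama",
    }
}

/// Hypothetical normalizer: accepts an HF class name, a family slug, or a
/// capitalised legacy name and returns a canonical slug, or None when the
/// string cannot be mapped (the caller then treats it as "no claim").
fn sketch_normalize_arch(arch: &str) -> Option<&'static str> {
    let lower = arch.trim().to_ascii_lowercase();
    let stem = lower.strip_suffix("forcausallm").unwrap_or(lower.as_str());
    match stem {
        "qwen3" => Some("qwen3"),
        "qwen2" => Some("qwen2"),
        "llama" => Some("llama"),
        _ => None,
    }
}
```

Returning `&'static str` slugs from both helpers keeps the later comparison a plain string equality with no allocation.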

Three skip-the-check fallback cases (no false-positives)

  1. No metadata claim (metadata.architecture absent) → allow
  2. Unmappable claim (e.g. WeirdNovelArch) → allow (novel arch isn't §86)
  3. Tensor inference returns unknown (GGUF blk.* names) → trust metadata

Only fails when BOTH inferences produce concrete family slugs AND they differ.
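
The decision table above can be sketched as a single function over the two already-computed families. This is an illustration only: `sketch_check_families` is a hypothetical name, `claimed` is assumed to be the normalized metadata family (None for both the absent and the unmappable case), and the error wording is not the real message:

```rust
/// Hypothetical core of the gate: compares the normalized metadata claim
/// with the family inferred from tensor names.
fn sketch_check_families(claimed: Option<&str>, inferred: &str) -> Result<(), String> {
    match claimed {
        // Cases 1 and 2: no claim, or a claim that could not be normalized.
        None => Ok(()),
        // Case 3: tensor names are inconclusive, so the metadata is trusted.
        Some(_) if inferred == "unknown" => Ok(()),
        // Both sides are concrete and agree.
        Some(c) if c == inferred => Ok(()),
        // Both sides are concrete and differ: fail before training starts.
        Some(c) => Err(format!(
            "FALSIFY-INIT-ARCH-MATCH-001: metadata claims '{}' but tensor names imply '{}'",
            c, inferred
        )),
    }
}
```

Folding the first two skip cases into `None` keeps the gate itself free of string parsing; the normalizer decides what counts as a claim.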

Salvage recipe (in the error message)

apr stamp <pre-p0k.apr> --architecture qwen2 --hf-architecture Qwen2ForCausalLM \
                       -o <stamped.apr>
apr pretrain --init <stamped.apr> ...

Requires PR #1757 (apr stamp HF identity extension).

Test plan

  • 7 INV-INIT-ARCH-MATCH-001 tests:
    • rejects_llama_stamped_qwen2_tensors (canonical §86 case)
    • rejects_qwen2_stamped_llama_tensors (inverse)
    • accepts_matching_qwen2 (no false-positive)
    • accepts_matching_llama (no false-positive)
    • skips_when_metadata_absent
    • skips_unmappable_metadata
    • trusts_metadata_when_tensors_unknown
  • family_from_tensor_names_distinguishes_qwen2_from_llama (helper)
  • normalize_metadata_arch_family_handles_three_forms (normalizer)
  • All 9 new tests pass
  • 7,595 existing aprender-train lib tests still pass
  • 3 pre-existing prune::snapshot_tests failures (insta-snapshot drift in main) NOT caused by this PR

Discharges

  • SPEC §86.6 follow-up (the "Recommended follow-up — new INV-INIT-ARCH-MATCH-001 invariant" section)
  • INV-INIT-ARCH-MATCH-001 invariant for contracts/apr-pretrain-from-init-v1.yaml — the contract amendment is a small separate follow-up PR

Refs

  • PR #1742 (PMAT-690 P0-K base — apr_convert + apr_import stamping)
  • PR #1757 (apr stamp HF identity extension — the salvage path this invariant points operators to)
  • PR #1758 (SPEC §86 amendment — context this invariant operationalizes)
  • evidence/p2g-2026-05-17/section-86-draft.md
  • memory/feedback_upstream_metadata_masquerade.md (methodology #33)

🤖 Generated with Claude Code

…d --init (SPEC §86.6)

Catches the §86 silent-failure pattern at the gate: when an APR's
metadata `architecture` claim contradicts what its tensor names imply,
`apr pretrain --init` exits non-zero with a clear naming-both-claims
error and an inline `apr stamp` salvage recipe.

## Background — the §86 case this catches

P2-G v1 was dispatched to resume P2-E ep49 for 10,000 more steps. The
init eval at step 0 produced val_loss = 8.60 — 1.86× P2-E ep49's
recorded 4.62. Silent failure: `--init` loaded random weights instead
of the trained checkpoint. Root cause walk-through:

1. `read_apr_architecture` parses `metadata.architecture = "LlamaForCausalLM"`
   (the §82 P0-H fallback when init_arch.hf_architecture is None).
2. `transformer_config_from_apr_metadata` builds a Llama-family
   TransformerConfig (dimensions correct, family discriminator wrong).
3. `populate_trainer_from_init_tensors` walks `trainer.named_parameters()` —
   produces Llama-style names — and looks them up in the APR tensor map
   which has Qwen2-style names. Mismatch → silent random-init fallback.
4. Training begins at random-init magnitude (val_loss ≈ 8.60).

This invariant catches step 1's wrong claim BEFORE step 3 silently
falls through.
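
The mechanism in step 3 can be illustrated with a toy populate loop. This is a sketch under stated assumptions, not the actual `populate_trainer_from_init_tensors`: the function name is hypothetical, and any tensor names passed to it are placeholders rather than the real Llama or Qwen2 parameter names:

```rust
use std::collections::HashMap;

/// Hypothetical populate step: copies init tensors whose names match
/// trainer parameters and returns the parameter names that found no match.
fn sketch_populate(
    params: &mut HashMap<String, Vec<f32>>,
    init: &HashMap<String, Vec<f32>>,
) -> Vec<String> {
    let mut missing = Vec::new();
    for (name, value) in params.iter_mut() {
        match init.get(name) {
            Some(src) => *value = src.clone(),
            // No match: the parameter silently keeps its random-init value.
            None => missing.push(name.clone()),
        }
    }
    missing.sort();
    missing
}
```

If the caller ignores the returned list, a complete naming mismatch looks identical to a successful load, which is the silent failure described above.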

## What this adds

Three new public functions in `aprender-train::train::pretrain_real`:

- `family_from_tensor_names(names: impl IntoIterator<Item=&str>)`
  → `&'static str` — lightweight tensor-name-only family inference
  (no data needed). Returns one of qwen3 / qwen2 / llama / mamba /
  rwkv / gpt-neox / opt / bert / gpt2 / unknown. Mirrors the
  heavyweight `infer_architecture_from_names` in
  aprender-core::format::converter::tokenizer_loader.
- `normalize_metadata_arch_family(arch: &str)` → `Option<&'static str>`
  — maps all three forms of the metadata `architecture` field to a
  canonical family slug: HF class names ("Qwen2ForCausalLM"), family
  slugs ("qwen2"), and capitalised legacy ("Qwen2"). Returns None
  for "unknown" / unmappable strings — caller treats as "no claim".
- `validate_init_arch_matches_tensor_evidence(metadata_arch, &tensors)`
  → `Result<(), String>` — the actual invariant gate. Errors with
  `FALSIFY-INIT-ARCH-MATCH-001` naming both the claimed and inferred
  families, plus an inline `apr stamp` recipe (PR #1757) for §86 salvage.

Wired into `build_shared_trainer_with_init` between `load_init_tensors_from_apr`
and `populate_trainer_from_init_tensors`. Reads the raw metadata
`architecture` string via a new small helper (the `TransformerConfig`'s
`hf_architecture` field is None for pre-P0-K APRs — the §86 case — so
the cross-check needs the raw string field).

## Three skip-the-check fallback cases (no false-positives)

1. **No metadata claim** (metadata.architecture absent): nothing to
   contradict, allow.
2. **Unmappable claim** (e.g. "WeirdNovelArch"): novel arch is not §86,
   allow.
3. **Tensor inference returns "unknown"** (GGUF blk.* names can't
   disambiguate): trust the metadata, allow.

The gate fails only when BOTH inferences produce concrete family slugs AND they differ.

## Tests

- 7 new INV-INIT-ARCH-MATCH-001 tests in `pretrain_real::tests`:
  - `inv_init_arch_match_001_rejects_llama_stamped_qwen2_tensors` —
    canonical §86 case, must fail with falsifier ID + salvage recipe
  - `inv_init_arch_match_001_rejects_qwen2_stamped_llama_tensors` —
    inverse §86 case, must fail
  - `inv_init_arch_match_001_accepts_matching_qwen2` and
    `inv_init_arch_match_001_accepts_matching_llama` — no
    false-positive on correctly-stamped APRs
  - `inv_init_arch_match_001_skips_when_metadata_absent` — None metadata
  - `inv_init_arch_match_001_skips_unmappable_metadata` — novel arch
  - `inv_init_arch_match_001_trusts_metadata_when_tensors_unknown` —
    GGUF blk.* case
- 1 helper test: `family_from_tensor_names_distinguishes_qwen2_from_llama`
- 1 normalizer test: `normalize_metadata_arch_family_handles_three_forms`

All 9 new tests pass. 7,595 existing aprender-train lib tests still pass
(the 3 pre-existing prune::snapshot_tests failures are insta-snapshot
drift in main, unrelated to this PR).

## Discharges

- §86.6 SPEC follow-up (forthcoming via #1758 stack)
- INV-INIT-ARCH-MATCH-001 invariant for `contracts/apr-pretrain-from-init-v1.yaml`
  (contract amendment is a separate small follow-up PR)

## Refs

- PR #1742 (PMAT-690 P0-K base — apr_convert + apr_import stamping)
- PR #1757 (apr stamp HF identity extension — the salvage path this
  invariant points operators to)
- PR #1758 (SPEC §86 amendment — context this invariant operationalizes)
- evidence/p2g-2026-05-17/section-86-draft.md
- memory/feedback_upstream_metadata_masquerade.md (methodology #33)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
noahgift added a commit that referenced this pull request May 17, 2026
…H-MATCH-001 (SPEC §86.6 closure) (#1761)

Codifies the INV-INIT-ARCH-MATCH-001 invariant authored as runtime code
in PR #1760 (`validate_init_arch_matches_tensor_evidence` in
aprender-train::train::pretrain_real). Adds:

- FALSIFY-INIT-ARCH-MATCH-001: integration falsifier bound to the
  unit-test family `cargo test -p aprender-train --lib
  inv_init_arch_match_001` (7 tests covering: canonical §86 reject,
  inverse reject, matching qwen2 accept, matching llama accept, None
  metadata skip, unmappable metadata skip, GGUF-unknown tensor skip).
- INV-INIT-ARCH-MATCH-001 proof_obligation: safety invariant — when
  both metadata.architecture and tensor-name-inferred family resolve
  to concrete distinct slugs, gate MUST fail-fast before any training
  step. No false-positive when either side returns "unknown".

## Salvage path

The error message includes an inline `apr stamp` recipe (PR #1757):

```
apr stamp <pre-p0k.apr> --architecture qwen2 --hf-architecture Qwen2ForCausalLM \
                       -o <stamped.apr>
apr pretrain --init <stamped.apr> ...
```

## Refs

- PR #1742 (PMAT-690 P0-K base — producer-side stamping)
- PR #1750 (P3-A `apr inspect --quality` — surfaces hf_identity=0/20 pre-stamp)
- PR #1754 (SPEC §85 P2-E findings — context)
- PR #1757 (apr stamp HF identity extension — salvage path)
- PR #1758 (SPEC §86 amendment — defect specification this contract closes)
- PR #1760 (INV-INIT-ARCH-MATCH-001 runtime implementation)
- memory/feedback_upstream_metadata_masquerade.md (methodology #33)

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
noahgift added a commit that referenced this pull request May 18, 2026
…t (P3-C prep) (#1764)

Author the HuggingFace model card for `paiml/albor-370m-v1` and the
publish-readiness pre-flight script. Per SPEC §88: this model is
shipped as a stack-existence-proof, not a production code-completion
model. Both artifacts make that framing explicit so HF Hub users
calibrate expectations correctly.

## docs/model-cards/albor-370m-v1.md (255 lines)

Standard HF model card with model-index frontmatter:

- YAML metadata: Apache-2.0, code/python/stack-existence-proof tags,
  Qwen2.5-Coder-0.5B-Instruct base, codeparrot + the-stack-dedup
  datasets, val_loss=4.6227 / val_perplexity=101.78 metrics.
- §88 framing section spelling out the stack-existence-proof purpose.
- Training procedure table (architecture, optimizer, LR schedule,
  hardware, wall time, throughput — all from the §85 P2-E run).
- Trajectory table (every 5 epochs from 7.43 → 4.62).
- Intended uses (✅ stack demos, infra validation, tokenization
  round-trip, quantization research) vs NOT-recommended uses
  (production code-LM, zero-shot reasoning, long-context, HumanEval
  submission).
- Limitations (compute-bounded, plateau evidence, init lineage,
  val drift).
- Training data table (sources, sizes, licenses, role).
- How-to-use code snippets (apr CLI, Rust direct load, format export).
- Reproduce-the-run shell example using the exact §85 P2-E recipe.
- Citation, license/provenance, acknowledgments.
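
The two metrics in the YAML metadata are related by a one-line formula if the validation loss is a mean natural-log cross-entropy, which is an assumption here rather than something the card description states. A minimal sketch:

```rust
/// Perplexity as the exponential of a mean cross-entropy loss in nats.
fn perplexity(val_loss: f64) -> f64 {
    val_loss.exp()
}
```

Under that assumption, `perplexity(4.6227)` lands within about 0.02 of the reported 101.78, so the two figures are consistent with each other.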

## scripts/publish/albor-370m-publish-readiness.sh (182 lines)

7-gate pre-flight checklist. GO / NO-GO verdict before invoking
`apr publish`. Gates:

1. `apr validate` exits 0
2. `apr inspect --quality` ≥ 90 (P3-A scorer; surfaces §86 salvage
   recipe inline if hf_identity < 20 or provenance < 25)
3. `apr qa --json` verdict = GO (8 gates)
4. Model card present + has HF YAML frontmatter
5. HF_TOKEN set
6. Smoke `apr run` produces text-like output
7. GGUF Q4_K + SafeTensors export round-trip both succeed

Exit 0 = ready to publish. Exit 1 = NO-GO with explicit blocker list.
Bashrs-validated (1 SEC011 false-positive on multi-condition rm -rf
guard; functionally safe).
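
The GO / NO-GO aggregation can be modelled in a few lines. The real artifact is a shell script; the Rust sketch below only mirrors the described exit-code contract, and the type and function names are hypothetical:

```rust
/// Outcome of one pre-flight gate.
struct GateResult {
    name: &'static str,
    passed: bool,
}

/// Returns the process exit code (0 = GO, 1 = NO-GO) together with the
/// names of the failing gates, in the order they were checked.
fn verdict(gates: &[GateResult]) -> (i32, Vec<&'static str>) {
    let blockers: Vec<&'static str> = gates
        .iter()
        .filter(|g| !g.passed)
        .map(|g| g.name)
        .collect();
    let exit_code = if blockers.is_empty() { 0 } else { 1 };
    (exit_code, blockers)
}
```

Collecting every failing gate instead of stopping at the first one matches the "explicit blocker list" behaviour: the operator sees all blockers in a single run.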

## What this PR does NOT do

- Does NOT invoke `apr publish` (external action; requires user OK)
- Does NOT touch any APR files (read-only checks)
- Does NOT modify the §85 P2-E ep49 checkpoint (operator runs
  `apr stamp` via the §86.4 salvage recipe separately)

## Operator workflow (post-PR landing)

```bash
# 1. Stamp the pre-P0-K P2-E ep49 checkpoint to bring hf_identity up
apr stamp /mnt/nvme-raid0/runs/model-2-p2e-tuned-hp-20260517/ckpt/epoch-049.apr \
    --architecture qwen2 \
    --hf-architecture Qwen2ForCausalLM \
    --hf-model-type qwen2 \
    --license Apache-2.0 \
    --data-source "huggingface.co/Qwen/Qwen2.5-Coder-0.5B-Instruct + bigcode/the-stack-dedup + codeparrot/codeparrot-clean" \
    --data-license "Apache-2.0 / permissive-aggregate" \
    -o /tmp/albor-370m-v1.apr

# 2. Run the readiness check
bash scripts/publish/albor-370m-publish-readiness.sh /tmp/albor-370m-v1.apr
# Expected output: "VERDICT: GO" (or NO-GO with explicit blocker list)

# 3. Publish (still requires explicit user invocation)
apr publish paiml/albor-370m-v1 --formats apr,safetensors,gguf \
    --model-card docs/model-cards/albor-370m-v1.md
```

## Refs

- PR #1742 (PMAT-690 P0-K — upstream stamping)
- PR #1750 (P3-A `apr inspect --quality` — gate 2)
- PR #1754 (SPEC §84+§85+§86+§87+§88 stack — context)
- PR #1757 (apr stamp HF identity extension — §86 salvage)
- PR #1760 (INV-INIT-ARCH-MATCH-001 — validation chain)
- docs/specifications/aprender-train/ship-model-2-spec.md §88
- docs/specifications/aprender-train/albor-370m-roadmap.md §4 P3-C

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
@noahgift noahgift merged commit 2d42992 into main May 18, 2026
10 checks passed
@noahgift noahgift deleted the feat/init-arch-match-invariant branch May 18, 2026 09:27