
feat(rosetta): FamilyRegistry::detect_from_config_str + register_alias #1562

Merged: noahgift merged 36 commits into main from feat/arch-demos-detector-upstream on May 14, 2026

noahgift commented May 7, 2026


Upstream contribution from apr-cookbook architecture-demos campaign. Lifts the cookbook's PMAT-309 cross-family detector primitive into aprender-core as a public API.

What's added

// Before: cookbook had its own discriminator dispatcher
let v = inference_arch_detector::detect("config.json");

// After: cookbook calls upstream
use aprender::format::FamilyRegistry;
let family = FamilyRegistry::detect_from_config_str(&body);

detect_from_config_str(body: &str) -> Option<&'static str>

Discriminator-field-based dispatch from raw config.json body to family name. Order-sensitive priority list:

  • Qwen3.5 (tie_word_embeddings + head_dim + qwen3_5) → before Qwen3
  • Qwen3 (head_dim + qwen3, NOT qwen3_5) → before Qwen2
  • Qwen2 (qwen2 + rope_theta)
  • Phi (qkv_proj_fused)
  • Gemma (query_pre_attn_scalar)
  • GPT-NeoX (use_parallel_residual)
  • OPT (do_layer_norm_before)
  • GPT-2 ("n_embd")
  • OpenELM (ffn_multipliers + num_query_heads)
  • DeepSeek (n_routed_experts)
  • Falcon-H1 (mamba_d_state + mamba_expand + falcon_h1)
  • RWKV-7 (time_mix_extra_dim)
  • MAMBA (state_size + conv_kernel, NOT num_attention_heads)
  • BERT (type_vocab_size)
  • Mistral (sliding_window + MistralForCausalLM) → before Llama
  • Whisper (WhisperForConditionalGeneration)
  • Moonshine (MoonshineForConditionalGeneration)
  • Llama (LlamaForCausalLM) — catch-all, checked LAST
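
Why the ordering matters can be shown with a minimal sketch. This is not the aprender-core implementation: the `Rule` struct, the `detect_sketch` function, plain substring matching, and the returned family-name strings are all assumptions made for illustration, and only five of the 18 rows are reproduced.

```rust
/// One row of a hypothetical priority table: every `required` substring must
/// appear in the config body and no `forbidden` substring may appear.
struct Rule {
    family: &'static str,
    required: &'static [&'static str],
    forbidden: &'static [&'static str],
}

/// A subset of the priority list above, kept in the same relative order.
/// The family-name strings are assumed, not taken from aprender-core.
const RULES: &[Rule] = &[
    Rule { family: "qwen3_5", required: &["tie_word_embeddings", "head_dim", "qwen3_5"], forbidden: &[] },
    Rule { family: "qwen3", required: &["head_dim", "qwen3"], forbidden: &["qwen3_5"] },
    Rule { family: "qwen2", required: &["qwen2", "rope_theta"], forbidden: &[] },
    Rule { family: "mistral", required: &["sliding_window", "MistralForCausalLM"], forbidden: &[] },
    Rule { family: "llama", required: &["LlamaForCausalLM"], forbidden: &[] },
];

/// Returns the first rule whose discriminators all match (first match wins),
/// so more specific families must sit above the catch-all.
fn detect_sketch(body: &str) -> Option<&'static str> {
    RULES
        .iter()
        .find(|r| {
            r.required.iter().all(|k| body.contains(*k))
                && !r.forbidden.iter().any(|k| body.contains(*k))
        })
        .map(|r| r.family)
}
```

Because the scan stops at the first hit, the same input always yields the same family, and a config that matches no row yields `None` instead of falling through to a default.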

Alias mechanism (unblocks 11+ blocked families)

let mut registry = build_default_registry();
registry.register_alias("codellama/*", "llama")?;
registry.register_alias("TinyLlama/*", "llama")?;
registry.register_alias("HuggingFaceTB/SmolLM-*", "llama")?;
// ...

let family = registry.resolve_alias("codellama/CodeLlama-7b-hf");
// Some("llama")

Unblocks the 11 alias entries from apr-cookbook's architecture-demos/manifest.yaml: codellama, dolphin, hermes, openchat, smollm, smollm2, tinyllama, vicuna, wizardcoder, yi, zephyr — all derive from llama or mistral and just needed pattern aliasing.
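
For readers who want the shape of the mechanism without opening the diff, here is a minimal sketch of pattern aliasing with parent validation. It assumes a standalone `AliasRegistry` type, a first-match-wins lookup, and a glob that supports at most one `*` per pattern; none of these details are confirmed by the PR, and the real `FamilyRegistry` may differ.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for the alias part of the registry.
struct AliasRegistry {
    families: BTreeSet<String>,
    /// (pattern, parent family) pairs, checked in registration order.
    aliases: Vec<(String, String)>,
}

impl AliasRegistry {
    fn new(families: &[&str]) -> Self {
        Self {
            families: families.iter().map(|f| f.to_string()).collect(),
            aliases: Vec::new(),
        }
    }

    /// Rejects aliases whose parent family is not registered.
    fn register_alias(&mut self, hf_pattern: &str, parent_family: &str) -> Result<(), String> {
        if !self.families.contains(parent_family) {
            return Err(format!("unknown parent family: {parent_family}"));
        }
        self.aliases.push((hf_pattern.to_string(), parent_family.to_string()));
        Ok(())
    }

    /// Returns the parent of the first pattern that matches the repo id.
    fn resolve_alias(&self, hf_repo: &str) -> Option<&str> {
        self.aliases
            .iter()
            .find(|(pattern, _)| glob_match(pattern, hf_repo))
            .map(|(_, parent)| parent.as_str())
    }
}

/// Minimal glob: the first `*` matches any run of characters; a pattern
/// without `*` must match exactly. Further `*` characters are treated literally.
fn glob_match(pattern: &str, text: &str) -> bool {
    match pattern.split_once('*') {
        Some((prefix, suffix)) => {
            text.len() >= prefix.len() + suffix.len()
                && text.starts_with(prefix)
                && text.ends_with(suffix)
        }
        None => pattern == text,
    }
}
```

Validating the parent at registration time means a typo in a parent name surfaces as an `Err` when the alias is added, not as a silent `None` at resolution time.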

Tests

  • 9 detector_tests — one per supported family + edge cases (priority order, unknown configs, determinism)
  • 5 alias_tests — glob matching, exact matching, parent-validation, count, resolution

All 14 pass; no regressions in existing aprender-core suite.

Why upstream

apr-cookbook's PMAT-309 inference_arch_detector recipe was effectively a reverse-engineered discriminator catalog living in cookbook code. Lifting it here:

  1. Gives the catalog an authoritative home (the contracts/model-families/ upstream)
  2. Lets the cookbook recipe become a thin demo of aprender::FamilyRegistry::detect_from_config_str instead of a parallel implementation
  3. Provides the alias mechanism the cookbook tracked as upstream backlog (25 status: blocked entries in the manifest)

Follow-up PR will update apr-cookbook to consume this API.

🤖 Generated with Claude Code

Lifts the cookbook's architecture-demos detector primitive upstream as a
public FamilyRegistry method, plus adds an HF-pattern alias mechanism
that unblocks derived models (codellama→llama, etc.) without requiring
new loader implementations.

detect_from_config_str(body: &str) -> Option<&'static str>:
  Discriminator-field-based dispatch from raw config.json body to family
  name. Order-sensitive: more-specific discriminators first (qwen3_5
  before qwen3 before qwen2; mistral before llama catch-all). Recognizes
  18 families (16 in-progress text + whisper + moonshine).

register_alias(hf_pattern, parent_family) -> Result<(), String>:
  Aliases an HF repo glob pattern to an existing parent family. e.g.
  registry.register_alias("codellama/*", "llama") lets codellama
  checkpoints dispatch through the llama loader. Errors if parent_family
  is not registered.

resolve_alias(hf_repo) -> Option<&str>:
  Resolves a concrete HF repo identifier through the registered aliases.

supported_families() -> Vec<&'static str>:
  Exposes the discriminator-dispatch list (18 families) for downstream
  consumers (e.g., apr-cookbook's architecture-demos detector recipe).

This implements the upstream side of apr-cookbook's PMAT-309 detector
recipe — the cookbook's reverse-engineered discriminator catalog now has
an authoritative home in aprender-core.

Tests: 9 detector_tests + 5 alias_tests = 14 new unit tests, all pass.
No regressions in existing aprender-core test suite.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

noahgift commented May 9, 2026


Cookbook side is consumer-ready

Just landed paiml/apr-cookbook#421 which ships the consumer-side composed pipeline that this PR's FamilyRegistry::resolve_alias + detect_from_config_str will replace once merged:

  • examples/inference/inference_arch_resolution_pipeline.rs — composes alias-resolver + detector into a single (hf_repo, body) → DetectedFamily recipe; 10 unit tests including all_alias_eligible_resolve_to_parent (the falsifier that all 16 alias-eligible blocked manifest entries map to a known parent)
  • contracts/inference-arch-resolution-pipeline-v1.yaml — Grade A 0.98 provable contract
  • lean/ProvableContracts/ArchitectureDemos/ArchResolutionPipeline.lean — real Lean proofs (no sorry)
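
The composition in the first bullet can be sketched roughly as follows. The `Resolved` enum, the `resolve_family` function, and the alias-before-detector order are assumptions for illustration; the cookbook's actual `DetectedFamily` type and its precedence rules are not shown in this thread.

```rust
/// Hypothetical result type; the cookbook's `DetectedFamily` may look different.
#[derive(Debug, PartialEq)]
enum Resolved {
    /// The repo id matched a registered alias pattern.
    ViaAlias(String),
    /// No alias matched, but the config body carried a known discriminator.
    ViaConfig(&'static str),
    /// Neither step produced a family.
    Unknown,
}

/// Tries the alias resolver first, then falls back to config detection.
/// Both steps are passed in as functions so the sketch stays self-contained.
fn resolve_family(
    hf_repo: &str,
    body: &str,
    resolve_alias: impl Fn(&str) -> Option<String>,
    detect: impl Fn(&str) -> Option<&'static str>,
) -> Resolved {
    if let Some(parent) = resolve_alias(hf_repo) {
        return Resolved::ViaAlias(parent);
    }
    match detect(body) {
        Some(family) => Resolved::ViaConfig(family),
        None => Resolved::Unknown,
    }
}
```

Keeping the two outcomes as separate variants lets a caller tell whether a family came from the repo name or from the config contents, which is useful when the two disagree.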

Also shipped kani-gate CI (cookbook PR #421 + spec) that verifies all 108 declared #[kani::proof] harnesses on every PR via cargo kani. 108/108 verify in ~14s locally; one overflow bug caught and fixed during landing.

Unblock impact when this merges

Merging #1562 unblocks:

  1. 16 alias-eligible cookbook manifest entries — codellama, tinyllama, vicuna, yi, smollm, smollm2, dolphin, hermes, openchat, wizardcoder, codestral, zephyr, distilgpt2, pythia, galactica, codegemma — all flip from blocked to aliased via a single cookbook manifest edit.

Cookbook detector swap: the inference_arch_detector::detect_from_str body becomes a thin wrapper over FamilyRegistry::detect_from_config_str. The 22 detector tests become integration tests for the upstream API. The same applies to inference_arch_alias_resolver::resolve, which becomes a thin wrapper over FamilyRegistry::resolve_alias.

  3. Coverage bump — cookbook architecture-demos coverage jumps from 18/43 to 34/43 families certified.

Status check

The cookbook side is committed to consuming this verbatim once shipped. Happy to help land — what's the gating constraint? Test failures, review feedback, or just queue depth?

@noahgift noahgift enabled auto-merge (squash) May 11, 2026 14:52
noahgift added 24 commits May 12, 2026 09:55
@noahgift noahgift merged commit bffe701 into main May 14, 2026
10 checks passed
@noahgift noahgift deleted the feat/arch-demos-detector-upstream branch May 14, 2026 04:10