fix(export): apr export no longer panics on missing num_layers (closes #1865) by noahgift · Pull Request #1868 · paiml/aprender

noahgift · 2026-05-22T07:03:07Z

Summary

Fixes #1865: `apr export <model>.apr --format gguf` panicked with `C-07: num_layers required for GGUF export` (exit 101) on any APR file that did not carry `num_layers` in metadata. Older APR files (and any produced without `apr stamp --num-layers`) leave the field as `None`.

Fix

1. Inference fallback (`metadata.rs`)

`infer_num_layers_from_tensor_names()` derives the layer count from `blk.N.*` (GGUF convention) or `model.layers.N.*` (HF convention) tensor names. Every APR file carries this information in its tensor layout; the exporter just wasn't consulting it. `export_apr_to_gguf_raw` patches the metadata with the inferred value before the dim-required check fires.
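The inference rule can be sketched as follows. This is a minimal illustration, not the code in `metadata.rs`: the signature, the prefix list, and the decision to skip names with a non-numeric index are assumptions made for the example.

```rust
/// Illustrative sketch (assumed signature): infer the layer count as
/// max(N) + 1 over tensor names shaped like `blk.N.*` or `model.layers.N.*`.
/// Returns `None` when no name matches either convention.
fn infer_num_layers_from_tensor_names<'a, I>(names: I) -> Option<usize>
where
    I: IntoIterator<Item = &'a str>,
{
    // GGUF convention first, then the HF convention.
    const PREFIXES: [&str; 2] = ["blk.", "model.layers."];

    names
        .into_iter()
        .filter_map(|name| {
            PREFIXES.iter().find_map(|prefix| {
                // Drop the prefix, then take everything up to the next dot
                // as the candidate layer index.
                let rest = name.strip_prefix(*prefix)?;
                let (index, _rest_of_name) = rest.split_once('.')?;
                // Names with a non-numeric index are skipped, not fatal.
                index.parse::<usize>().ok()
            })
        })
        .max()
        // Layer indices are zero-based, so the count is the max index plus one.
        .map(|max_index| max_index + 1)
}
```

Under these assumptions, a file whose highest block tensor is `blk.27.*` yields 28 layers, which matches the value reported in the end-to-end run.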

2. No-panic guarantee (`metadata.rs`)

`build_gguf_arch_metadata` returns `Result<Vec<...>, AprenderError>` instead of panicking via `.expect()`. Both call sites use `?`. Missing dims produce a clean `CliError::ValidationFailed::FormatError`:

error: Validation failed: Invalid model format: C-07: hidden_size required
for GGUF export (missing in APR metadata). Re-stamp the APR file with
`apr stamp` populating model dimensions, or convert from the original
GGUF/SafeTensors source.

Exit code 5 (ValidationFailed), not 101 (panic).
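The shape of the no-panic change can be sketched like this. `ModelDims`, `FormatError`, `require`, `build_arch_metadata`, and the metadata key names are hypothetical stand-ins for the example; the real function is `build_gguf_arch_metadata` and returns `AprenderError`, whose definition is not shown in this PR description.

```rust
use std::fmt;

/// Hypothetical stand-in for the APR metadata dimensions.
struct ModelDims {
    num_layers: Option<usize>,
    hidden_size: Option<usize>,
    num_heads: Option<usize>,
}

/// Hypothetical error type standing in for the real `AprenderError`.
#[derive(Debug, PartialEq)]
struct FormatError(String);

impl fmt::Display for FormatError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "Invalid model format: {}", self.0)
    }
}

impl std::error::Error for FormatError {}

/// Turn a missing dimension into an error value instead of a panic.
fn require(value: Option<usize>, field: &str) -> Result<usize, FormatError> {
    value.ok_or_else(|| {
        FormatError(format!(
            "C-07: {field} required for GGUF export (missing in APR metadata)"
        ))
    })
}

/// Every lookup goes through `require` and `?`, so the first missing
/// dimension is returned to the caller as `Err`.
fn build_arch_metadata(dims: &ModelDims) -> Result<Vec<(String, usize)>, FormatError> {
    Ok(vec![
        // Key names are illustrative only.
        ("block_count".to_string(), require(dims.num_layers, "num_layers")?),
        ("embedding_length".to_string(), require(dims.hidden_size, "hidden_size")?),
        ("attention.head_count".to_string(), require(dims.num_heads, "num_heads")?),
    ])
}
```

Because the error is an ordinary value, the CLI layer can decide how to print it and which exit code to use, which is not possible once `.expect()` has aborted the process.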

Contract

New contracts/apr-export-num-layers-v1.yaml:

equation num_layers_inference: max(blk.N) + 1
equation export_no_panic: all dim lookups go through ok_or_else
FALSIFY-EXPORT-NUM-LAYERS-001: inference correctness (PASS)
FALSIFY-EXPORT-NUM-LAYERS-002: no .expect() leakage in metadata.rs (PASS)
FALSIFY-EXPORT-NUM-LAYERS-003: build_gguf_arch_metadata -> Result<...> (PASS)
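How FALSIFY-EXPORT-NUM-LAYERS-002 is implemented is not shown here. One way such a check could work is a naive textual scan of the source file, sketched below; `find_expect_calls` is a hypothetical helper, and the comment stripping is deliberately crude (a `//` inside a string literal would truncate the line).

```rust
/// Hypothetical helper: return the 1-based line numbers of lines that call
/// `.expect(` outside of a trailing `//` comment.
fn find_expect_calls(source: &str) -> Vec<usize> {
    source
        .lines()
        .enumerate()
        .filter(|(_, line)| {
            // Keep only the part of the line before any `//` comment.
            let code = line.split("//").next().unwrap_or("");
            // `.expect_err(` does not match, because the pattern ends in `(`.
            code.contains(".expect(")
        })
        .map(|(index, _)| index + 1)
        .collect()
}
```

A falsifier built on this idea would pass when the returned list is empty for the file under test.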

Tests

4 unit tests for infer_num_layers_from_tensor_names (blk.N, model.layers.N, no-block, malformed)
1 integration test test_export_apr_to_gguf_raw_infers_num_layers_when_missing
3 #[should_panic] tests converted to expect_err assertions
6 happy-path tests updated for Result return type

All 14 affected tests pass.
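The `#[should_panic]` conversion follows a common pattern, sketched here with a hypothetical `require_dim` function and test name rather than the PR's actual tests: the test stops expecting a panic and instead asserts on the returned error.

```rust
/// Hypothetical function under test: a missing dimension becomes an `Err`.
fn require_dim(value: Option<usize>, field: &str) -> Result<usize, String> {
    value.ok_or_else(|| format!("C-07: {field} required for GGUF export"))
}

#[cfg(test)]
mod tests {
    use super::*;

    // Before the change, a test like this would have carried
    // `#[should_panic(expected = "num_layers required")]` and simply
    // called the function.
    #[test]
    fn missing_num_layers_is_an_error() {
        // `expect_err` fails the test if the call unexpectedly succeeds.
        let err = require_dim(None, "num_layers")
            .expect_err("a missing dimension must be reported as an error");
        assert!(err.contains("num_layers required"));
    }
}
```

Asserting on the error value also lets the test check the message text directly instead of matching against panic output.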

End-to-end verification

$ apr export /home/noah/models/qwen2.5-coder-1.5b-instruct-q4k.apr \
    --format gguf -o /tmp/rt.gguf
[PMAT-252] Raw passthrough: detected Q4K in APR source.
[#1865] num_layers missing from APR metadata — inferred 28 from blk.N.* tensor names
error: Validation failed: Invalid model format: C-07: num_heads required ...
$ echo $?
5

Exit 5 (clean error), not 101 (panic). The 1.5B file also lacks num_heads and hidden_size, which now error cleanly with recovery advice instead of aborting.

Test plan

4 inference unit tests pass
Integration test (inference path) passes
Converted #[should_panic] tests pass as expect_err
FALSIFY-EXPORT-NUM-LAYERS-{001,002,003} all PASS
Contract YAML parses
Real reproducer no longer panics (exit 5, not 101)
CI: workspace-test, fmt, contracts, deny

🤖 Generated with Claude Code

…#1865)

`apr export <model>.apr --format gguf` panicked at metadata.rs:384 with `thread 'main' panicked at .. C-07: num_layers required for GGUF export` (exit 101) whenever the APR file did not carry `num_layers` in metadata. Older APR files (and any produced without `apr stamp --num-layers`) leave the field as `None`, so the canonical publish pipeline aborted on otherwise valid inputs.

Fix: two layers.

1. **Inference fallback**: `infer_num_layers_from_tensor_names()` derives the layer count from `blk.N.*` / `model.layers.N.*` tensor names. Every APR file carries this information in its tensor layout; the exporter just wasn't consulting it. `export_apr_to_gguf_raw` now patches the metadata with the inferred value before the dimension-required check fires.
2. **No-panic guarantee**: `build_gguf_arch_metadata` returns `Result<Vec<...>, AprenderError>` instead of panicking via `.expect()`. Both call sites use `?` to propagate. Missing dims now produce a clean `CliError::ValidationFailed::FormatError` with an actionable message:

error: Validation failed: Invalid model format: C-07: hidden_size required
for GGUF export (missing in APR metadata). Re-stamp the APR file with
`apr stamp` populating model dimensions, or convert from the original
GGUF/SafeTensors source.

Exit code 5 (ValidationFailed), not 101 (panic).

Contract: new `contracts/apr-export-num-layers-v1.yaml` with
- equation `num_layers_inference`: max(blk.N) + 1
- equation `export_no_panic`: all dim lookups go through ok_or_else
- 3 falsifiers (inference correctness, no .expect() leakage, Result return type)

Tests:
- 4 unit tests for `infer_num_layers_from_tensor_names` (blk.N, model.layers.N, none-pattern, malformed-index)
- 1 integration test `test_export_apr_to_gguf_raw_infers_num_layers_when_missing`: builds APR with no num_layers metadata, verifies export succeeds via inference
- 3 `#[should_panic]` tests converted to `expect_err` assertions
- 6 happy-path tests updated for new `Result` return type

Verified end-to-end:

$ apr export /home/noah/models/qwen2.5-coder-1.5b-instruct-q4k.apr \
    --format gguf -o /tmp/rt.gguf
[PMAT-252] Raw passthrough: detected Q4K in APR source.
[#1865] num_layers missing from APR metadata — inferred 28 from blk.N.* tensor names
error: Validation failed: Invalid model format: C-07: num_heads required ...
(exit 5, not 101; the panic is gone)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…itstream-io) (#1878)

`cargo deny check advisories` started failing on every PR (and on main) on 2026-05-22 with:

error[unmaintained]: core2 is unmaintained, all versions yanked
├ ID: RUSTSEC-2026-0105
├ Advisory: https://rustsec.org/advisories/RUSTSEC-2026-0105

The dep is pulled in transitively via `bitstream-io` (image/media decoding stack; `cargo tree` shows `bitstream-io v4.9.0 → core2 v0.4.0`). No first-party use; no drop-in replacement until upstream `bitstream-io` migrates off core2.

This commit unblocks the in-flight PR cascade (#1867 #1868 #1870 #1873 #1875 #1876), which all failed CI's `ci / lint` step on this advisory. The deny entry is structured per the existing pattern in this file (id + human reason mentioning the transitive path) so revisiting the ignore in 6-12 months is straightforward.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

…ly cron (#1875)

Adds an end-to-end "Qwen story" that exercises every core apr command group against the Qwen scale ladder (0.5B → 1.5B → 7B → 30B-MoE). The story is the single canonical demo in README.md AND a regression gate via runnable script + falsification contract + nightly cron.

## Beats

1. **Discover** (Registry): pull, list
2. **Trust** (QA): qa, validate, lint
3. **Explore** (Inspection): inspect, tensors, tree
4. **Adapt** (Model ops): export, diff, convert/quantize
5. **Use** (Inference): run, chat, code
6. **Serve** (REST): serve run + curl /v1/chat/completions OpenAI-compat
7. **Operate** (Profiling): profile, gpu, serve plan (7B Q4K GGUF)
8. **Scale** (MoE): inspect, tensors on 30B-MoE qwen3moe

## Pmat bug-hunt layer

When run with `PMAT_HUNT=1` (default), each beat emits a structured manifest of high-risk untested code in the command-handler modules it just exercised:

-- pmat bug-hunt manifest (run chat code) --
gap crates/apr-cli/src/commands/run.rs:resolve_model_alias (impact=42.3)
churn crates/apr-cli/src/commands/code.rs:dispatch_agent (commits=11)
fault crates/aprender-serve/src/api/cuda_chat_backend.rs:try_qwen3_moe (unwrap)

The nightly cron uploads this manifest as an artifact, compares against the previous successful run, and opens (or comments on) a tracking issue when growth exceeds 5 lines, so untested branches in command handlers can't accumulate quietly.

## Files

- `scripts/qwen-story.sh` (336 LOC): runnable story with proper exit-code capture (`OUT=$(cmd); EC=$?` everywhere; no pipe-then-`$?` per memory rule)
- `contracts/qwen-story-v1.yaml`: 3 equations + 8 falsifiers, all PASS locally (script exists+executable, 8 beats, run_cmd helper, pmat_hunt per beat, README link, daily cron file, bashrs clean, Beat 7 skips `apr qa` on 7B Q4K due to #1864)
- `README.md`: new `## A Qwen story` section replacing the flat `## CLI examples` block. Fixes two README bugs surfaced during dogfood: `apr profile --roofline` (no such flag; just `apr profile <file>`) and `apr bench --assert-tps` (flag is on `apr qa`, not `bench`).
- `.github/workflows/qwen-story-daily.yml`: self-hosted GPU runner, 04:17 UTC cron + workflow_dispatch, uploads pmat manifest + story log artifacts, files tracking issue when story regresses or manifest grows.

## Verification

$ bash scripts/qwen-story.sh # local smoke
-- Beat 1: Discover (Registry) --
✓ PASS B1 list
-- Beat 2: Trust (QA gates) --
✓ PASS B2 apr qa
✗ FAIL B2 apr validate --quality - exit=5 (after #1866 fix this should be 0)
-- Beat 3: Explore (Inspection) --
✓ PASS B3 apr inspect --json (arch=qwen2)
✓ PASS B3 apr tensors --json (339 tensors)
✓ PASS B3 apr tree
-- Beat 4: Adapt (Model ops) --
✗ FAIL B4 apr export - PANIC (exit=101) - #1865 regression
-- Beat 5: Use (Inference) --
✓ PASS B5 apr run (Rust code completion)
✓ PASS B5 apr code -p
-- Beat 6: Serve (REST API) --
✓ PASS B6 apr serve run (port=22915)
✓ PASS B6 /v1/chat/completions (got OK...)
-- Beat 7: Operate (Profiling) --
✓ PASS B7 apr profile
✓ PASS B7 apr gpu --json
✓ PASS B7 apr serve plan -- 7B VRAM budget
-- Beat 8: Scale (MoE introspection) --
✓ PASS B8 apr inspect --json (arch=qwen3moe)
✓ PASS B8 apr tensors --json (579 tensors)
14 PASS / 2 FAIL / 0 SKIP

The 2 FAILs are EXPECTED until the in-flight fixes land:
- B2 validate --quality: closed by #1870
- B4 export panic: closed by #1868

Once those PRs merge, this story will be 16 PASS / 0 FAIL / 0 SKIP on a host with all 4 Qwen models cached.

## Follow-up

A separate PR will add `/dogfood` Gate 18 that invokes this script (kept separate to avoid conflict with PR #1872, which is already adding Gates 13-17 to the dogfood skill).

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

noahgift enabled auto-merge (squash) May 22, 2026 07:03

noahgift added 2 commits May 22, 2026 09:18

Merge branch 'main' into fix/apr-export-num-layers-1865

46282de

Merge branch 'main' into fix/apr-export-num-layers-1865

3649da0

noahgift mentioned this pull request May 22, 2026

feat(qwen-story): 8-beat E2E narrative + pmat bug-hunt + daily cron #1875

Merged

7 tasks

Merge branch 'main' into fix/apr-export-num-layers-1865

750c86b

noahgift mentioned this pull request May 22, 2026

chore(deny): ignore RUSTSEC-2026-0105 (core2 yanked, transitive via bitstream-io) #1878

Merged

3 tasks

noahgift added 2 commits May 22, 2026 11:46

Merge branch 'main' into fix/apr-export-num-layers-1865

0fc6780

Merge branch 'main' into fix/apr-export-num-layers-1865

baed97a

noahgift merged commit 470617a into main May 22, 2026
10 checks passed

noahgift deleted the fix/apr-export-num-layers-1865 branch May 22, 2026 10:53

This was referenced May 22, 2026

spec(SPEC-CUBLAS-FP8-7B-FIX-001): epic to root-cause cuBLAS FP8 7B gibberish (holds v0.35.0) #1882

Closed

Qwen2.5-7B Q4_K GPU inference produces gibberish — 'ampiezza' (wgpu) / '<|im_start|>' (cuBLAS) — regression vs #374 / #559 #1864

Closed

noahgift merged 6 commits into main from fix/apr-export-num-layers-1865
