fix(export): apr export no longer panics on missing num_layers (closes #1865) #1868
Merged
…#1865)

`apr export <model>.apr --format gguf` panicked at metadata.rs:384 with `thread 'main' panicked at .. C-07: num_layers required for GGUF export` (exit 101) whenever the APR file did not carry `num_layers` in metadata. Older APR files (and any produced without `apr stamp --num-layers`) leave the field as `None`, so the canonical publish pipeline aborted on otherwise valid inputs.

Fix: two layers.

1. **Inference fallback**: `infer_num_layers_from_tensor_names()` derives the layer count from `blk.N.*` / `model.layers.N.*` tensor names. Every APR file always carries this information in its tensor layout; the exporter just wasn't consulting it. `export_apr_to_gguf_raw` now patches the metadata with the inferred value before the dimension-required check fires.

2. **No-panic guarantee**: `build_gguf_arch_metadata` returns `Result<Vec<...>, AprenderError>` instead of panicking via `.expect()`. Both call sites use `?` to propagate. Missing dims now produce a clean `CliError::ValidationFailed::FormatError` with an actionable message:

       error: Validation failed: Invalid model format: C-07: hidden_size required for GGUF export (missing in APR metadata). Re-stamp the APR file with `apr stamp` populating model dimensions, or convert from the original GGUF/SafeTensors source.

   Exit code 5 (ValidationFailed), not 101 (panic).
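The inference fallback described above can be sketched roughly as follows. This is a minimal illustration and not the code from the PR: the function name matches the commit message, but its signature, the `layer_index` helper, and the exact prefix handling are assumptions.

```rust
/// Hypothetical sketch: infer the layer count from tensor names.
/// Recognizes the `blk.N.*` (GGUF) and `model.layers.N.*` (HF) prefixes and
/// returns max(N) + 1, or `None` when no tensor name matches either pattern.
/// The signature is an assumption; the real function may take other types.
pub fn infer_num_layers_from_tensor_names<'a, I>(names: I) -> Option<usize>
where
    I: IntoIterator<Item = &'a str>,
{
    names
        .into_iter()
        .filter_map(layer_index)
        .max()
        .map(|max_index| max_index + 1)
}

/// Illustrative helper (not named in the PR): extract N from `blk.N.` or
/// `model.layers.N.`. A non-numeric index yields `None` and is skipped.
fn layer_index(name: &str) -> Option<usize> {
    let rest = name
        .strip_prefix("blk.")
        .or_else(|| name.strip_prefix("model.layers."))?;
    let (index, _suffix) = rest.split_once('.')?;
    index.parse::<usize>().ok()
}
```

Returning `Option` leaves the decision about a missing layer count to the caller, which fits the second fix layer: the exporter can patch the metadata when a value is found and otherwise fall through to the regular validation error.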
Contract: new `contracts/apr-export-num-layers-v1.yaml` with
- equation `num_layers_inference`: max(blk.N) + 1
- equation `export_no_panic`: all dim lookups go through ok_or_else
- 3 falsifiers (inference correctness, no .expect() leakage, Result return type)

Tests:
- 4 unit tests for `infer_num_layers_from_tensor_names` (blk.N, model.layers.N, none-pattern, malformed-index)
- 1 integration test `test_export_apr_to_gguf_raw_infers_num_layers_when_missing`: builds APR with no num_layers metadata, verifies export succeeds via inference
- 3 `#[should_panic]` tests converted to `expect_err` assertions
- 6 happy-path tests updated for new `Result` return type

Verified end-to-end:

    $ apr export /home/noah/models/qwen2.5-coder-1.5b-instruct-q4k.apr \
        --format gguf -o /tmp/rt.gguf
    [PMAT-252] Raw passthrough: detected Q4K in APR source.
    [#1865] num_layers missing from APR metadata — inferred 28 from blk.N.* tensor names
    error: Validation failed: Invalid model format: C-07: num_heads required ...
    (exit 5, not 101 — the panic is gone)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
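The `export_no_panic` equation (every dimension lookup goes through `ok_or_else`) and the `expect_err` style of test might look roughly like this. It is a sketch under stated assumptions: `ExportError`, `Dims`, `require`, and `build_arch_metadata` are hypothetical stand-ins, not the PR's `AprenderError`, its metadata struct, or `build_gguf_arch_metadata`, and the returned key names are placeholders.

```rust
/// Hypothetical stand-in for the error type used in the PR.
#[derive(Debug, PartialEq)]
pub enum ExportError {
    FormatError(String),
}

/// Hypothetical subset of the APR metadata relevant to GGUF export.
#[derive(Default)]
pub struct Dims {
    pub num_layers: Option<usize>,
    pub hidden_size: Option<usize>,
    pub num_heads: Option<usize>,
}

/// Turn a missing dimension into an error value instead of panicking.
fn require(value: Option<usize>, field: &str) -> Result<usize, ExportError> {
    value.ok_or_else(|| {
        ExportError::FormatError(format!(
            "C-07: {} required for GGUF export (missing in APR metadata)",
            field
        ))
    })
}

/// Illustrative counterpart of a metadata builder that returns `Result`.
/// Each `?` propagates the first missing dimension to the caller.
pub fn build_arch_metadata(dims: &Dims) -> Result<Vec<(String, usize)>, ExportError> {
    Ok(vec![
        ("num_layers".to_string(), require(dims.num_layers, "num_layers")?),
        ("hidden_size".to_string(), require(dims.hidden_size, "hidden_size")?),
        ("num_heads".to_string(), require(dims.num_heads, "num_heads")?),
    ])
}
```

A test that previously carried `#[should_panic]` can then call `expect_err` on the returned `Result` and compare the error value, which checks the message as well as the absence of a panic.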
noahgift added a commit that referenced this pull request May 22, 2026
…itstream-io) (#1878)

`cargo deny check advisories` started failing on every PR (and on main) on 2026-05-22 with:

    error[unmaintained]: core2 is unmaintained, all versions yanked
    ├ ID: RUSTSEC-2026-0105
    ├ Advisory: https://rustsec.org/advisories/RUSTSEC-2026-0105

The dep is pulled in transitively via `bitstream-io` (image/media decoding stack; `cargo tree` shows `bitstream-io v4.9.0 → core2 v0.4.0`). No first-party use; no drop-in replacement until upstream `bitstream-io` migrates off core2.

This commit unblocks the in-flight PR cascade (#1867 #1868 #1870 #1873 #1875 #1876), which all failed CI's `ci / lint` step on this advisory. The deny entry is structured per the existing pattern in this file (id + human reason mentioning the transitive path) so revisiting the ignore in 6-12 months is straightforward.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
noahgift added a commit that referenced this pull request May 22, 2026
…ly cron (#1875)

Adds an end-to-end "Qwen story" that exercises every core apr command group against the Qwen scale ladder (0.5B → 1.5B → 7B → 30B-MoE). The story is the single canonical demo in README.md AND a regression gate via runnable script + falsification contract + nightly cron.

## Beats

1. **Discover** (Registry): pull, list
2. **Trust** (QA): qa, validate, lint
3. **Explore** (Inspection): inspect, tensors, tree
4. **Adapt** (Model ops): export, diff, convert/quantize
5. **Use** (Inference): run, chat, code
6. **Serve** (REST): serve run + curl /v1/chat/completions OpenAI-compat
7. **Operate** (Profiling): profile, gpu, serve plan (7B Q4K GGUF)
8. **Scale** (MoE): inspect, tensors on 30B-MoE qwen3moe

## Pmat bug-hunt layer

When run with `PMAT_HUNT=1` (default), each beat emits a structured manifest of high-risk untested code in the command-handler modules it just exercised:

    -- pmat bug-hunt manifest (run chat code) --
    gap   crates/apr-cli/src/commands/run.rs:resolve_model_alias (impact=42.3)
    churn crates/apr-cli/src/commands/code.rs:dispatch_agent (commits=11)
    fault crates/aprender-serve/src/api/cuda_chat_backend.rs:try_qwen3_moe (unwrap)

The nightly cron uploads this manifest as an artifact, compares against the previous successful run, and opens (or comments on) a tracking issue when growth exceeds 5 lines, so untested branches in command handlers can't accumulate quietly.

## Files

- `scripts/qwen-story.sh` (336 LOC): runnable story with proper exit-code capture (`OUT=$(cmd); EC=$?` everywhere; no pipe-then-`$?` per memory rule)
- `contracts/qwen-story-v1.yaml`: 3 equations + 8 falsifiers, all PASS locally (script exists+executable, 8 beats, run_cmd helper, pmat_hunt per beat, README link, daily cron file, bashrs clean, Beat 7 skips `apr qa` on 7B Q4K due to #1864)
- `README.md`: new `## A Qwen story` section replacing the flat `## CLI examples` block. Fixes two README bugs surfaced during dogfood: `apr profile --roofline` (no such flag; just `apr profile <file>`) and `apr bench --assert-tps` (flag is on `apr qa`, not `bench`).
- `.github/workflows/qwen-story-daily.yml`: self-hosted GPU runner, 04:17 UTC cron + workflow_dispatch, uploads pmat manifest + story log artifacts, files tracking issue when story regresses or manifest grows.

## Verification

    $ bash scripts/qwen-story.sh # local smoke
    -- Beat 1: Discover (Registry) --
    ✓ PASS B1 list
    -- Beat 2: Trust (QA gates) --
    ✓ PASS B2 apr qa
    ✗ FAIL B2 apr validate --quality - exit=5 (after #1866 fix this should be 0)
    -- Beat 3: Explore (Inspection) --
    ✓ PASS B3 apr inspect --json (arch=qwen2)
    ✓ PASS B3 apr tensors --json (339 tensors)
    ✓ PASS B3 apr tree
    -- Beat 4: Adapt (Model ops) --
    ✗ FAIL B4 apr export - PANIC (exit=101) - #1865 regression
    -- Beat 5: Use (Inference) --
    ✓ PASS B5 apr run (Rust code completion)
    ✓ PASS B5 apr code -p
    -- Beat 6: Serve (REST API) --
    ✓ PASS B6 apr serve run (port=22915)
    ✓ PASS B6 /v1/chat/completions (got OK...)
    -- Beat 7: Operate (Profiling) --
    ✓ PASS B7 apr profile
    ✓ PASS B7 apr gpu --json
    ✓ PASS B7 apr serve plan -- 7B VRAM budget
    -- Beat 8: Scale (MoE introspection) --
    ✓ PASS B8 apr inspect --json (arch=qwen3moe)
    ✓ PASS B8 apr tensors --json (579 tensors)
    14 PASS / 2 FAIL / 0 SKIP

The 2 FAILs are EXPECTED until the in-flight fixes land:

- B2 validate --quality: closed by #1870
- B4 export panic: closed by #1868

Once those PRs merge, this story will be 16 PASS / 0 FAIL / 0 SKIP on a host with all 4 Qwen models cached.

## Follow-up

A separate PR will add `/dogfood` Gate 18 that invokes this script (kept separate to avoid conflict with PR #1872 which is already adding Gates 13-17 to the dogfood skill).

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
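The manifest-growth rule from the bug-hunt layer (open a tracking issue when the manifest grows by more than 5 lines between runs) can be illustrated with a small sketch. The actual check lives in the nightly workflow and is not shown in this commit message, so everything here is an assumption: the function names are hypothetical, and the choice to skip `--` header lines when counting is a guess about how lines are counted.

```rust
/// Count manifest entries: non-empty lines that are not `-- ... --` headers.
/// Assumption: header lines do not count toward growth.
fn manifest_entries(manifest: &str) -> usize {
    manifest
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty() && !line.starts_with("--"))
        .count()
}

/// True when `current` has more than `threshold` additional entries compared
/// with `previous`. A shrinking manifest never counts as a regression.
pub fn manifest_regressed(previous: &str, current: &str, threshold: usize) -> bool {
    manifest_entries(current).saturating_sub(manifest_entries(previous)) > threshold
}
```

With `threshold` set to 5, a run that adds exactly five entries stays quiet and a run that adds six would trigger the tracking issue, which matches the "exceeds 5 lines" wording.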
## Summary

Fixes #1865: `apr export <model>.apr --format gguf` panicked with `C-07: num_layers required for GGUF export` (exit 101) on any APR file that did not carry `num_layers` in metadata. Older APR files (and any produced without `apr stamp --num-layers`) leave the field as `None`.

## Fix

**1. Inference fallback** (`metadata.rs`)

`infer_num_layers_from_tensor_names()` derives the layer count from `blk.N.*` (GGUF convention) or `model.layers.N.*` (HF convention) tensor names. Every APR file carries this information in its tensor layout; the exporter just wasn't consulting it. `export_apr_to_gguf_raw` patches the metadata with the inferred value before the dim-required check fires.

**2. No-panic guarantee** (`metadata.rs`)

`build_gguf_arch_metadata` returns `Result<Vec<...>, AprenderError>` instead of panicking via `.expect()`. Both call sites use `?`. Missing dims produce a clean `CliError::ValidationFailed::FormatError` with exit code 5 (ValidationFailed), not 101 (panic).

## Contract

New `contracts/apr-export-num-layers-v1.yaml`:

- `num_layers_inference`: `max(blk.N) + 1`
- `export_no_panic`: all dim lookups go through `ok_or_else`
- no `.expect()` leakage in metadata.rs (PASS)
- `build_gguf_arch_metadata -> Result<...>` (PASS)

## Tests

- 4 unit tests for `infer_num_layers_from_tensor_names` (blk.N, model.layers.N, no-block, malformed)
- 1 integration test `test_export_apr_to_gguf_raw_infers_num_layers_when_missing`
- 3 `#[should_panic]` tests converted to `expect_err` assertions
- 6 happy-path tests updated for the new `Result` return type

All 14 affected tests pass.

## End-to-end verification

Running the export on the 1.5B Qwen APR file (`qwen2.5-coder-1.5b-instruct-q4k.apr`) now exits 5 (clean error), not 101 (panic). The 1.5B file also lacks `num_heads` and `hidden_size`, which now error cleanly with recovery advice instead of aborting.

## Test plan

- `#[should_panic]` tests pass as `expect_err`

🤖 Generated with Claude Code