Skip to content

fix(export): apr export no longer panics on missing num_layers (closes #1865) #1868

Merged. noahgift merged 6 commits into main from fix/apr-export-num-layers-1865 on May 22, 2026.


@noahgift

Contributor

Summary

Fixes #1865. apr export <model>.apr --format gguf panicked with C-07: num_layers required for GGUF export (exit 101) on any APR file that did not carry num_layers in its metadata. Older APR files (and any produced without apr stamp --num-layers) leave the field as None.

Fix

1. Inference fallback (metadata.rs)

infer_num_layers_from_tensor_names() derives the layer count from blk.N.* (GGUF convention) or model.layers.N.* (HF convention) tensor names. Every APR file carries this information in its tensor layout; the exporter just wasn't consulting it. export_apr_to_gguf_raw patches the metadata with the inferred value before the dim-required check fires.
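
The inference rule can be sketched roughly as follows. This is a minimal illustration, not the code in metadata.rs: the real signature of infer_num_layers_from_tensor_names() is not shown in this PR, so the names `layer_index` and `infer_num_layers` and the iterator-of-`&str` input are assumptions. The layer count is taken as the highest block index plus one, per the contract equation max(blk.N) + 1.

```rust
/// Hypothetical helper: extract N from `blk.N.*` or `model.layers.N.*`.
/// Returns None for names outside both conventions or with a malformed index.
fn layer_index(name: &str) -> Option<usize> {
    for prefix in ["blk.", "model.layers."] {
        if let Some(rest) = name.strip_prefix(prefix) {
            // The index is the segment before the next '.'; a name with no
            // further '.' is treated as not being a per-layer tensor.
            let (idx, _) = rest.split_once('.')?;
            return idx.parse::<usize>().ok();
        }
    }
    None
}

/// Hypothetical stand-in for the inference step: max layer index + 1,
/// or None when no tensor name matches either convention.
fn infer_num_layers<'a, I>(names: I) -> Option<usize>
where
    I: IntoIterator<Item = &'a str>,
{
    names
        .into_iter()
        .filter_map(layer_index)
        .max()
        .map(|max_idx| max_idx + 1)
}
```

Returning `Option` rather than defaulting to 0 keeps the "no block tensors found" case distinguishable, so the caller can still fall through to the dim-required error.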

2. No-panic guarantee (metadata.rs)

build_gguf_arch_metadata returns Result<Vec<...>, AprenderError> instead of panicking via .expect(). Both call sites use ?. Missing dims produce a clean CliError::ValidationFailed::FormatError:

error: Validation failed: Invalid model format: C-07: hidden_size required
for GGUF export (missing in APR metadata). Re-stamp the APR file with
`apr stamp` populating model dimensions, or convert from the original
GGUF/SafeTensors source.

Exit code 5 (ValidationFailed), not 101 (panic).
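
As a rough sketch of the pattern (not the actual build_gguf_arch_metadata body), each dimension lookup goes through ok_or_else and the first missing field propagates with `?`. The types `ModelDims` and `FormatError`, the helper `require`, the lookup order, and the key names are hypothetical stand-ins; the PR only names AprenderError and CliError::ValidationFailed.

```rust
use std::fmt;

// Hypothetical error type standing in for the project's AprenderError.
#[derive(Debug, PartialEq)]
struct FormatError(String);

impl fmt::Display for FormatError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "Invalid model format: {}", self.0)
    }
}

impl std::error::Error for FormatError {}

// Hypothetical subset of APR metadata; every dimension may be absent.
#[derive(Default)]
struct ModelDims {
    num_layers: Option<usize>,
    hidden_size: Option<usize>,
    num_heads: Option<usize>,
}

// Turn a missing dimension into an error value instead of a panic.
fn require(value: Option<usize>, field: &str) -> Result<usize, FormatError> {
    value.ok_or_else(|| {
        FormatError(format!(
            "C-07: {field} required for GGUF export (missing in APR metadata)"
        ))
    })
}

// Illustrative key names only; the first missing dimension short-circuits via `?`.
fn build_arch_metadata(dims: &ModelDims) -> Result<Vec<(String, usize)>, FormatError> {
    Ok(vec![
        ("block_count".to_string(), require(dims.num_layers, "num_layers")?),
        ("embedding_length".to_string(), require(dims.hidden_size, "hidden_size")?),
        ("attention.head_count".to_string(), require(dims.num_heads, "num_heads")?),
    ])
}
```

Because the failure is an ordinary `Err`, the caller decides how to report it and which exit code to use, rather than the process aborting mid-export.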

Contract

New contracts/apr-export-num-layers-v1.yaml:

  • equation num_layers_inference: max(blk.N) + 1
  • equation export_no_panic: all dim lookups go through ok_or_else
  • FALSIFY-EXPORT-NUM-LAYERS-001: inference correctness (PASS)
  • FALSIFY-EXPORT-NUM-LAYERS-002: no .expect() leakage in metadata.rs (PASS)
  • FALSIFY-EXPORT-NUM-LAYERS-003: build_gguf_arch_metadata -> Result<...> (PASS)

Tests

  • 4 unit tests for infer_num_layers_from_tensor_names (blk.N, model.layers.N, no-block, malformed)
  • 1 integration test test_export_apr_to_gguf_raw_infers_num_layers_when_missing
  • 3 #[should_panic] tests converted to expect_err assertions
  • 6 happy-path tests updated for Result return type

All 14 affected tests pass.
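
The #[should_panic] conversion looks roughly like this. The helper `require_dim` and the test names are hypothetical, chosen only to show the shape of the change; the actual tests in the PR are not reproduced here.

```rust
// Hypothetical fallible lookup standing in for the function under test.
fn require_dim(value: Option<u32>, field: &str) -> Result<u32, String> {
    value.ok_or_else(|| format!("C-07: {field} required for GGUF export"))
}

// Before (panic-based), the test could only match on the panic message:
//
//     #[test]
//     #[should_panic(expected = "num_layers required")]
//     fn missing_num_layers_panics() { /* call that used .expect() */ }

// After (Result-based), the error is a value the test can inspect directly.
#[test]
fn missing_num_layers_is_an_error() {
    let err = require_dim(None, "num_layers").expect_err("missing dim must be Err");
    assert!(err.contains("num_layers required"));
}

#[test]
fn present_dim_is_returned() {
    assert_eq!(require_dim(Some(28), "num_layers"), Ok(28));
}
```

With expect_err, the test fails if the call unexpectedly succeeds, and it would also fail on a stray panic, which a #[should_panic] test would have accepted as a pass whenever the message matched.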

End-to-end verification

$ apr export /home/noah/models/qwen2.5-coder-1.5b-instruct-q4k.apr \
    --format gguf -o /tmp/rt.gguf
[PMAT-252] Raw passthrough: detected Q4K in APR source.
[#1865] num_layers missing from APR metadata — inferred 28 from blk.N.* tensor names
error: Validation failed: Invalid model format: C-07: num_heads required ...
$ echo $?
5

Exit 5 (clean error), not 101 (panic). The 1.5B file also lacks num_heads and hidden_size, which now error cleanly with recovery advice instead of aborting.
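
For context on the two exit codes: 101 is the status a Rust binary exits with when the main thread panics, while 5 here comes from the CLI mapping a returned error to a code. A minimal sketch of such a mapping follows. The enum shape, the `Other` variant, its code 1, and the function `run_and_report` are assumptions for illustration; only the name CliError::ValidationFailed and the value 5 come from this PR.

```rust
use std::fmt;
use std::process::ExitCode;

// Hypothetical, simplified error enum; the real CliError has more structure.
#[derive(Debug)]
enum CliError {
    ValidationFailed(String),
    Other(String),
}

impl fmt::Display for CliError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            CliError::ValidationFailed(msg) => write!(f, "Validation failed: {msg}"),
            CliError::Other(msg) => write!(f, "{msg}"),
        }
    }
}

impl CliError {
    // 5 matches the PR's ValidationFailed exit code; 1 for Other is assumed.
    fn exit_code(&self) -> u8 {
        match self {
            CliError::ValidationFailed(_) => 5,
            CliError::Other(_) => 1,
        }
    }
}

// Report the error on stderr and translate it into a process exit code.
fn run_and_report(result: Result<(), CliError>) -> ExitCode {
    match result {
        Ok(()) => ExitCode::SUCCESS,
        Err(err) => {
            eprintln!("error: {err}");
            ExitCode::from(err.exit_code())
        }
    }
}
```

A `main` that returns `run_and_report(...)` would then exit with 5 for validation failures, which is what lets scripts tell a rejected input apart from a crash.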

Test plan

  • 4 inference unit tests pass
  • Integration test (inference path) passes
  • Converted #[should_panic] tests pass as expect_err
  • FALSIFY-EXPORT-NUM-LAYERS-{001,002,003} all PASS
  • Contract YAML parses
  • Real reproducer no longer panics (exit 5, not 101)
  • CI: workspace-test, fmt, contracts, deny

🤖 Generated with Claude Code

…#1865)

`apr export <model>.apr --format gguf` panicked at metadata.rs:384 with
`thread 'main' panicked at .. C-07: num_layers required for GGUF export`
(exit 101) whenever the APR file did not carry `num_layers` in metadata.
Older APR files (and any produced without `apr stamp --num-layers`) leave
the field as `None`, so the canonical publish pipeline aborted on otherwise
valid inputs.

Fix: two layers.

1. **Inference fallback** — `infer_num_layers_from_tensor_names()` derives
   the layer count from `blk.N.*` / `model.layers.N.*` tensor names. Every
   APR file carries this information in its tensor layout; the
   exporter just wasn't consulting it. `export_apr_to_gguf_raw` now patches
   the metadata with the inferred value before the dimension-required
   check fires.

2. **No-panic guarantee** — `build_gguf_arch_metadata` returns
   `Result<Vec<...>, AprenderError>` instead of panicking via `.expect()`.
   Both call sites use `?` to propagate. Missing dims now produce a clean
   `CliError::ValidationFailed::FormatError` with an actionable message:

       error: Validation failed: Invalid model format: C-07: hidden_size
       required for GGUF export (missing in APR metadata). Re-stamp the
       APR file with `apr stamp` populating model dimensions, or convert
       from the original GGUF/SafeTensors source.

   Exit code 5 (ValidationFailed), not 101 (panic).

Contract: new `contracts/apr-export-num-layers-v1.yaml` with
- equation `num_layers_inference`: max(blk.N) + 1
- equation `export_no_panic`: all dim lookups go through ok_or_else
- 3 falsifiers (inference correctness, no .expect() leakage, Result return type)

Tests:
- 4 unit tests for `infer_num_layers_from_tensor_names` (blk.N, model.layers.N,
  none-pattern, malformed-index)
- 1 integration test `test_export_apr_to_gguf_raw_infers_num_layers_when_missing`
  — builds APR with no num_layers metadata, verifies export succeeds via inference
- 3 `#[should_panic]` tests converted to `expect_err` assertions
- 6 happy-path tests updated for new `Result` return type

Verified end-to-end:
    $ apr export /home/noah/models/qwen2.5-coder-1.5b-instruct-q4k.apr \
        --format gguf -o /tmp/rt.gguf
    [PMAT-252] Raw passthrough: detected Q4K in APR source.
    [#1865] num_layers missing from APR metadata — inferred 28 from blk.N.* tensor names
    error: Validation failed: Invalid model format: C-07: num_heads required ...
    (exit 5, not 101 — the panic is gone)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@noahgift noahgift enabled auto-merge (squash) May 22, 2026 07:03
noahgift added a commit that referenced this pull request May 22, 2026
…itstream-io) (#1878)

`cargo deny check advisories` started failing on every PR (and on main)
2026-05-22 with:

    error[unmaintained]: core2 is unmaintained, all versions yanked
    ├ ID: RUSTSEC-2026-0105
    ├ Advisory: https://rustsec.org/advisories/RUSTSEC-2026-0105

The dep is pulled in transitively via `bitstream-io` (image/media decoding
stack — `cargo tree` shows `bitstream-io v4.9.0 → core2 v0.4.0`). No
first-party use; no drop-in replacement until upstream `bitstream-io`
migrates off core2.

This commit unblocks the in-flight PR cascade (#1867 #1868 #1870 #1873
#1875 #1876) which all failed CI's `ci / lint` step on this advisory.
The deny entry is structured per the existing pattern in this file (id +
human reason mentioning the transitive path) so revisiting the ignore in
6-12 months is straightforward.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
@noahgift noahgift merged commit 470617a into main May 22, 2026
10 checks passed
@noahgift noahgift deleted the fix/apr-export-num-layers-1865 branch May 22, 2026 10:53
noahgift added a commit that referenced this pull request May 22, 2026
…ly cron (#1875)

Adds an end-to-end "Qwen story" that exercises every core apr command
group against the Qwen scale ladder (0.5B → 1.5B → 7B → 30B-MoE). The
story is the single canonical demo in README.md AND a regression gate
via runnable script + falsification contract + nightly cron.

## Beats

1. **Discover** (Registry) — pull, list
2. **Trust** (QA) — qa, validate, lint
3. **Explore** (Inspection) — inspect, tensors, tree
4. **Adapt** (Model ops) — export, diff, convert/quantize
5. **Use** (Inference) — run, chat, code
6. **Serve** (REST) — serve run + curl /v1/chat/completions OpenAI-compat
7. **Operate** (Profiling) — profile, gpu, serve plan (7B Q4K GGUF)
8. **Scale** (MoE) — inspect, tensors on 30B-MoE qwen3moe

## Pmat bug-hunt layer

When run with `PMAT_HUNT=1` (default), each beat emits a structured
manifest of high-risk untested code in the command-handler modules it
just exercised:

    -- pmat bug-hunt manifest (run chat code) --
        gap   crates/apr-cli/src/commands/run.rs:resolve_model_alias (impact=42.3)
        churn crates/apr-cli/src/commands/code.rs:dispatch_agent (commits=11)
        fault crates/aprender-serve/src/api/cuda_chat_backend.rs:try_qwen3_moe (unwrap)

The nightly cron uploads this manifest as an artifact, compares against
the previous successful run, and opens (or comments on) a tracking issue
when growth exceeds 5 lines — so untested branches in command handlers
can't accumulate quietly.

## Files

- `scripts/qwen-story.sh` (336 LOC) — runnable story with proper exit-code
  capture (`OUT=$(cmd); EC=$?` everywhere; no pipe-then-`$?` per memory rule)
- `contracts/qwen-story-v1.yaml` — 3 equations + 8 falsifiers, all PASS
  locally (script exists+executable, 8 beats, run_cmd helper, pmat_hunt
  per beat, README link, daily cron file, bashrs clean, Beat 7 skips
  `apr qa` on 7B Q4K due to #1864)
- `README.md` — new `## A Qwen story` section replacing the flat
  `## CLI examples` block. Fixes two README bugs surfaced during dogfood:
  `apr profile --roofline` (no such flag; just `apr profile <file>`)
  and `apr bench --assert-tps` (flag is on `apr qa`, not `bench`).
- `.github/workflows/qwen-story-daily.yml` — self-hosted GPU runner,
  04:17 UTC cron + workflow_dispatch, uploads pmat manifest + story log
  artifacts, files tracking issue when story regresses or manifest grows.

## Verification

    $ bash scripts/qwen-story.sh   # local smoke
    -- Beat 1: Discover (Registry) --
    ✓ PASS  B1 list
    -- Beat 2: Trust (QA gates) --
    ✓ PASS  B2 apr qa
    ✗ FAIL  B2 apr validate --quality  -  exit=5 (after #1866 fix this should be 0)
    -- Beat 3: Explore (Inspection) --
    ✓ PASS  B3 apr inspect --json (arch=qwen2)
    ✓ PASS  B3 apr tensors --json (339 tensors)
    ✓ PASS  B3 apr tree
    -- Beat 4: Adapt (Model ops) --
    ✗ FAIL  B4 apr export  -  PANIC (exit=101)  -  #1865 regression
    -- Beat 5: Use (Inference) --
    ✓ PASS  B5 apr run (Rust code completion)
    ✓ PASS  B5 apr code -p
    -- Beat 6: Serve (REST API) --
    ✓ PASS  B6 apr serve run (port=22915)
    ✓ PASS  B6 /v1/chat/completions (got OK...)
    -- Beat 7: Operate (Profiling) --
    ✓ PASS  B7 apr profile
    ✓ PASS  B7 apr gpu --json
    ✓ PASS  B7 apr serve plan -- 7B VRAM budget
    -- Beat 8: Scale (MoE introspection) --
    ✓ PASS  B8 apr inspect --json (arch=qwen3moe)
    ✓ PASS  B8 apr tensors --json (579 tensors)
    14 PASS / 2 FAIL / 0 SKIP

The 2 FAILs are EXPECTED until the in-flight fixes land:
- B2 validate --quality: closed by #1870
- B4 export panic: closed by #1868

Once those PRs merge, this story will be 16 PASS / 0 FAIL / 0 SKIP on a
host with all 4 Qwen models cached.

## Follow-up

A separate PR will add `/dogfood` Gate 18 that invokes this script (kept
separate to avoid conflict with PR #1872 which is already adding Gates
13-17 to the dogfood skill).

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

Development

Successfully merging this pull request may close these issues.

apr export <model>.apr --format gguf panics: 'C-07: num_layers required for GGUF export'

1 participant