Qwen2.5-Coder-0.5B MVP certification blocked: 4 conversion pipeline defects #196

@noahgift

Summary

MVP certification for Qwen/Qwen2.5-Coder-0.5B-Instruct is BLOCKED at MQS 405/1000 (Grade F). All 18 basic inference tests pass (G1-G4 ✓ across GGUF, APR, SafeTensors on CPU+GPU), but 15 tests fail because of three defects in the apr conversion and inference pipeline (Defects 1-3 below). A fourth defect, found outside the certification run, breaks the SafeTensors→GGUF provenance chain.

apr version: 0.2.12
QA framework: apr-model-qa-playbook (commit 2f376d8)
Model: Qwen/Qwen2.5-Coder-0.5B-Instruct (494M params, SafeTensors canonical source)

Certification Evidence

Scenarios: 31
Passed:    18  (all basic inference G1-G4 across 3 formats × 2 backends)
Failed:    15  (all conversion pipeline)
Pass rate: 58.1%
MQS Score: 405/1000
Grade:     F
Status:    BLOCKED

Throughput (6-column profiling)

Format       CPU (tok/s)  GPU (tok/s)
GGUF         2.9          252.7
APR          7.2          0.8
SafeTensors  0.0          0.0

Defect 1: apr rosetta convert produces files with no extension (P0)

Affects: 6 conversion tests on CPU (F-CONV-G-A, F-CONV-A-G, F-CONV-G-S, F-CONV-S-G, F-CONV-A-S, F-CONV-S-A) + 1 golden rule test (F-GOLDEN-RULE-001)

Error:

Security error: Invalid model file extension: '.'. Expected one of: gguf, safetensors, apr, bin

Root cause: When apr rosetta convert produces a target file, the output path has a bare . extension (no format suffix). Subsequent inference on the converted file fails because apr run validates file extensions and rejects . as invalid.

Reproduction:

apr rosetta convert source.gguf target.apr   # produces file, but...
apr run --prompt "2+2" --max-tokens 5 target.apr  # fails: extension '.'
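
As an illustration of the check that rejects the converted file, here is a minimal Python sketch of an extension validator. It is not apr's implementation: the function name `validate_model_extension` is hypothetical, and only the allowed list and the message wording are taken from the error above. A QA harness could run a check like this on the converter's output path before attempting inference.

```python
import os

# Allowed list copied from the error message quoted in this issue.
ALLOWED_EXTENSIONS = {"gguf", "safetensors", "apr", "bin"}


def validate_model_extension(path: str) -> str:
    """Return the lowercase extension without the dot, or raise ValueError.

    Hypothetical helper for illustration; apr's real validator may differ.
    """
    # os.path.splitext("target.") yields ".", and "target" yields "".
    _, ext = os.path.splitext(path)
    suffix = ext[1:].lower()
    if suffix not in ALLOWED_EXTENSIONS:
        raise ValueError(
            f"Invalid model file extension: {ext!r}. "
            f"Expected one of: {', '.join(sorted(ALLOWED_EXTENSIONS))}"
        )
    return suffix
```

A path such as `target.apr` passes, while a path ending in a bare `.` (the shape reported in the error) or a path with no suffix at all is rejected.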

Defect 2: apr run does not accept --gpu flag (P0)

Affects: 6 conversion tests on GPU (F-CONV-G-A, F-CONV-A-G, F-CONV-G-S, F-CONV-S-G, F-CONV-A-S, F-CONV-S-A)

Error:

error: unexpected argument '--gpu' found

  tip: to pass '--gpu' as a value, use '-- --gpu'

Usage: apr run --prompt <PROMPT> --max-tokens <MAX_TOKENS> <SOURCE>

Root cause: The apr run subcommand defines no --gpu flag in its clap argument definitions, but the QA framework passes --gpu for GPU backend tests. Either apr run needs to accept --gpu, or the QA framework's conversion tests need to use a different flag (e.g., --device gpu).

Reproduction:

apr run --prompt "2+2" --max-tokens 5 --gpu model.gguf
# error: unexpected argument '--gpu' found
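
The mismatch between the two sides can be sketched with Python's argparse as a stand-in for clap. This is an analogy only: `build_parser` and the `--device` option are hypothetical, and the first parser simply mirrors the usage line printed in the error above.

```python
import argparse


def build_parser(with_device: bool) -> argparse.ArgumentParser:
    """Model the `apr run` arguments; optionally add device selection."""
    parser = argparse.ArgumentParser(prog="apr run")
    parser.add_argument("--prompt", required=True)
    parser.add_argument("--max-tokens", type=int, required=True)
    if with_device:
        # Hypothetical fix: an explicit --device option plus --gpu as shorthand.
        parser.add_argument("--device", choices=["cpu", "gpu"], default="cpu")
        parser.add_argument("--gpu", dest="device",
                            action="store_const", const="gpu")
    parser.add_argument("source")
    return parser


# The argument vector the QA framework sends for GPU tests.
ARGV = ["--prompt", "2+2", "--max-tokens", "5", "--gpu", "model.gguf"]

# Without the flag defined, --gpu is left over as an unrecognized argument.
_, leftover = build_parser(with_device=False).parse_known_args(ARGV)

# With the flag defined, the same argument vector parses cleanly.
args = build_parser(with_device=True).parse_args(ARGV)
```

Accepting both spellings would let existing QA invocations keep working while leaving room for other backends later; whether apr should do so is a design decision for the maintainers.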

Defect 3: apr rosetta convert round-trip fails on extension detection (P1)

Affects: 2 round-trip tests (F-CONV-RT-001 on CPU and GPU)

Error:

Validation failed: Source inspection failed: Invalid model format: No file extension found

Root cause: The round-trip conversion chain (format A → B → A) produces intermediate files that lack proper extensions, causing the second conversion step to fail at source inspection.
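
One way a test harness could guarantee that every file in an A → B → A chain carries a recognizable suffix is to derive the paths up front. The sketch below is an assumption about how such a harness might be written; `round_trip_paths` and the `.roundtrip` naming are hypothetical and do not describe how apr or the QA playbook name intermediate files.

```python
from pathlib import Path


def round_trip_paths(source: Path, via_format: str, work_dir: Path) -> list[Path]:
    """Return [source, intermediate, restored], each with an explicit suffix."""
    source_suffix = source.suffix  # e.g. ".gguf"
    # Reject inputs that would propagate a missing or bare "." extension.
    if not source_suffix or source_suffix == ".":
        raise ValueError(f"source has no usable extension: {source}")
    intermediate = work_dir / f"{source.stem}.{via_format}"
    restored = work_dir / f"{source.stem}.roundtrip{source_suffix}"
    return [source, intermediate, restored]


paths = round_trip_paths(Path("model.gguf"), "apr", Path("work"))
suffixes = [p.suffix for p in paths]
```

For `model.gguf` routed through APR, the three paths end in `.gguf`, `.apr`, and `.gguf`, so source inspection at the second step has an extension to work with.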

Defect 4: SafeTensors→GGUF conversion crashes on tensor size validation (P1)

The certification run does not catch this defect because it uses a pre-converted GGUF file, but the defect blocks the provenance chain.

Error:

Validation failed: Conversion failed: Invalid model format:
Tensor 'model.layers.15.self_attn.o_proj.weight' data exceeds file size

Reproduction:

apr rosetta convert model.safetensors model.gguf
# error: Tensor 'model.layers.15.self_attn.o_proj.weight' data exceeds file size

Root cause: The GGUF writer miscalculates tensor data offsets or file size when converting from SafeTensors for this specific model architecture (Qwen2, 24 layers, 896 embed dim). Fails consistently at layer 15's o_proj.weight.
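
Before digging into the writer, it may help to rule out a truncated or mis-indexed source file. The sketch below checks tensor byte ranges against the SafeTensors layout (an 8-byte little-endian header length, a JSON header, then the data buffer that `data_offsets` index into). The function names are hypothetical, and it operates on an in-memory buffer for brevity; for a multi-gigabyte file one would read only the header and compare against the file size instead.

```python
import json
import struct


def out_of_bounds_tensors(blob: bytes) -> list[str]:
    """Return names of tensors whose data_offsets fall outside the data buffer."""
    (header_len,) = struct.unpack("<Q", blob[:8])
    header = json.loads(blob[8:8 + header_len])
    data_size = len(blob) - 8 - header_len
    bad = []
    for name, info in header.items():
        if name == "__metadata__":  # optional free-form metadata, not a tensor
            continue
        begin, end = info["data_offsets"]
        if not (0 <= begin <= end <= data_size):
            bad.append(name)
    return bad


def make_blob(header: dict, data: bytes) -> bytes:
    """Build a tiny SafeTensors-style buffer for demonstration purposes."""
    encoded = json.dumps(header).encode("utf-8")
    return struct.pack("<Q", len(encoded)) + encoded + data


demo_header = {"w": {"dtype": "F32", "shape": [2], "data_offsets": [0, 8]}}
intact = out_of_bounds_tensors(make_blob(demo_header, b"\x00" * 8))
truncated = out_of_bounds_tensors(make_blob(demo_header, b"\x00" * 4))
```

If the source file passes this check, the "data exceeds file size" error more plausibly originates in the offsets or sizes computed on the GGUF side, as the root-cause note suggests.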

Impact

  • Certification blocked for Qwen2.5-Coder-0.5B — cannot achieve PROVISIONAL (requires ≥90% pass rate, currently 58.1%)
  • Defects 1 and 2 affect all models: the same conversion tests fail for every Qwen Coder size. The 1.5B, 3B, 7B, 14B, and 32B models are PROVISIONAL only because conversion test failures don't trigger G1-G4 gateway zeroing.
  • Provenance chain broken — cannot verify SafeTensors→GGUF conversion fidelity (PROV-003)
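
The gap to the PROVISIONAL threshold can be worked out from the figures above (18 passed of 31 scenarios, ≥90% required). The helper names below are illustrative and are not part of the QA framework.

```python
import math


def pass_rate(passed: int, total: int) -> float:
    """Pass rate as a percentage."""
    return 100.0 * passed / total


def passes_needed(total: int, threshold_pct: float) -> int:
    """Smallest number of passing scenarios that meets the threshold."""
    return math.ceil(total * threshold_pct / 100.0)


current = round(pass_rate(18, 31), 1)   # 58.1, matching the reported rate
required = passes_needed(31, 90)        # 28 of 31 scenarios
shortfall = required - 18               # 10 additional passes
```

With 31 scenarios, at least 28 must pass to reach 90%, so 10 more scenarios would have to pass than the 18 that pass today.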

Suggested Fixes

  1. Defect 1: Fix rosetta convert output path logic to preserve target file extension
  2. Defect 2: Add a --gpu or --device flag to the apr run subcommand, or change the QA framework's conversion tests to pass a flag that apr run accepts
  3. Defect 3: Ensure intermediate files in conversion chains retain proper extensions
  4. Defect 4: Fix GGUF writer tensor offset calculation for Qwen2 architecture

QA Gate Reference

Gate ID            Description                              Status
F-CONV-G-A         GGUF→APR conversion fidelity             FAIL
F-CONV-A-G         APR→GGUF conversion fidelity             FAIL
F-CONV-G-S         GGUF→SafeTensors conversion fidelity     FAIL
F-CONV-S-G         SafeTensors→GGUF conversion fidelity     FAIL
F-CONV-A-S         APR→SafeTensors conversion fidelity      FAIL
F-CONV-S-A         SafeTensors→APR conversion fidelity      FAIL
F-CONV-RT-001      Round-trip conversion (A→B→A)            FAIL
F-GOLDEN-RULE-001  Golden rule (original format inference)  FAIL

Evidence

Full evidence JSON: apr-model-qa-playbook/certifications/qwen2-5-coder-0-5b-instruct/evidence.json
