Summary
MVP certification for Qwen/Qwen2.5-Coder-0.5B-Instruct is BLOCKED at MQS 405/1000 (Grade F). All 18 basic inference tests pass (G1-G4 ✓ across GGUF, APR, SafeTensors on CPU+GPU), but 15 out of 31 tests fail due to 4 distinct defects in the apr conversion and inference pipeline.
apr version: 0.2.12
QA framework: apr-model-qa-playbook (commit 2f376d8)
Model: Qwen/Qwen2.5-Coder-0.5B-Instruct (494M params, SafeTensors canonical source)
Certification Evidence
Scenarios: 31
Passed: 18 (all basic inference G1-G4 across 3 formats × 2 backends)
Failed: 15 (all conversion pipeline)
Pass rate: 58.1%
MQS Score: 405/1000
Grade: F
Status: BLOCKED
Throughput (6-column profiling)
| Format | CPU (tok/s) | GPU (tok/s) |
|---|---|---|
| GGUF | 2.9 | 252.7 |
| APR | 7.2 | 0.8 |
| SafeTensors | 0.0 | 0.0 |
Defect 1: apr rosetta convert produces files with no extension (P0)
Affects: 6 conversion tests on CPU (F-CONV-G-A, F-CONV-A-G, F-CONV-G-S, F-CONV-S-G, F-CONV-A-S, F-CONV-S-A) + 1 golden rule test (F-GOLDEN-RULE-001)
Error:
Security error: Invalid model file extension: '.'. Expected one of: gguf, safetensors, apr, bin
Root cause: When apr rosetta convert produces a target file, the output path has a bare . extension (no format suffix). Subsequent inference on the converted file fails because apr run validates file extensions and rejects . as invalid.
Reproduction:
apr rosetta convert source.gguf target.apr # produces file, but...
apr run --prompt "2+2" --max-tokens 5 target.apr # fails: extension '.'
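The failure mode can be illustrated with a minimal Python sketch. This is not apr's validation code: the helper name check_model_extension and the use of os.path.splitext are assumptions made for illustration, and only the allowed-extension list is taken from the error message above. It shows how a path ending in a bare dot yields the extension '.' and is rejected.

```python
import os

# Allowed extensions, copied from the error message above.
ALLOWED = ("gguf", "safetensors", "apr", "bin")


def check_model_extension(path: str) -> str:
    """Return the extension without its dot, or raise ValueError.

    Hypothetical helper; apr's real check may be implemented differently.
    """
    _, ext = os.path.splitext(path)  # "target." -> ".", "target" -> ""
    name = ext[1:].lower()
    if name not in ALLOWED:
        raise ValueError(
            f"Invalid model file extension: '{ext}'. "
            f"Expected one of: {', '.join(ALLOWED)}"
        )
    return name
```

Under these assumptions, check_model_extension("target.apr") returns "apr", while check_model_extension("target.") raises with the same '.' complaint seen in the report.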
Defect 2: apr run does not accept --gpu flag (P0)
Affects: 6 conversion tests on GPU (F-CONV-G-A, F-CONV-A-G, F-CONV-G-S, F-CONV-S-G, F-CONV-A-S, F-CONV-S-A)
Error:
error: unexpected argument '--gpu' found
tip: to pass '--gpu' as a value, use '-- --gpu'
Usage: apr run --prompt <PROMPT> --max-tokens <MAX_TOKENS> <SOURCE>
Root cause: The apr run subcommand has no --gpu flag in its clap argument definitions, but the QA framework passes --gpu for GPU backend tests. Either apr run needs to accept --gpu, or the QA framework's conversion tests need to use a different flag (e.g., --device gpu).
Reproduction:
apr run --prompt "2+2" --max-tokens 5 --gpu model.gguf
# error: unexpected argument '--gpu' found
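The mismatch between the parser definition and the QA framework's command line can be sketched with Python's argparse as a stand-in for clap. The parser below is hypothetical and only mirrors the usage string shown in the error; it is not apr's argument definition, and the boolean --gpu flag is one possible shape for the fix.

```python
import argparse


def build_parser(accept_gpu: bool) -> argparse.ArgumentParser:
    """Build a stand-in for the `apr run` argument parser (hypothetical)."""
    parser = argparse.ArgumentParser(prog="apr run")
    parser.add_argument("--prompt", required=True)
    parser.add_argument("--max-tokens", type=int, required=True)
    if accept_gpu:
        # Assumed fix: a simple boolean flag selecting the GPU backend.
        parser.add_argument("--gpu", action="store_true")
    parser.add_argument("source")
    return parser


argv = ["--prompt", "2+2", "--max-tokens", "5", "--gpu", "model.gguf"]

# Without the flag defined, --gpu is left over as an unrecognized argument.
_, leftover = build_parser(accept_gpu=False).parse_known_args(argv)

# With the flag defined, the same command line parses cleanly.
args, extra = build_parser(accept_gpu=True).parse_known_args(argv)
```

Here parse_known_args is used instead of parse_args so the unrecognized flag can be inspected rather than terminating the process.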
Defect 3: apr rosetta convert round-trip fails on extension detection (P1)
Affects: 2 round-trip tests (F-CONV-RT-001 on CPU and GPU)
Error:
Validation failed: Source inspection failed: Invalid model format: No file extension found
Root cause: The round-trip conversion chain (format A → B → A) produces intermediate files that lack proper extensions, causing the second conversion step to fail at source inspection.
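One way to keep every step of the chain inspectable is to derive each intermediate path with an explicit format suffix. The sketch below is an assumption about how such naming could work; round_trip_paths and the "-rt" naming scheme are hypothetical and do not reflect how apr or the QA framework names files.

```python
import os


def round_trip_paths(source: str, via_fmt: str) -> list[str]:
    """Return the paths for an A -> B -> A chain, each with an explicit extension.

    Hypothetical helper for illustration only.
    """
    root, src_ext = os.path.splitext(source)
    src_fmt = src_ext.lstrip(".")
    if not src_fmt:
        raise ValueError(f"source path has no format extension: {source!r}")
    intermediate = f"{root}-rt.{via_fmt}"    # format B
    restored = f"{root}-rt-back.{src_fmt}"   # back to format A
    return [source, intermediate, restored]
```

For example, round_trip_paths("model.gguf", "apr") gives model.gguf, model-rt.apr and model-rt-back.gguf, so the second conversion step always receives a source file whose format can be detected from its name.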
Defect 4: SafeTensors→GGUF conversion crashes on tensor size validation (P1)
Not caught by certification (uses pre-converted GGUF), but blocks the provenance chain.
Error:
Validation failed: Conversion failed: Invalid model format:
Tensor 'model.layers.15.self_attn.o_proj.weight' data exceeds file size
Reproduction:
apr rosetta convert model.safetensors model.gguf
# error: Tensor 'model.layers.15.self_attn.o_proj.weight' data exceeds file size
Root cause: The GGUF writer miscalculates tensor data offsets or file size when converting from SafeTensors for this specific model architecture (Qwen2, 24 layers, 896 embed dim). Fails consistently at layer 15's o_proj.weight.
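The validation that trips here amounts to a bounds check of each tensor's byte range against the file size. The sketch below is a hedged illustration, not the apr reader or writer: TensorEntry and tensors_exceeding_file are hypothetical names, and the numbers assume an 896 x 896 o_proj weight stored as 2-byte values, which the report does not confirm.

```python
from dataclasses import dataclass


@dataclass
class TensorEntry:
    name: str
    offset: int  # byte offset relative to the start of the tensor data section
    nbytes: int  # size of the tensor data in bytes


def tensors_exceeding_file(entries, data_start, file_size):
    """Return names of tensors whose byte range ends past the end of the file."""
    return [e.name for e in entries if data_start + e.offset + e.nbytes > file_size]


# Illustrative numbers only: an 896 x 896 weight stored as 2-byte values.
O_PROJ_BYTES = 896 * 896 * 2
entries = [
    TensorEntry("model.layers.14.self_attn.o_proj.weight", 0, O_PROJ_BYTES),
    TensorEntry("model.layers.15.self_attn.o_proj.weight", O_PROJ_BYTES, O_PROJ_BYTES),
]

# A file that is one byte too short to hold the second tensor.
bad = tensors_exceeding_file(
    entries, data_start=1024, file_size=1024 + 2 * O_PROJ_BYTES - 1
)
```

A writer that records an offset or size that is too large, or that truncates the data section, would produce exactly this kind of out-of-bounds entry for the first tensor that crosses the end of the file.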
Impact
- Certification blocked for Qwen2.5-Coder-0.5B — cannot achieve PROVISIONAL (requires ≥90% pass rate, currently 58.1%)
- Defects 1 and 2 affect all models — the same conversion tests fail for all Qwen Coder sizes (1.5B, 3B, 7B, 14B, 32B are PROVISIONAL only because conversion test failures don't trigger G1-G4 gateway zeroing)
- Provenance chain broken — cannot verify SafeTensors→GGUF conversion fidelity (PROV-003)
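The gap to PROVISIONAL in the first bullet can be checked with a few lines of arithmetic. The scenario and pass counts come from the Certification Evidence section; the variable names are illustrative.

```python
import math

scenarios = 31
passed = 18
threshold = 0.90  # PROVISIONAL requires a pass rate of at least 90%

pass_rate = passed / scenarios
# Smallest number of passing scenarios that meets the threshold.
needed = math.ceil(threshold * scenarios)
shortfall = needed - passed

print(f"pass rate: {pass_rate:.1%}, need {needed} passes, short by {shortfall}")
```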
Suggested Fixes
- Defect 1: Fix rosetta convert output path logic to preserve target file extension
- Defect 2: Add --gpu / --device flag to apr run subcommand
- Defect 3: Ensure intermediate files in conversion chains retain proper extensions
- Defect 4: Fix GGUF writer tensor offset calculation for Qwen2 architecture
QA Gate Reference
| Gate ID | Description | Status |
|---|---|---|
| F-CONV-G-A | GGUF→APR conversion fidelity | FAIL |
| F-CONV-A-G | APR→GGUF conversion fidelity | FAIL |
| F-CONV-G-S | GGUF→SafeTensors conversion fidelity | FAIL |
| F-CONV-S-G | SafeTensors→GGUF conversion fidelity | FAIL |
| F-CONV-A-S | APR→SafeTensors conversion fidelity | FAIL |
| F-CONV-S-A | SafeTensors→APR conversion fidelity | FAIL |
| F-CONV-RT-001 | Round-trip conversion (A→B→A) | FAIL |
| F-GOLDEN-RULE-001 | Golden rule (original format inference) | FAIL |
Evidence
Full evidence JSON: apr-model-qa-playbook/certifications/qwen2-5-coder-0-5b-instruct/evidence.json