Qwen2.5-Coder-0.5B MVP certification blocked: 4 conversion pipeline defects

## Summary

MVP certification for **Qwen/Qwen2.5-Coder-0.5B-Instruct** is BLOCKED at MQS 405/1000 (Grade F). All 18 basic inference tests pass (G1-G4 ✓ across GGUF, APR, SafeTensors on CPU+GPU), but **15 out of 31 tests fail** due to 4 distinct defects in the `apr` conversion and inference pipeline.

**apr version:** 0.2.12
**QA framework:** apr-model-qa-playbook (commit 2f376d8)
**Model:** Qwen/Qwen2.5-Coder-0.5B-Instruct (494M params, SafeTensors canonical source)

## Certification Evidence

```
Scenarios: 31
Passed:    18  (all basic inference G1-G4 across 3 formats × 2 backends)
Failed:    15  (all conversion pipeline)
Pass rate: 58.1%
MQS Score: 405/1000
Grade:     F
Status:    BLOCKED
```

### Throughput (6-column profiling)

| Format | CPU (tok/s) | GPU (tok/s) |
|--------|-------------|-------------|
| GGUF | 2.9 | 252.7 |
| APR | 7.2 | 0.8 |
| SafeTensors | 0.0 | 0.0 |

## Defect 1: `apr rosetta convert` produces files with no extension (P0)

**Affects:** 6 conversion tests on CPU (F-CONV-G-A, F-CONV-A-G, F-CONV-G-S, F-CONV-S-G, F-CONV-A-S, F-CONV-S-A) + 1 golden rule test (F-GOLDEN-RULE-001)

**Error:**
```
Security error: Invalid model file extension: '.'. Expected one of: gguf, safetensors, apr, bin
```

**Root cause:** When `apr rosetta convert` produces a target file, the output path has a bare `.` extension (no format suffix). Subsequent inference on the converted file fails because `apr run` validates file extensions and rejects `.` as invalid.

**Reproduction:**
```bash
apr rosetta convert source.gguf target.apr   # produces file, but...
apr run --prompt "2+2" --max-tokens 5 target.apr  # fails: extension '.'
```
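
The rejection can be illustrated with a minimal sketch of an extension allow-list check. This is not `apr`'s actual validation code: the function names `file_ext` and `is_valid_model_ext` are hypothetical, and only the allow-list itself is taken from the error message above. It shows how a path ending in a bare `.` yields an empty extension and is rejected.

```bash
#!/usr/bin/env bash
# Hypothetical sketch, not apr's implementation.
# The allow-list is copied from the error message in this issue.
ALLOWED_EXTS="gguf safetensors apr bin"

# Print the text after the last dot of the file name.
# Prints nothing when there is no dot or the name ends in a bare '.'.
file_ext() {
  local name="${1##*/}"          # strip directory components
  case "$name" in
    *.*) printf '%s' "${name##*.}" ;;
    *)   printf '' ;;
  esac
}

# Succeed only if the path carries one of the allowed extensions.
is_valid_model_ext() {
  local ext allowed
  ext="$(file_ext "$1")"
  [ -n "$ext" ] || return 1      # empty extension: 'target.' or 'target'
  for allowed in $ALLOWED_EXTS; do
    [ "$ext" = "$allowed" ] && return 0
  done
  return 1
}
```

Under these assumptions, `is_valid_model_ext target.apr` succeeds, while `is_valid_model_ext target.` fails, which matches the symptom reported for converted files.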

## Defect 2: `apr run` does not accept `--gpu` flag (P0)

**Affects:** 6 conversion tests on GPU (F-CONV-G-A, F-CONV-A-G, F-CONV-G-S, F-CONV-S-G, F-CONV-A-S, F-CONV-S-A)

**Error:**
```
error: unexpected argument '--gpu' found

  tip: to pass '--gpu' as a value, use '-- --gpu'

Usage: apr run --prompt <PROMPT> --max-tokens <MAX_TOKENS> <SOURCE>
```

**Root cause:** The `apr run` subcommand has no `--gpu` flag in its clap argument definitions, but the QA framework passes `--gpu` for GPU backend tests. Either `apr run` needs to accept `--gpu`, or the QA framework's conversion tests need to use a different flag (e.g., `--device gpu`).

**Reproduction:**
```bash
apr run --prompt "2+2" --max-tokens 5 --gpu model.gguf
# error: unexpected argument '--gpu' found
```
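
Until the flag question is settled, a QA harness could check the usage text before passing a backend flag. The sketch below is only an illustration: `usage_accepts_flag` is a hypothetical helper, and the usage string is the one printed in the error above, not output captured from any other command.

```bash
#!/usr/bin/env bash
# Usage line copied from the error output in this issue.
APR_RUN_USAGE='Usage: apr run --prompt <PROMPT> --max-tokens <MAX_TOKENS> <SOURCE>'

# Hypothetical helper: succeed if the usage text mentions the flag as a whole word.
# -F treats the flag as a fixed string, -w requires a whole-word match,
# and -e keeps grep from parsing the flag as one of its own options.
usage_accepts_flag() {
  local usage="$1" flag="$2"
  printf '%s\n' "$usage" | grep -qwF -e "$flag"
}

if usage_accepts_flag "$APR_RUN_USAGE" --gpu; then
  echo "--gpu is accepted"
else
  echo "--gpu is not accepted; skip or adapt GPU conversion tests"
fi
```

A check like this would let the framework report the GPU conversion tests as skipped or misconfigured rather than as fidelity failures.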

## Defect 3: `apr rosetta convert` round-trip fails on extension detection (P1)

**Affects:** 2 round-trip tests (F-CONV-RT-001 on CPU and GPU)

**Error:**
```
Validation failed: Source inspection failed: Invalid model format: No file extension found
```

**Root cause:** The round-trip conversion chain (format A → B → A) produces intermediate files that lack proper extensions, causing the second conversion step to fail at source inspection.
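
One way to keep every hop detectable is to derive intermediate paths with explicit extensions up front. The following is a sketch under stated assumptions: `roundtrip_paths` and the `.rt1`/`.rt2` naming are hypothetical and do not reflect how `apr` or the QA framework names files today.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: derive the two intermediate paths for an A -> B -> A
# round trip so that each file carries the extension of its own format.
roundtrip_paths() {
  local source="$1" mid_ext="$2" workdir="$3"
  local base="${source##*/}"      # file name without directories
  local src_ext="${base##*.}"     # extension of the source, e.g. gguf
  local stem="${base%.*}"         # file name without its extension
  printf '%s\n' "$workdir/$stem.rt1.$mid_ext" "$workdir/$stem.rt2.$src_ext"
}

# Example: GGUF -> APR -> GGUF
roundtrip_paths model.gguf apr /tmp/rt
# Each printed path could then be used as the target of one
# `apr rosetta convert <source> <target>` step.
```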

## Defect 4: SafeTensors→GGUF conversion crashes on tensor size validation (P1)

**Not caught by certification** (the certification run uses a pre-converted GGUF), but this defect blocks the provenance chain.

**Error:**
```
Validation failed: Conversion failed: Invalid model format:
Tensor 'model.layers.15.self_attn.o_proj.weight' data exceeds file size
```

**Reproduction:**
```bash
apr rosetta convert model.safetensors model.gguf
# error: Tensor 'model.layers.15.self_attn.o_proj.weight' data exceeds file size
```

**Root cause:** The GGUF writer miscalculates tensor data offsets or file size when converting from SafeTensors for this specific model architecture (Qwen2, 24 layers, 896 embed dim). The conversion fails consistently at layer 15's `o_proj.weight`.
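
The check behind "data exceeds file size" can be sketched as simple byte arithmetic. This is an illustration, not `apr`'s code: the function names are hypothetical, the 896 × 896 shape for `o_proj.weight` is an assumption derived from the 896 embedding dimension, and 2 bytes per element (F16/BF16) is also assumed.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of a tensor bounds check; all values are byte counts.
# data_start:    where the tensor data section begins in the file
# tensor_offset: offset of this tensor relative to data_start
# tensor_bytes:  size of this tensor's data
tensor_fits() {
  local data_start="$1" tensor_offset="$2" tensor_bytes="$3" file_size="$4"
  [ $(( data_start + tensor_offset + tensor_bytes )) -le "$file_size" ]
}

# Byte size of an unquantized 2-D tensor: rows * cols * bytes per element.
tensor_bytes_2d() {
  echo $(( $1 * $2 * $3 ))
}

# Assumed shape 896 x 896 at 2 bytes per element.
tensor_bytes_2d 896 896 2
```

If the writer records an offset or size that is too large, or finalizes the file before all tensor data is written, the sum on the left exceeds the file size and a reader reports exactly this kind of error.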

## Impact

- **Certification blocked** for Qwen2.5-Coder-0.5B: it cannot reach PROVISIONAL, which requires a ≥90% pass rate (currently 58.1%).
- **Defects 1 and 2 affect all models:** the same conversion tests fail for every Qwen Coder size. The 1.5B, 3B, 7B, 14B, and 32B models are PROVISIONAL only because conversion test failures don't trigger G1-G4 gateway zeroing.
- **Provenance chain broken:** SafeTensors→GGUF conversion fidelity cannot be verified (PROV-003).
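
The gap to PROVISIONAL can be worked out from the reported figures (18 passed out of 31 scenarios, 90% threshold). The helper names below are hypothetical, and the sketch assumes the threshold is applied to the plain pass rate.

```bash
#!/usr/bin/env bash
# Pass rate in tenths of a percent, rounded to nearest, using integer arithmetic.
pass_rate_tenths() {
  local passed="$1" total="$2"
  echo $(( (passed * 1000 + total / 2) / total ))
}

# Smallest number of passing scenarios that reaches a whole-percent threshold.
min_passes_for() {
  local total="$1" percent="$2"
  echo $(( (total * percent + 99) / 100 ))
}

pass_rate_tenths 18 31   # 581, i.e. 58.1%
min_passes_for 31 90     # 28
```

With 31 scenarios, reaching 90% takes 28 passes, so 10 more scenarios would need to pass.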

## Suggested Fixes

1. **Defect 1:** Fix `rosetta convert` output path logic to preserve target file extension
2. **Defect 2:** Add `--gpu` / `--device` flag to `apr run` subcommand
3. **Defect 3:** Ensure intermediate files in conversion chains retain proper extensions
4. **Defect 4:** Fix GGUF writer tensor offset calculation for Qwen2 architecture

## QA Gate Reference

| Gate ID | Description | Status |
|---------|-------------|--------|
| F-CONV-G-A | GGUF→APR conversion fidelity | FAIL |
| F-CONV-A-G | APR→GGUF conversion fidelity | FAIL |
| F-CONV-G-S | GGUF→SafeTensors conversion fidelity | FAIL |
| F-CONV-S-G | SafeTensors→GGUF conversion fidelity | FAIL |
| F-CONV-A-S | APR→SafeTensors conversion fidelity | FAIL |
| F-CONV-S-A | SafeTensors→APR conversion fidelity | FAIL |
| F-CONV-RT-001 | Round-trip conversion (A→B→A) | FAIL |
| F-GOLDEN-RULE-001 | Golden rule (original format inference) | FAIL |

## Evidence

Full evidence JSON: `apr-model-qa-playbook/certifications/qwen2-5-coder-0-5b-instruct/evidence.json`
