lm_head.weight not excluded from quantization (same bug as embeddings) #234

## Bug Report

**Source**: `tiny-model-ground-truth` parity checker (0/59 passing)
**Severity**: Critical — blocks ALL inference for LLaMA-style and Qwen architectures
**Related**: Follow-up to #231 and #232 (the embedding fix is applied, but lm_head has the identical bug)

## Description

The GH-231/232 fix correctly added embeddings to `should_skip_quant()` in `add_f32_tensor_to_writer()`, but `lm_head.weight` was not included. The lm_head exhibits the exact same corruption patterns the embeddings had before the fix:

- **Int8**: Shape mismatch (about 7M elements loaded vs 28M expected). The quantized bytes are stored as f32 without accounting for the 4:1 packing ratio.
- **Int4**: All-zero data (100% zeros, so the density check fails). The data is either written at the wrong offset or not written at all.

## Affected Models

| Model | Quant | Tensor | Expected Elements | Got Elements | Error |
|-------|-------|--------|-------------------|-------------|-------|
| SmolLM-135M | Int8 | lm_head.weight | 28,311,552 | 7,077,889 | F-LAYOUT-CONTRACT-001 shape mismatch |
| SmolLM-135M | Int4 | lm_head.weight | 28,311,552 | 28,311,552 | F-DATA-QUALITY-001 100% zeros |
| Qwen2-0.5B | Int8 | lm_head.weight | 136,134,656 | 34,033,665 | F-LAYOUT-CONTRACT-001 shape mismatch |
| Qwen2-0.5B | Int4 | lm_head.weight | 136,134,656 | 136,134,656 | F-DATA-QUALITY-001 100% zeros |
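
The Int8 counts in the table are consistent with the packing-ratio explanation, and a little arithmetic makes that concrete. The sketch below treats the reported f32 element count as raw bytes and subtracts the expected one-byte-per-element Int8 payload. For both models exactly 4 bytes are left over. What those 4 bytes hold (a per-tensor scale, for example) is an assumption and is not confirmed by anything in this report. The helper name `leftover_bytes` is hypothetical.

```rust
/// Bytes occupied by one f32 element.
const F32_BYTES: u64 = 4;

/// Bytes left over if `got_f32` f32 elements are explained as `expected`
/// one-byte Int8 values that were re-read four bytes at a time.
fn leftover_bytes(expected: u64, got_f32: u64) -> u64 {
    got_f32 * F32_BYTES - expected
}

fn main() {
    // (model, expected elements, elements the loader reported), from the table.
    let cases = [
        ("SmolLM-135M", 28_311_552u64, 7_077_889u64),
        ("Qwen2-0.5B", 136_134_656, 34_033_665),
    ];
    for (model, expected, got) in cases {
        println!(
            "{model}: {got} f32 elements = {} bytes, {} more than {expected} Int8 bytes",
            got * F32_BYTES,
            leftover_bytes(expected, got)
        );
    }
}
```

The constant 4-byte remainder across two different tensor sizes suggests a fixed-size trailer or header rather than padding proportional to the tensor.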

## Error Output (SmolLM Int8)

```
[APR-LOAD] Embedding tensor 'model.embed_tokens.weight': dims=[49152, 576]
[APR-LOAD] Token 0 embedding sample: [-0.3789, -0.2188, 0.0276, -0.2617, -0.2314]  ← FIXED, good data
[APR-LOAD] Embedding loaded: 28311552 elements  ← FIXED, correct count

[APR-LOAD] LM head tensor 'lm_head.weight': dims=[49152, 576], dtype=9
[APR-LOAD] LM head loaded: 7077889 elements  ← BROKEN, same 4:1 ratio bug
error: F-LAYOUT-CONTRACT-001 Tensor 'lm_head_weight': Shape mismatch: got 7077889, expected 28311552
```
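
For readers unfamiliar with the contract, the check behind `F-LAYOUT-CONTRACT-001` can be pictured as a comparison between the product of the declared dims and the number of elements actually loaded. The following is only an illustrative sketch: `check_layout` is a hypothetical name, and the real loader's implementation is not shown in this report.

```rust
/// Hypothetical layout check: the loaded element count must equal the
/// product of the declared dimensions.
fn check_layout(name: &str, dims: &[usize], loaded: usize) -> Result<(), String> {
    let expected: usize = dims.iter().product();
    if loaded == expected {
        Ok(())
    } else {
        Err(format!(
            "F-LAYOUT-CONTRACT-001 Tensor '{name}': Shape mismatch: got {loaded}, expected {expected}"
        ))
    }
}

fn main() {
    let dims = [49152, 576];
    // Embedding after the GH-231/232 fix: the count matches the dims.
    assert!(check_layout("model.embed_tokens.weight", &dims, 28_311_552).is_ok());
    // lm_head in the Int8 file: roughly a quarter of the expected elements.
    if let Err(msg) = check_layout("lm_head_weight", &dims, 7_077_889) {
        println!("{msg}");
    }
}
```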

## Fix

Add `lm_head` to `should_skip_quant()` in `src/format/converter/write.rs`. In the affected models the lm_head is a tied weight that mirrors the embedding, so it should never be quantized.

Patterns to match: `lm_head`, `lm_head.weight`, `output.weight`
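
A minimal sketch of what the extended predicate might look like follows. The real signature of `should_skip_quant()` and its existing embedding patterns are not shown in this issue, so the `&str -> bool` shape and the `embed_tokens` pattern (taken from the tensor name in the load log) are assumptions.

```rust
/// Hypothetical sketch: returns true when a tensor must stay unquantized (f32).
fn should_skip_quant(name: &str) -> bool {
    // Assumed embedding pattern, based on the tensor name in the load log.
    let is_embedding = name.contains("embed_tokens");
    // `lm_head` as a substring covers `lm_head.weight` and prefixed variants.
    let is_lm_head = name.contains("lm_head");
    // `output.weight` is matched exactly: a substring test would also catch
    // unrelated names such as `blk.0.attn_output.weight`.
    let is_output_head = name == "output.weight";
    is_embedding || is_lm_head || is_output_head
}

fn main() {
    for name in [
        "model.embed_tokens.weight",
        "lm_head.weight",
        "output.weight",
        "blk.0.attn_output.weight",
        "model.layers.0.self_attn.q_proj.weight",
    ] {
        println!("{name}: skip_quant = {}", should_skip_quant(name));
    }
}
```

One design point worth checking in the actual fix: `output.weight` is safer as an exact match than as a substring, since attention output projections in some naming schemes end with the same suffix and should still be quantized.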

## Reproduction

```bash
cd tiny-model-ground-truth
make clean && make convert
apr run models/smollm-135m-int8.apr -p "Hello" -n 32 --json
# → F-LAYOUT-CONTRACT-001 on lm_head_weight
```

## Environment

- `apr` v0.2.16 (f39b7dfd) + GH-231/232 embedding fix applied
- Platform: Linux x86_64
