feat: auto-discover LoRA adapters from models directory

## Summary

Auto-discover LoRA adapter GGUF files alongside models, so that adapters placed in the `--models-dir` directory (or configured in a `--models-preset` INI file) are detected automatically and made available to the frontend and API without explicit `--lora` CLI flags.

Related: #13 (frontend LoRA toggler depends on adapters being discoverable)

## Can we deduce adapter↔model relationships from the files?

### What metadata is available

**LoRA adapter GGUF files contain:**
| Field | Value | Source |
|-------|-------|--------|
| `general.type` | `"adapter"` | `convert_lora_to_gguf.py:375` |
| `general.architecture` | e.g. `"llama"`, `"qwen2"` | `convert_lora_to_gguf.py` (inherited from base) |
| `adapter.type` | `"lora"` | `convert_lora_to_gguf.py:376` |
| `adapter.lora.alpha` | float | `convert_lora_to_gguf.py:380` |
| Tensor names + shapes | e.g. `blk.0.attn_q.weight.lora_a` (base tensor name plus `.lora_a` / `.lora_b`) | From training |

**Model GGUF files contain:**
| Field | Value |
|-------|-------|
| `general.architecture` | e.g. `"llama"`, `"qwen2"` |
| `general.name` | e.g. `"Llama 3.1 8B Instruct"` |
| Tensor names + shapes | Full model tensors |

### What's matchable

| Match criterion | Reliability | Requires tensor load? |
|----------------|-------------|----------------------|
| `general.architecture` must match | **Necessary but not sufficient** — all Llama models share `"llama"` | No (metadata only) |
| Tensor name compatibility | **Strong signal** — adapter tensor names must exist in model | Yes (header scan) |
| Tensor shape compatibility | **Definitive** — dimension mismatch = incompatible | Yes (header scan) |
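
The name and shape criteria in the table can be sketched as a small check over tensor headers. This is a rough illustration, not the code in `llama-adapter.cpp`: it assumes adapter tensors are named `<base tensor name>.lora_a` / `.lora_b`, that shapes are given in ggml dimension order (`W = [n_in, n_out]`, `A = [n_in, r]`, `B = [r, n_out]`), and it ignores any special handling of the token embedding tensor. The function name `check_lora_compat` and the dict-based inputs are hypothetical.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def check_lora_compat(model_shapes: Dict[str, Shape],
                      adapter_shapes: Dict[str, Shape]) -> List[str]:
    """Return a list of problems; an empty list means the headers look compatible."""
    # Pair up the A and B halves of each adapter tensor under its base tensor name.
    pairs: Dict[str, Dict[str, Shape]] = {}
    for name, shape in adapter_shapes.items():
        for suffix, slot in ((".lora_a", "a"), (".lora_b", "b")):
            if name.endswith(suffix):
                pairs.setdefault(name[: -len(suffix)], {})[slot] = shape

    problems: List[str] = []
    for base, ab in sorted(pairs.items()):
        if "a" not in ab or "b" not in ab:
            problems.append(f"{base}: missing lora_a or lora_b")
            continue
        if base not in model_shapes:
            # Name check: the adapter targets a tensor the model does not have.
            problems.append(f"{base}: not present in base model")
            continue
        w, a, b = model_shapes[base], ab["a"], ab["b"]
        if w[0] != a[0] or w[1] != b[1]:
            # Shape check: input/output dimensions must match the base tensor.
            problems.append(f"{base}: shape mismatch with base model")
        elif a[1] != b[0]:
            problems.append(f"{base}: lora_a and lora_b disagree on rank")
    return problems
```

For example, an adapter trained against a model with a 4096-wide `attn_q` passes against that model and fails against one whose `attn_q` is 8192 wide, even though both report the same `general.architecture`.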

### What's NOT in the metadata

- **No base model identifier** — the conversion script reads `base_model_name_or_path` from `adapter_config.json` but **does not embed it** in the GGUF (`convert_lora_to_gguf.py:335-345`)
- **No base model hash/UUID** — the GGUF spec defines `general.base_model.{id}.uuid` and `general.base_model.{id}.name` fields, and the writer has `add_base_model_*()` methods (`gguf_writer.py:608-636`), but **neither `convert_hf_to_gguf.py` nor `convert_lora_to_gguf.py` actually writes them**
- **No model size/layer count** stored explicitly in adapter metadata

### Conclusion

**Architecture-level matching is feasible from metadata alone** (read `general.architecture` from both files — cheap, no tensor loading). This narrows candidates but can't distinguish e.g. Llama-7B from Llama-70B.

**Exact matching requires loading tensor headers** from both files to compare names and shapes. This is what `llama-adapter.cpp:330-368` already does at runtime — it throws `"maybe wrong base model?"` on mismatch.

**Best practical approach for auto-discovery:**
1. Scan directory, identify adapters by `general.type = "adapter"` (metadata read only)
2. Group adapters by `general.architecture` — show only adapters matching the loaded model's architecture
3. Optionally validate tensor compatibility on first load attempt (the runtime already does this and gives clear errors)
4. Long-term: contribute upstream to have `convert_lora_to_gguf.py` write `general.base_model.0.name` — the writer infrastructure already exists
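
Steps 1 and 2 can be sketched with a metadata-only reader. The following is a minimal illustration in Python rather than the C++ that would live in `preset.cpp`; it assumes little-endian GGUF v2 or later (64-bit counts), and the names `read_gguf_metadata`, `discover`, and `WANTED` are hypothetical. It stops reading as soon as the wanted keys are found, so tensor data is never touched.

```python
import struct
from pathlib import Path
from typing import Any, BinaryIO, Dict, List, Set, Tuple

# GGUF scalar value type ids mapped to little-endian struct formats.
_SCALARS = {0: "<B", 1: "<b", 2: "<H", 3: "<h", 4: "<I", 5: "<i",
            6: "<f", 7: "<?", 10: "<Q", 11: "<q", 12: "<d"}
_STRING, _ARRAY = 8, 9
WANTED = {"general.type", "general.architecture"}


def _read(f: BinaryIO, fmt: str) -> Any:
    size = struct.calcsize(fmt)
    data = f.read(size)
    if len(data) != size:
        raise ValueError("truncated GGUF file")
    return struct.unpack(fmt, data)[0]


def _read_string(f: BinaryIO) -> str:
    length = _read(f, "<Q")
    data = f.read(length)
    if len(data) != length:
        raise ValueError("truncated GGUF string")
    return data.decode("utf-8", errors="replace")


def _read_value(f: BinaryIO, vtype: int) -> Any:
    if vtype in _SCALARS:
        return _read(f, _SCALARS[vtype])
    if vtype == _STRING:
        return _read_string(f)
    if vtype == _ARRAY:
        item_type = _read(f, "<I")
        count = _read(f, "<Q")
        return [_read_value(f, item_type) for _ in range(count)]
    raise ValueError(f"unknown GGUF value type {vtype}")


def read_gguf_metadata(f: BinaryIO, wanted: Set[str]) -> Dict[str, Any]:
    """Read only the key/value header and return the wanted keys that exist."""
    if f.read(4) != b"GGUF":
        raise ValueError("not a GGUF file")
    if _read(f, "<I") < 2:
        raise ValueError("GGUF v1 is not handled in this sketch")
    _read(f, "<Q")  # tensor count, unused here
    kv_count = _read(f, "<Q")
    found: Dict[str, Any] = {}
    for _ in range(kv_count):
        key = _read_string(f)
        value = _read_value(f, _read(f, "<I"))
        if key in wanted:
            found[key] = value
            if len(found) == len(wanted):
                break  # stop early, skip the rest of the header
    return found


def discover(models_dir: str) -> Tuple[List[str], Dict[str, List[str]]]:
    """Split *.gguf files into models and adapters grouped by architecture."""
    models: List[str] = []
    adapters: Dict[str, List[str]] = {}
    for path in sorted(Path(models_dir).glob("*.gguf")):
        try:
            with open(path, "rb") as f:
                meta = read_gguf_metadata(f, WANTED)
        except ValueError:
            continue  # unreadable or non-GGUF file: ignore it
        if meta.get("general.type") == "adapter":
            arch = meta.get("general.architecture", "unknown")
            adapters.setdefault(arch, []).append(path.name)
        else:
            models.append(path.name)
    return models, adapters
```

Files without a `general.type` key are treated as models here, which matches the current behaviour of treating every `.gguf` file as a model.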

## Implementation Plan

### Phase 1: Discovery in `preset.cpp`

Modify `load_from_models_dir()` (`common/preset.cpp:382-445`) to also scan for LoRA adapters:

- Currently it scans for `.gguf` files and treats them all as models
- Add GGUF metadata read for `general.type` — if `"adapter"`, categorize as LoRA adapter instead of model
- Also read `general.architecture` from adapter files for matching
- Store discovered adapters in a separate list/map alongside models
- May need a lightweight GGUF metadata reader (the full model loader is too heavy for scanning)

**Key consideration:** `load_from_models_dir()` currently does zero GGUF parsing — it only looks at filenames. Adding metadata reads means opening each file, which has performance implications for large directories. A naming convention fallback (e.g. files in a `loras/` subdirectory, or `*-lora-*.gguf` pattern) could supplement or replace metadata scanning.
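
The naming-convention fallback could look roughly like the sketch below, which classifies a path without opening the file. It uses the two example conventions mentioned above (a `loras/` subdirectory, or a `*-lora-*.gguf` file name); both are illustrative, not an agreed convention, and the function name `looks_like_lora` is hypothetical.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath


def looks_like_lora(rel_path: str) -> bool:
    """Guess from the path alone whether a file is a LoRA adapter."""
    path = PurePosixPath(rel_path)
    if path.suffix.lower() != ".gguf":
        return False
    # Convention 1: anything under a "loras" directory.
    if "loras" in path.parts[:-1]:
        return True
    # Convention 2: "-lora-" somewhere in the file name (case-insensitive).
    return fnmatch(path.name.lower(), "*-lora-*.gguf")
```

A filename check like this is cheap but can misclassify files, so it fits best as a pre-filter that decides which files are worth a metadata read.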

### Phase 2: INI preset support

Add `lora` and `lora-scaled` as valid keys in the INI preset parser (`common/preset.cpp:310-360`):

```ini
[my-model]
model = /path/to/model.gguf
lora = /path/to/adapter1.gguf,/path/to/adapter2.gguf
```

This is straightforward since the INI parser already maps keys to CLI argument names, and `--lora` is already a valid CLI arg.
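
To illustrate the key-to-argument mapping described above, here is a small sketch using Python's `configparser`. It is not the parser in `common/preset.cpp`; the function name `preset_to_args` is hypothetical, and it simply assumes each INI key `k` becomes the CLI flag `--k` followed by its value.

```python
import configparser
from typing import List


def preset_to_args(ini_text: str, section: str) -> List[str]:
    """Turn one preset section into a flat CLI argument list."""
    # interpolation=None keeps '%' in paths from being treated specially.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    args: List[str] = []
    for key, value in parser[section].items():
        args += [f"--{key}", value]
    return args


PRESET = """
[my-model]
model = /path/to/model.gguf
lora = /path/to/adapter1.gguf,/path/to/adapter2.gguf
"""

print(preset_to_args(PRESET, "my-model"))
```

Under that assumption the `lora` key needs no special handling: it passes through as `--lora` with the comma-separated value intact.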

### Phase 3: Router integration

- Pass discovered LoRA adapters to child processes when spawning models in router mode (`server-models.cpp:561`)
- Add a new API endpoint or extend `GET /v1/models` to include available adapters per model
- Expose adapter availability so the frontend (#13) can conditionally show the LoRA UI
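
One possible shape for the extended `GET /v1/models` response is sketched below. The `lora_adapters` field, its `id`/`path` entries, and the `architecture` key on the input records are all hypothetical and would need to be agreed with the frontend work in #13; only the outer `{"object": "list", "data": [...]}` envelope follows the existing OpenAI-style listing.

```python
from typing import Any, Dict, List


def list_models_with_adapters(models: List[Dict[str, str]],
                              adapters_by_arch: Dict[str, List[str]]) -> Dict[str, Any]:
    """Build a models listing where each model carries its candidate adapters."""
    data = []
    for model in models:
        # Only adapters with the same general.architecture are offered.
        compatible = adapters_by_arch.get(model["architecture"], [])
        data.append({
            "id": model["id"],
            "object": "model",
            "lora_adapters": [
                {"id": index, "path": path}
                for index, path in enumerate(compatible)
            ],
        })
    return {"object": "list", "data": data}
```

With a per-model list like this, the frontend can hide the LoRA UI whenever `lora_adapters` is empty.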

### Optional: `--lora-dir` flag

Add a dedicated `--lora-dir PATH` argument (parallel to `--models-dir`) for cases where adapters are stored separately from models. It would be added in `common/arg.cpp` around line 3030, next to the existing router CLI args.

## File Reference

| Component | File | Key Lines |
|-----------|------|-----------|
| Model directory scanner | `common/preset.cpp` | 382-445 |
| INI preset parser | `common/preset.cpp` | 310-360 |
| LoRA adapter loading & validation | `src/llama-adapter.cpp` | 165-239 (metadata), 330-368 (tensor validation) |
| LoRA arch keys | `src/llama-arch.cpp` | 136-137, 318-322 |
| GGUF adapter constants | `gguf-py/gguf/constants.py` | 76-85, 280-285 |
| GGUF base_model writer methods (unused) | `gguf-py/gguf/gguf_writer.py` | 608-636 |
| LoRA conversion (doesn't write base_model) | `convert_lora_to_gguf.py` | 335-345, 374-403 |
| Router model management | `tools/server/server-models.cpp` | 242-375 |
| LoRA CLI args | `common/arg.cpp` | 2473-2496 |
| Router CLI args | `common/arg.cpp` | 3004-3030 |
