openai-compatible discovery defaults modalities to Text, then the CLI persists it as a permanent override #1290

## What happens

Assign a model from an `openai-compatible` provider (llama.cpp, vLLM, llama-server, and friends) with `netclaw model set`, and the saved config comes back with `"InputModalities": "Text"` and `"OutputModalities": "Text"`, even when the model handles images or video. Once that value is on disk the daemon never corrects it. A config-level modality is treated as authoritative and short-circuits capability resolution at startup, so the vLLM/llama.cpp backend strategy, the OpenRouter oracle, and the HuggingFace resolver never get a vote.

The visible result is a multimodal model that quietly behaves as text-only. Image attachments get dropped before they ever reach the model, and `netclaw daemon status` reports `input: Text`.

## Why

The OpenAI-style `/v1/models` listing has no modality field, so the parser never sets one. It builds each `DiscoveredModel` with an id and a context window and nothing else:

- https://github.com/netclaw-dev/netclaw/blob/60601c6cc82cdffbc38c61a062aac28d2d8b3444/src/Netclaw.Providers/ProbeHelpers.cs#L36-L43
- (openai-compatible routes straight through that helper) https://github.com/netclaw-dev/netclaw/blob/60601c6cc82cdffbc38c61a062aac28d2d8b3444/src/Netclaw.Providers/SelfHosted/OpenAiCompatibleDescriptor.cs#L50-L51
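
To make the missing field concrete, here is a minimal sketch that parses a typical OpenAI-style `/v1/models` payload and checks for modality properties. The sample payload, the model id, and the `ReportsModalities` helper are illustrative assumptions, not netclaw code; the property names are the ones the Codex OAuth catalog is described as using further down.

```csharp
using System;
using System.Text.Json;

// Sample payload in the usual OpenAI list shape; the id is made up.
const string json = """
{
  "object": "list",
  "data": [
    { "id": "example-vision-model", "object": "model", "created": 1700000000, "owned_by": "local" }
  ]
}
""";

// Hypothetical helper: true only if the entry carries either modality property.
bool ReportsModalities(JsonElement model) =>
    model.TryGetProperty("input_modalities", out _) ||
    model.TryGetProperty("output_modalities", out _);

using var doc = JsonDocument.Parse(json);
foreach (var model in doc.RootElement.GetProperty("data").EnumerateArray())
{
    var id = model.GetProperty("id").GetString();
    Console.WriteLine($"{id}: modalities reported = {ReportsModalities(model)}");
}
```

With a payload like this there is nothing for a parser to read, so any modality value attached afterwards is a default rather than an observation.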

`DiscoveredModel` then falls back to its defaults, which are `Text`:

- https://github.com/netclaw-dev/netclaw/blob/60601c6cc82cdffbc38c61a062aac28d2d8b3444/src/Netclaw.Configuration/DiscoveredModel.cs#L31-L34

So discovery hands back `Text` not because it detected a text-only model, but because it never looked. The persistence step then takes that default and writes it as if an operator had deliberately chosen it:

- https://github.com/netclaw-dev/netclaw/blob/60601c6cc82cdffbc38c61a062aac28d2d8b3444/src/Netclaw.Cli/Model/ModelCommand.cs#L183-L186

That is the trap. An unknown gets promoted to a hard assertion, and on the next daemon boot the override beats real detection.
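
A simplified model of that precedence, as a sketch only: `Resolve` and `Detect` are hypothetical names, and the real daemon's resolution chain is more involved than a single null-coalescing expression. The point it illustrates is that a persisted value prevents detection from ever running.

```csharp
using System;

int detectionCalls = 0;

// Hypothetical stand-in for backend or oracle detection.
string? Detect()
{
    detectionCalls++;
    return "Text, Image";
}

// A configured value wins outright; detection only runs when it is absent.
string Resolve(string? configured) => configured ?? Detect() ?? "Text";

var withOverride = Resolve("Text");   // persisted default: detection is skipped
var withoutOverride = Resolve(null);  // unset: detection gets to answer

Console.WriteLine($"override={withOverride}, unset={withoutOverride}, detection calls={detectionCalls}");
```

Under these assumptions only the second call reaches `Detect`, which is why leaving the value unset matters more than picking a better default.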

## Same bug, second code path

The init wizard does the identical thing through its own code, so a fix in one place won't cover the other:

- https://github.com/netclaw-dev/netclaw/blob/60601c6cc82cdffbc38c61a062aac28d2d8b3444/src/Netclaw.Cli/Tui/Wizard/Steps/ProviderStepViewModel.cs#L319-L329

Worth fixing both together, and ideally collapsing them onto one shared write path.

## Suggested direction

- When discovery can't actually determine modalities, leave them unset rather than defaulting to `Text`, and don't persist a value the provider never reported. Unknown should mean "let the daemon resolve this," not "Text, forever."
- Only write modalities to config when they come from a source that genuinely knows. The OpenAI Codex OAuth catalog, for example, already resolves real `input_modalities`/`output_modalities`; a self-hosted `/v1/models` listing does not.
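
One way the two points above could look in practice, as a hedged sketch rather than a proposed patch: `BuildModelEntry` is a hypothetical helper, modalities are passed as nullable strings for brevity, and the model ids are made up. It only writes the modality keys when the caller actually has a value.

```csharp
using System;
using System.Text.Json.Nodes;

// Hypothetical helper: modality arguments are null when discovery could not determine them.
JsonObject BuildModelEntry(string modelId, string? inputModalities, string? outputModalities)
{
    var entry = new JsonObject { ["ModelId"] = modelId };

    // Persist only what a source actually reported; absent keys leave resolution to the daemon.
    if (inputModalities is not null) entry["InputModalities"] = inputModalities;
    if (outputModalities is not null) entry["OutputModalities"] = outputModalities;

    return entry;
}

// Self-hosted /v1/models listing: modalities unknown, so nothing is written.
var selfHosted = BuildModelEntry("example-vision-model", null, null);

// Catalog that reports real modalities: values are written.
var catalog = BuildModelEntry("example-catalog-model", "Text, Image", "Text");

Console.WriteLine(selfHosted.ToJsonString());
Console.WriteLine(catalog.ToJsonString());
```

In the real types this would likely mean making `InputModalities` and `OutputModalities` nullable on `DiscoveredModel` instead of initializing them to `ModelModality.Text`, though that is a guess at the shape of the fix.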

## Repro

1. Configure an `openai-compatible` provider pointed at a server hosting a vision-capable model.
2. Run `netclaw model set Main <provider> <model-id>`.
3. Look at the saved model entry: `InputModalities` is `Text`.
4. `netclaw daemon status` reports text-only input, and image attachments are dropped, regardless of the model's real capability.

## Related

#1267 is about surfacing modalities in `model discover`/`list`/TUI. This is the upstream cause of the bad data it would surface: discovery defaults modalities to `Text` and then bakes them into config as an override.


Referenced snippet, `src/Netclaw.Providers/ProbeHelpers.cs` lines 36-43:

	if (model.TryGetProperty("id", out var id))
	{
	    models.Add(new DiscoveredModel
	    {
	        ModelId = new(id.GetString()!),
	        ContextWindowTokens = readContextWindow(model)
	    });
	}


Referenced snippet, `src/Netclaw.Cli/Tui/Wizard/Steps/ProviderStepViewModel.cs` lines 319-329:

	    builder.Model = new ModelConfigSection
	    {
	        Provider = providerName,
	        ModelId = SelectedModelId,
	        ContextWindow = selectedModel?.ContextWindowTokens,
	        Provenance = selectedModel is null ? ModelDiscoverySource.Manual : ModelDiscoverySource.Live,
	        InputModalities = selectedModel?.InputModalities,
	        OutputModalities = selectedModel?.OutputModalities,
	    };
	}

Referenced snippet, `src/Netclaw.Providers/SelfHosted/OpenAiCompatibleDescriptor.cs` lines 50-51:

	internal static ProviderProbeResult ParseModels(string json)
	    => ProbeHelpers.ParseOpenAiStyleModels(json, TryReadContextWindow);

Referenced snippet, `src/Netclaw.Configuration/DiscoveredModel.cs` lines 31-34:

	public ModelModality InputModalities { get; init; } = ModelModality.Text;

	/// <summary>Content types the model can produce as output.</summary>
	public ModelModality OutputModalities { get; init; } = ModelModality.Text;

Referenced snippet, `src/Netclaw.Cli/Model/ModelCommand.cs` lines 183-186:

	if (discoveredModel is not null)
	{
	    modelEntry["InputModalities"] = discoveredModel.InputModalities.ToString();
	    modelEntry["OutputModalities"] = discoveredModel.OutputModalities.ToString();
