fix(providers): stop persisting guessed modalities; fix probe timeout & visibility (#1290, #1292) by Aaronontheweb · Pull Request #1311 · netclaw-dev/netclaw

Aaronontheweb · 2026-06-03T08:42:45Z

Fixes #1290 and #1292. Both are symptoms of one pattern in the openai-compatible discovery → probe → persist pipeline: a silent fallback ordered ahead of the source that should win. Fixed on one shared write path.

#1290 — modalities silently default to `Text`, then become a permanent override

DiscoveredModel.{Input,Output}Modalities are now ModelModality? (default null = "provider didn't report it") instead of non-nullable defaulting to Text, so "unknown" is distinguishable from "known text-only".
New shared ModelEntryWriter.BuildModelEntry is the single persist path; model set, the init wizard, and the TUI model manager all route through it and write modalities only when known. (Previously three hand-rolled copies — which is how the same bug landed in multiple places.)
ParseCodexModels no longer substitutes Text for an absent output_modalities.
Net effect: an openai-compatible/Ollama model persists no modalities → the daemon resolves real capabilities at runtime instead of being short-circuited by a baked-in Text.

#1292 — probe blocks silently / hard-fails at 10s

Ordering bug: the wizard stacked a 20s .WaitAsync over ExecuteProbeAsync's inner 10s CancelAfter; the inner timeout returned a normal failed result first, so the outer deadline + its TimeoutException handler were dead code (advertised 20s, failed at 10s). The same 20s wrapper also lived in ModelManagerViewModel and ProviderManagerViewModel, where — after the self-hosted timeout was raised — it would have truncated the longer budget.
Centralized all probe budgets in one ProbeTimeouts (Default 10s, SelfHosted 30s, InteractiveWallClock 45s, documented to stay > SelfHosted). Self-hosted descriptors use the 30s per-request budget; the three interactive callers apply the 45s wall-clock, which bounds pre-request work (e.g. OAuth token exchange) without truncating a slow /models probe.
Honest timeout message that names the actual endpoint and reads "may be slow/loading" rather than a hard failure; model discover/set surface the endpoint being probed and an elapsed-time line.

Review-driven follow-ups (2nd commit)

A max-effort review caught that the timeout de-stacking was initially incomplete (only the wizard) and that ProbeProgressReporter had disposal/erase issues; both addressed.

Validation

Netclaw.Configuration.Tests (366), Netclaw.Cli.Tests (787), Netclaw.Daemon.Tests provider/capability (100) — all pass. dotnet slopwatch analyze: 0 issues. Copyright headers verified.
Interactive smoke (init-wizard tape, isolated Linux container): passes end-to-end. netclaw doctor reports [PASS] Config Schema and [PASS] Context Window: Auto-detected 32,768 tokens for qwen2:0.5b (from running daemon) — confirming the openai-compatible discovery defaults modalities to Text, then the CLI persists it as a permanent override #1290 fix: with no modalities baked into config, the daemon resolves capabilities at runtime as intended.

Not included

Cosmetic probe-spinner animation improvements were intentionally dropped from this PR and tracked separately.

…meout (netclaw-dev#1290, netclaw-dev#1292) Both issues are symptoms of the same pattern: a silent fallback ordered ahead of the source that should win. Fix them on one shared write path. netclaw-dev#1290 — modalities silently default to Text, then become a permanent override - DiscoveredModel.{Input,Output}Modalities are now ModelModality? (default null = "provider didn't report it") instead of non-nullable defaulting to Text, so "unknown" is distinguishable from "known text-only". - New shared ModelEntryWriter.BuildModelEntry writes modalities only when known; model set, the init wizard, and the TUI model manager all route through it instead of hand-rolling three drifting copies. openai-compatible/Ollama now persist no modalities, so the daemon resolves real capabilities at runtime instead of being short-circuited by a baked-in Text. netclaw-dev#1292 — probe blocks silently and hard-fails at 10s - The wizard stacked a 20s WaitAsync over ExecuteProbeAsync's inner 10s CancelAfter; the inner timeout returned a normal failed result first, so the outer deadline and its TimeoutException handler were dead code (advertised 20s, failed at 10s). Removed the stacking — one authoritative timeout. - Self-hosted descriptors now pass a 30s probe timeout (cold vLLM/llama.cpp can legitimately need it); cloud stays at 10s. The timeout message names the actual endpoint and elapsed seconds and reads "may be slow/loading" rather than a hard failure. - model discover/set print the endpoint being probed (surfacing the silent localhost default) and a stderr elapsed-time ticker so slow != stuck. Regression guards: parser leaves modalities null; shared writer omits unknown modalities (and a flipped ModelManagerViewModel test that previously asserted the bug); wizard persists none for openai-compatible; Codex still persists real modalities; probe honors the supplied timeout and names the endpoint.

…w-dev#1290, netclaw-dev#1292) A max-effort review caught that the timeout de-stacking was incomplete and the same drift this PR set out to kill. Timeout (the high-severity miss): - The 20s outer .WaitAsync lived in THREE callers; the first pass de-stacked only the wizard, so ModelManagerViewModel and ProviderManagerViewModel still wrapped the probe in a 20s deadline that, after the inner self-hosted timeout was raised to 30s, actively truncated it — re-introducing the exact false "timed out" failure netclaw-dev#1292 fixed, in two TUIs. - De-stacking the wizard also dropped the only wall-clock bound covering pre-request work (OAuth token exchange) that the inner per-request timeout does not cover, leaving it bounded only by HttpClient (~100s). - Fix: centralize all probe budgets in one public ProbeTimeouts (Default 10s, SelfHosted 30s, InteractiveWallClock 45s, documented to stay > SelfHosted). All three interactive callers apply InteractiveWallClock as a whole-probe wall-clock that bounds pre-request work without truncating a slow self-hosted /models probe. No more per-view-model copies to drift. Codex modalities: - ParseCodexModels still substituted Text for an absent output_modalities — the same "persist a guessed Text" pathology the nullable contract forbids. Now passes modalities through as reported (input stays guarded by the fail-loud missing check); output omitted -> null -> daemon resolves at runtime. ProbeProgressReporter robustness/cleanup: - DisposeAsync now disposes the CTS in a finally and swallows terminal I/O faults, so a window/SSH closed mid-probe can't mask the real probe result. - Erase blanks exactly the width drawn (tracked) instead of a fixed 44, so long endpoints leave no residue. - Dropped the dead err/time injection seams (gated on Console.IsErrorRedirected, unreachable under any test runner). ModelCommand.ResolveProbeEndpoint now TrimEnd('/')s so the surfaced endpoint matches the one ExecuteProbeAsync actually probes.

Aaronontheweb · 2026-06-03T12:09:12Z

Follow-up tracked in #1312 — standardizing a self-animating Termina spinner component for layout-node views. The cosmetic probe-spinner animation work (and the re-entrant-redraw footgun surfaced here) lives there, intentionally kept out of this PR.

Match OllamaDescriptor, which already passes timeout: ProbeTimeouts.SelfHosted by name. The OpenAI-compatible call site passed it positionally as a bare trailing arg after ct, which read as two ambiguous timeout-ish values. No behavior change.

Aaronontheweb

LGTM but will need to put it through its paces in our beta channel first

Aaronontheweb · 2026-06-03T12:24:41Z

+public class ModelEntryWriterTests
+{
+    [Fact]
+    public void BuildModelEntry_UnknownModalities_OmitsModalityKeys()


Aaronontheweb added 2 commits June 3, 2026 08:38

Aaronontheweb mentioned this pull request Jun 3, 2026

Standardize a self-animating spinner component for Termina layout-node views #1312

Closed

Aaronontheweb added tui Terminal UI (Termina) issues bug Something isn't working config Configuration issues, netclaw doctor, schema validation. labels Jun 3, 2026

Aaronontheweb commented Jun 3, 2026

View reviewed changes

Aaronontheweb enabled auto-merge (squash) June 3, 2026 12:35

Aaronontheweb merged commit 87e51cc into netclaw-dev:dev Jun 3, 2026
14 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(providers): stop persisting guessed modalities; fix probe timeout & visibility (#1290, #1292)#1311

fix(providers): stop persisting guessed modalities; fix probe timeout & visibility (#1290, #1292)#1311
Aaronontheweb merged 3 commits into
netclaw-dev:devfrom
Aaronontheweb:fix/1290-1292-provider-discovery

Aaronontheweb commented Jun 3, 2026

Uh oh!

Aaronontheweb commented Jun 3, 2026

Uh oh!

Aaronontheweb left a comment

Uh oh!

Aaronontheweb Jun 3, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

Aaronontheweb commented Jun 3, 2026

#1290 — modalities silently default to Text, then become a permanent override

#1292 — probe blocks silently / hard-fails at 10s

Review-driven follow-ups (2nd commit)

Validation

Not included

Uh oh!

Aaronontheweb commented Jun 3, 2026

Uh oh!

Aaronontheweb left a comment

Choose a reason for hiding this comment

Uh oh!

Aaronontheweb Jun 3, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

#1290 — modalities silently default to `Text`, then become a permanent override