fix(providers): ignore zero context windows#1285
Conversation
Review — fix(providers): ignore zero context windowsVerdict: LGTM. Clean, well-scoped fix that resolves #1280 (llama.cpp router-mode What it does well
Notes (non-blocking)
Verification
Labeled: |
|
Claude approves, lol, but I'll need to look at it myself when I get a moment |
Aaronontheweb
left a comment
There was a problem hiding this comment.
Finding some issues...
| // Final safety net: provider parsers should normalize non-positive context | ||
| // metadata to null, but keep runtime capabilities valid even if an older | ||
| // or custom resolver reports llama.cpp-style n_ctx=0 as a sentinel. | ||
| var detectedContextWindow = detected?.ContextWindowTokens is > 0 |
There was a problem hiding this comment.
null if the value is not greater than 0
…smatch When Models:Main:ContextWindow exceeds the provider-reported effective context window, ResolveModelCapabilities previously threw at startup, taking down the whole daemon. But provider-reported windows are unreliable (router placeholders, n_ctx=0 sentinels, llama.cpp started with a larger --ctx-size than it advertises) -- the same unreliability this PR already works around elsewhere. Honor the operator's configured value and log a warning instead. If the configured window really is too large, the provider rejects the oversized request at runtime and the session compacts-and-retries (LlmSessionActor), surfacing an actionable per-turn error rather than a daemon-wide boot failure.
Aaronontheweb
left a comment
There was a problem hiding this comment.
Looking at some more potential sticklers
| // provider rejects the oversized request at runtime and the session | ||
| // compacts-and-retries (see LlmSessionActor), surfacing an actionable | ||
| // per-turn error rather than a daemon-wide startup failure. | ||
| logger?.LogWarning( |
There was a problem hiding this comment.
this is a much better approach - log the inconsistency and then try to work with it anyway.
| // Context windows are positive-only. A 0 from provider metadata is an | ||
| // unknown sentinel, not a resolved value, and must not block a later | ||
| // resolver from supplying the real limit. | ||
| if (contextWindowTokens is null && result.ContextWindowTokens is > 0) |
There was a problem hiding this comment.
what happens if both values are zero here?
There was a problem hiding this comment.
Accumulator is null-or-positive by construction (only assigned under is > 0), so the real case is null accumulator + 0 result → guard fails, the 0 is ignored, and null propagates to the 32k default downstream. Working as intended.
| return ProbeHelpers.TryReadInt32(meta, "n_ctx") ?? | ||
| ProbeHelpers.TryReadInt32(meta, "n_ctx_train"); | ||
| return ProbeHelpers.TryReadPositiveInt32OrFallbackWhenMissing( | ||
| meta, "n_ctx", "n_ctx_train"); |
There was a problem hiding this comment.
try to read multiple possible properties where the context window sizes may be stored
Summary
Fixes #1280.
Context window mismatch: warn, don't crash
Previously, when
Models:Main:ContextWindowexceeded the provider-reported effective context window,ModelCapabilityResolutionthrewInvalidOperationExceptionand the daemon refused to start. That's the wrong failure mode:n_ctx=0sentinels, and llama.cpp started with a larger--ctx-sizethan it advertises. This is the same unreliability the rest of this PR works around, so gating startup on that number takes down every session over a value we can't trust.LlmSessionActor), surfacing an actionable per-turn error — not a daemon-wide boot failure.So we now honor the operator's configured value and log a
Warningat startup instead of throwing. TheNetclaw.Startuplogger is threaded into the resolver so the warning surfaces.Validation