fix(providers): ignore zero context windows #1285

Merged
Aaronontheweb merged 2 commits into netclaw-dev:dev from Aaronontheweb:fix/llamacpp-router-context-window
Jun 1, 2026

@Aaronontheweb commented Jun 1, 2026

Summary

  • Treat provider-reported non-positive context windows as unknown instead of valid runtime limits
  • Ignore llama.cpp router-level /props placeholders so they do not override per-model metadata or block downstream modality detection
  • Harden composite and final model capability resolution against zero context values
  • Soften the configured-vs-detected context window check from a hard startup failure to a warning (see below)
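A minimal sketch of the router-placeholder bullet above, assuming the /props body is parsed with System.Text.Json. Only the `role == "router"` check comes from this PR's discussion; the helper body, the flat `n_ctx` field, and the sample payloads are invented for illustration and may not match what a real llama.cpp server returns.

```csharp
using System;
using System.Text.Json;

// Hypothetical helper; only the role == "router" check comes from the PR text.
static bool IsRouterProps(JsonElement props) =>
    props.ValueKind == JsonValueKind.Object
    && props.TryGetProperty("role", out var role)
    && role.ValueKind == JsonValueKind.String
    && role.GetString() == "router";

// Invented sample bodies, not captured from a real llama.cpp server.
using var routerDoc = JsonDocument.Parse("{\"role\":\"router\",\"n_ctx\":0}");
using var modelDoc = JsonDocument.Parse("{\"n_ctx\":8192}");

foreach (var doc in new[] { routerDoc, modelDoc })
{
    var props = doc.RootElement;
    // A router-level body contributes nothing, so later resolvers can fill the gap.
    int? contextWindow = IsRouterProps(props)
        ? (int?)null
        : props.GetProperty("n_ctx").GetInt32();
    Console.WriteLine(contextWindow?.ToString() ?? "unknown");
}
```

The point of returning "unknown" rather than 0 is that downstream resolvers can tell the difference between "no answer yet" and "a real limit".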

Fixes #1280.

Context window mismatch: warn, don't crash

Previously, when Models:Main:ContextWindow exceeded the provider-reported effective context window, ModelCapabilityResolution threw InvalidOperationException and the daemon refused to start. That's the wrong failure mode:

  • Provider-reported windows are unreliable — router placeholders, n_ctx=0 sentinels, and llama.cpp started with a larger --ctx-size than it advertises. This is the same unreliability the rest of this PR works around, so gating startup on that number takes down every session over a value we can't trust.
  • Overrun is already a handled runtime condition. If the configured window really is too large, the provider rejects the oversized request and the session compacts-and-retries (LlmSessionActor), surfacing an actionable per-turn error — not a daemon-wide boot failure.

So we now honor the operator's configured value and log a Warning at startup instead of throwing. The Netclaw.Startup logger is threaded into the resolver so the warning surfaces.
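As a rough illustration of the warn-instead-of-throw behavior described above: the function name, its signature, the `Action<string>` standing in for the logger, the warning text, and the 32,768 fallback are all assumptions made for this sketch, not the actual ModelCapabilityResolution code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical resolver step; names, signature, and fallback value are illustrative.
static int ResolveContextWindow(
    int? configured, int? detected, Action<string> warn, int fallback = 32_768)
{
    // Non-positive provider values are treated as unknown, not as limits.
    int? usableDetected = (detected is > 0) ? detected : null;

    if (configured is > 0)
    {
        // Mismatch is logged, not thrown: the operator's value wins.
        if (usableDetected is int limit && configured.Value > limit)
            warn($"Configured context window {configured.Value} exceeds provider-reported {limit}; using the configured value.");
        return configured.Value;
    }

    return usableDetected ?? fallback;
}

var warnings = new List<string>();
int window = ResolveContextWindow(configured: 131_072, detected: 8_192, warn: warnings.Add);
Console.WriteLine($"{window} ({warnings.Count} warning(s))");
```

With a detected value of 0, the same call would return the configured value without any warning, since there is nothing trustworthy to compare against.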

Validation

  • dotnet test src/Netclaw.Daemon.Tests
  • dotnet test src/Netclaw.Cli.Tests
  • dotnet slopwatch analyze
  • pwsh ./scripts/Add-FileHeaders.ps1 -Verify

@Aaronontheweb added labels Jun 1, 2026: bug (Something isn't working); providers (Provider integrations and capability detection across OpenAI-compatible backends); config (Configuration issues, netclaw doctor, schema validation).
@Aaronontheweb

Review — fix(providers): ignore zero context windows

Verdict: LGTM. Clean, well-scoped fix that resolves #1280 (llama.cpp router-mode n_ctx=0 treated as a valid context window) with defense in depth across the resolution chain. (Can't formally approve my own PR, so leaving the review as a comment.)

What it does well

  • Right layer + safety net. Normalizes non-positive context metadata to null at every parse site (ProbeHelpers.TryReadPositiveInt32, vLLM/llama.cpp/Ollama/OpenRouter/OpenAI-compatible descriptors) and hardens the final ModelCapabilityResolution so the configured > detected validation can no longer throw on a 0 sentinel. The is > 0 guard is the correct fix for the is int bug called out in the issue.
  • CompositeCapabilityResolver fix is the subtle one. Changing contextWindowTokens ??= result.ContextWindowTokens to skip non-positive values means an early resolver reporting 0 no longer poisons the field and blocks a later resolver from supplying the real limit. Good catch — the comment captures the "0 is a sentinel, not a value" invariant clearly.
  • Router /props handling is correct. IsRouterProps (role == "router") prevents the router-level placeholder body from overriding per-model metadata, and modalities now default to null rather than forcing Text — which lets downstream HuggingFace/OpenRouter resolvers fill in image support instead of being short-circuited. The Parse_UsesMetaNCtx_WhenPropsAbsent assertion change (Text → null) is the intentional, correct consequence.
  • TryReadPositiveInt32OrFallbackWhenMissing semantics are deliberate and documented: an explicit n_ctx=0 is "unknown, do not promote n_ctx_train," while a missing n_ctx falls back to train context. This avoids overstating runtime capacity, and the matrix is pinned by tests (Parse_ZeroMetaNCtx_… vs Parse_UsesMetaTrainContext_WhenMetaNCtxAbsent).
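The present-but-zero versus missing distinction in the last bullet can be sketched as follows. The helper names are borrowed from the review text, but the bodies and signatures here are assumptions written against System.Text.Json, not the real ProbeHelpers implementation.

```csharp
using System;
using System.Text.Json;

// Returns the value only when it is a strictly positive 32-bit integer.
static int? TryReadPositiveInt32(JsonElement obj, string name) =>
    obj.ValueKind == JsonValueKind.Object
    && obj.TryGetProperty(name, out var value)
    && value.ValueKind == JsonValueKind.Number
    && value.TryGetInt32(out var n)
    && n > 0
        ? n
        : (int?)null;

// If the primary key is present, its value decides (0 means unknown);
// the fallback key is consulted only when the primary key is absent.
static int? TryReadPositiveInt32OrFallbackWhenMissing(
    JsonElement obj, string primary, string fallback)
{
    if (obj.ValueKind != JsonValueKind.Object) return null;
    return obj.TryGetProperty(primary, out _)
        ? TryReadPositiveInt32(obj, primary)
        : TryReadPositiveInt32(obj, fallback);
}

static int? Read(string json)
{
    using var doc = JsonDocument.Parse(json);
    return TryReadPositiveInt32OrFallbackWhenMissing(doc.RootElement, "n_ctx", "n_ctx_train");
}

// Explicit zero: unknown, the training context is not promoted.
Console.WriteLine(Read("{\"n_ctx\":0,\"n_ctx_train\":131072}")?.ToString() ?? "unknown");
// Missing n_ctx: fall back to the training context.
Console.WriteLine(Read("{\"n_ctx_train\":131072}")?.ToString() ?? "unknown");
```

A plain `??` chain cannot express this, because it would treat "present but unusable" and "absent" identically once both are normalized to null.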

Notes (non-blocking)

  • Test coverage is thorough — zero/missing/all-zero permutations for llama.cpp, vLLM, Ollama, OpenRouter, the composite chain, and the final resolution. No gaps worth flagging.
  • No config-schema changes needed (no new *Config properties), and no system-skill mapping is touched. Consistent with the "no silent fallbacks" rule: zero context is surfaced as unknown and falls through to detection/default, not silently coerced to a bogus runtime limit.

Verification

  • dotnet test src/Netclaw.Daemon.Tests --filter "Provider|ModelCapabilityResolution": 183 passed, 0 failed locally.

Labeled: bug, providers, config.

@Aaronontheweb

Claude approves, lol, but I'll need to look at it myself when I get a moment

@Aaronontheweb left a comment

Finding some issues...

// Final safety net: provider parsers should normalize non-positive context
// metadata to null, but keep runtime capabilities valid even if an older
// or custom resolver reports llama.cpp-style n_ctx=0 as a sentinel.
var detectedContextWindow = detected?.ContextWindowTokens is > 0

null if the value is not greater than 0
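For readers less familiar with C# patterns, here is a small standalone illustration of why `is > 0` differs from `is int` on a nullable value. The variable names are made up, and this is not the PR's exact line, which is truncated in the snippet above.

```csharp
using System;

int? reported = 0; // a llama.cpp-style "unknown" sentinel

// Type pattern: matches any non-null int, so the 0 sentinel slips through.
bool acceptedByTypePattern = reported is int;

// Relational pattern: matches only strictly positive values (and never null).
bool acceptedByPositivePattern = reported is > 0;

int? detectedContextWindow = (reported is > 0) ? reported : null;

Console.WriteLine($"{acceptedByTypePattern} {acceptedByPositivePattern} {detectedContextWindow is null}");
// prints "True False True"
```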

Comment thread src/Netclaw.Daemon/Configuration/ModelCapabilityResolution.cs Outdated
…smatch

When Models:Main:ContextWindow exceeds the provider-reported effective
context window, ResolveModelCapabilities previously threw at startup,
taking down the whole daemon. But provider-reported windows are
unreliable (router placeholders, n_ctx=0 sentinels, llama.cpp started
with a larger --ctx-size than it advertises) -- the same unreliability
this PR already works around elsewhere.

Honor the operator's configured value and log a warning instead. If the
configured window really is too large, the provider rejects the oversized
request at runtime and the session compacts-and-retries (LlmSessionActor),
surfacing an actionable per-turn error rather than a daemon-wide boot
failure.
@Aaronontheweb Aaronontheweb marked this pull request as ready for review June 1, 2026 21:45

@Aaronontheweb left a comment

Looking at some more potential sticklers

// provider rejects the oversized request at runtime and the session
// compacts-and-retries (see LlmSessionActor), surfacing an actionable
// per-turn error rather than a daemon-wide startup failure.
logger?.LogWarning(

this is a much better approach - log the inconsistency and then try to work with it anyway.

// Context windows are positive-only. A 0 from provider metadata is an
// unknown sentinel, not a resolved value, and must not block a later
// resolver from supplying the real limit.
if (contextWindowTokens is null && result.ContextWindowTokens is > 0)

what happens if both values are zero here?

Accumulator is null-or-positive by construction (only assigned under is > 0), so the real case is null accumulator + 0 result → guard fails, the 0 is ignored, and null propagates to the 32k default downstream. Working as intended.
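The invariant in that reply can be checked with a small sketch. It assumes each resolver result is reduced to a bare `int?`, and the 32,768 default is a guess at what "32k" means downstream. `FirstPositive` is a hypothetical stand-in for the accumulation loop, not the real CompositeCapabilityResolver.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the composite accumulation loop.
static int? FirstPositive(IEnumerable<int?> resolverResults)
{
    int? contextWindowTokens = null;
    foreach (var result in resolverResults)
    {
        // Only assigned under "is > 0", so the accumulator is null-or-positive.
        if (contextWindowTokens is null && result is > 0)
            contextWindowTokens = result;
    }
    return contextWindowTokens;
}

const int DefaultContextWindow = 32_768; // assumption: stands in for the downstream "32k" default

int? zeroThenReal = FirstPositive(new int?[] { 0, 131_072 });
int? allZero = FirstPositive(new int?[] { 0, 0 });

Console.WriteLine(zeroThenReal ?? DefaultContextWindow); // the later resolver's value is used
Console.WriteLine(allZero ?? DefaultContextWindow);      // nothing usable, so the default applies
```

With the earlier `??=` form, the first 0 would have been stored and the later 131,072 ignored, which is the poisoning case the guard removes.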

- return ProbeHelpers.TryReadInt32(meta, "n_ctx") ??
-     ProbeHelpers.TryReadInt32(meta, "n_ctx_train");
+ return ProbeHelpers.TryReadPositiveInt32OrFallbackWhenMissing(
+     meta, "n_ctx", "n_ctx_train");

try to read multiple possible properties where the context window sizes may be stored

@Aaronontheweb Aaronontheweb merged commit 2f0d113 into netclaw-dev:dev Jun 1, 2026
14 checks passed
@Aaronontheweb Aaronontheweb deleted the fix/llamacpp-router-context-window branch June 1, 2026 22:11

Development

Successfully merging this pull request may close these issues.

Bug: llama.cpp router mode reports n_ctx=0, treated as valid context window
