fix(copilot): wire live /models max_prompt_tokens into context-window resolver #12840

Closed
difujia wants to merge 1 commit into NousResearch:main from difujia:fix/copilot-live-context-windows
difujia (Contributor) commented Apr 20, 2026

Addresses part 1 of #7731, as scoped by @konsisumer in #7731 (comment).

Problem

The Copilot provider resolves context windows via models.dev static data, which does not include account-specific models (e.g. claude-opus-4.6-1m with 1M context). These models fall through to the 128K default.

Solution

Wire capabilities.limits.max_prompt_tokens from the live Copilot /models API into the context-window resolver as a higher-priority source than models.dev.
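As a rough illustration of the extraction described above, the following sketch pulls capabilities.limits.max_prompt_tokens out of a single catalog entry. The function name extract_max_prompt_tokens and the exact shape of the entry (a plain dict with an "id" key) are assumptions made for this example; only the capabilities.limits.max_prompt_tokens path comes from the PR description.

```python
from typing import Any, Dict, Optional


def extract_max_prompt_tokens(entry: Dict[str, Any]) -> Optional[int]:
    """Return capabilities.limits.max_prompt_tokens, or None if unusable.

    Hypothetical helper: the real code in hermes_cli/models.py may differ.
    """
    capabilities = entry.get("capabilities")
    if not isinstance(capabilities, dict):
        return None
    limits = capabilities.get("limits")
    if not isinstance(limits, dict):
        return None
    value = limits.get("max_prompt_tokens")
    # bool is a subclass of int, so reject it explicitly; zero and negative
    # values are treated as "no data" so the caller can fall through.
    if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
        return None
    return value


# Assumed entry shape, for illustration only.
entry = {
    "id": "claude-opus-4.6-1m",
    "capabilities": {"limits": {"max_prompt_tokens": 1_000_000}},
}
print(extract_max_prompt_tokens(entry))
```

Returning None for missing or non-positive values lets the caller treat "no usable live data" uniformly and continue to the next source.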

Changes:

  • hermes_cli/models.py: New get_copilot_model_context(model_id) helper that extracts max_prompt_tokens from the cached catalog. Results cached in-process for 1 hour.
  • agent/model_metadata.py: New step 5a in get_model_context_length() queries the live API for copilot/copilot-acp/github-copilot providers before falling through to models.dev (step 5b).
  • tests/hermes_cli/test_copilot_context.py: 11 tests covering cache behavior, edge cases (missing limits, zero values, empty catalog), and integration with get_model_context_length().
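The one-hour in-process cache mentioned for get_copilot_model_context() could look roughly like the sketch below. It is not the code from this PR: the ContextCache class, its fetch callback, and the choice to cache misses (None) as well as hits are all assumptions made for illustration. The clock is injectable so that expiry can be tested without sleeping.

```python
import time
from typing import Callable, Dict, Optional, Tuple

CACHE_TTL_SECONDS = 3600  # 1 hour, matching the TTL stated in the PR


class ContextCache:
    """Hypothetical per-model TTL cache for context-window lookups."""

    def __init__(
        self,
        fetch: Callable[[str], Optional[int]],
        ttl: float = CACHE_TTL_SECONDS,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl
        self._clock = clock
        # model_id -> (time stored, cached value; None means "no data")
        self._entries: Dict[str, Tuple[float, Optional[int]]] = {}

    def get(self, model_id: str) -> Optional[int]:
        now = self._clock()
        hit = self._entries.get(model_id)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]  # still fresh, skip the fetch
        value = self._fetch(model_id)
        self._entries[model_id] = (now, value)
        return value
```

A test along the lines of the "cache behavior" cases listed above would pass a counting fetch function and a fake clock, then check that a second call within the TTL does not trigger another fetch.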

Design notes

  • The existing step 2 comment explains why Copilot /models was previously skipped: it returns provider-imposed limits (e.g. 128K) rather than native model context (e.g. 400K). This PR intentionally uses the provider-enforced limit as the effective context window for Copilot, since that is what users can actually use. For models not in models.dev (like claude-opus-4.6-1m), this is the only data source.
  • Falls back gracefully: if the API call fails or returns no data, resolution continues to models.dev and other fallbacks as before.
  • No changes to token exchange, OAuth client ID, or enterprise endpoint handling (those will be separate PRs, per the #7731 discussion).
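The priority order and graceful fallback described in these notes can be sketched as follows. This is a simplified stand-in for get_model_context_length(), not its actual implementation: resolve_context_length, live_lookup, and static_lookup are hypothetical names, and 128_000 is only an assumed value for the "128K default" mentioned in the problem statement.

```python
from typing import Callable, Optional

COPILOT_PROVIDERS = {"copilot", "copilot-acp", "github-copilot"}
DEFAULT_CONTEXT = 128_000  # assumed stand-in for the 128K default


def resolve_context_length(
    provider: str,
    model_id: str,
    live_lookup: Callable[[str], Optional[int]],
    static_lookup: Callable[[str], Optional[int]],
    default: int = DEFAULT_CONTEXT,
) -> int:
    # Step 5a (sketch): for Copilot providers, prefer the live /models limit.
    if provider in COPILOT_PROVIDERS:
        try:
            live = live_lookup(model_id)
        except Exception:
            live = None  # an API failure must not break resolution
        if live:
            return live
    # Step 5b (sketch): fall through to static data such as models.dev.
    static = static_lookup(model_id)
    if static:
        return static
    return default
```

With this ordering, an account-specific model that is absent from the static data still resolves to its live limit, while a failed or empty live lookup leaves the previous behavior unchanged.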

… resolver

The Copilot provider resolved context windows via models.dev static data,
which does not include account-specific models (e.g. claude-opus-4.6-1m
with 1M context). This adds the live Copilot /models API as a higher-
priority source for copilot/copilot-acp/github-copilot providers.

New helper get_copilot_model_context() in hermes_cli/models.py extracts
capabilities.limits.max_prompt_tokens from the cached catalog. Results
are cached in-process for 1 hour.

In agent/model_metadata.py, step 5a queries the live API before falling
through to models.dev (step 5b). This ensures account-specific models
get correct context windows while standard models still have a fallback.

Part 1 of NousResearch#7731.
Refs: NousResearch#7272

difujia (Contributor, Author) commented Apr 24, 2026

Cherry-picked into main via 7632919. Thanks!

difujia closed this Apr 24, 2026
Labels

  • comp/agent: Core agent loop, run_agent.py, prompt builder
  • comp/cli: CLI entry point, hermes_cli/, setup wizard
  • P2: Medium, degraded but workaround exists
  • provider/copilot: GitHub Copilot (ACP + Chat)
  • type/bug: Something isn't working
