fix: honor custom-provider context overrides in session info #10690
Bot1822 wants to merge 1 commit into
Force-pushed from 9987334 to 4a581ac.
Rebased this branch onto the latest Local verification after the rebase:
This still covers the original reset/session-info bug, plus the broader config-override cases in

@teknium1 @austinpickett friendly ping on this one when you have a moment. This branch is now rebased onto the latest I also checked for repo-native AI review bots, but I'm not seeing an obvious configured reviewer/bot handle to ping here, so I'm just tagging human maintainers directly.
…nner

The session-reset / info banner in `_format_session_info` resolved the context window only from the top-level `model.context_length` key. When users configured context_length under the new `providers.<name>.models.<model>` dict schema (or the legacy `custom_providers[].models.<model>` list schema), the banner fell through to `get_model_context_length()`'s probe chain. Remote OpenAI-compatible proxies frequently omit `context_length` from `/models` responses, so the probe failed silently and the banner displayed `128K tokens (default — set model.context_length in config to override)`, even though the runtime `ContextCompressor` was already budgeting with the correct value resolved by `AIAgent.__init__`.

After the top-level lookup, walk `get_compatible_custom_providers(cfg)`, match the entry by `base_url` against the resolved runtime base_url, and read `entry["models"][model]["context_length"]`. This mirrors the exact resolution `AIAgent.__init__` performs at `run_agent.py:1608-1643` so the banner's displayed value matches what the compressor actually uses.

Fixes NousResearch#5089.

Relationship to existing open PRs (NousResearch/hermes-agent):
- NousResearch#5096, NousResearch#8240, NousResearch#10690: each hand-roll traversal of the legacy `custom_providers:` list only. They do not cover the newer `providers:` dict schema users land on via `hermes model`.
- NousResearch#12380: touches `/model` display + gateway fallback paths; overlaps in spirit but still traverses lists manually.

Routing through `get_compatible_custom_providers()` (the existing compat shim at `hermes_cli/config.py:2078`) gives a single lookup that covers both schemas and stays consistent with every other runtime caller, eliminating future drift between display and compressor.
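The lookup order described in this commit message can be sketched roughly as follows. This is an illustration only, not the code in the repository: `compatible_custom_providers` is a hypothetical stand-in for `get_compatible_custom_providers()`, `banner_context_length` is a made-up name, and the config shapes are assumed from the two schemas named above.

```python
from typing import Any, Optional


def compatible_custom_providers(cfg: dict[str, Any]) -> list[dict[str, Any]]:
    """Hypothetical stand-in for the compat shim: flatten both schemas into one list."""
    entries: list[dict[str, Any]] = []
    # Legacy schema (assumed shape): `custom_providers` is a list of provider dicts.
    for entry in cfg.get("custom_providers") or []:
        if isinstance(entry, dict):
            entries.append(entry)
    # Newer schema (assumed shape): `providers` is a dict keyed by provider name.
    providers = cfg.get("providers")
    if isinstance(providers, dict):
        for name, entry in providers.items():
            if isinstance(entry, dict):
                entries.append({"name": name, **entry})
    return entries


def banner_context_length(
    cfg: dict[str, Any], model: str, runtime_base_url: str
) -> Optional[int]:
    """Resolve the context length to display, or None if config does not say."""
    # Step 1: the top-level `model.context_length` key wins when present.
    model_section = cfg.get("model")
    if isinstance(model_section, dict):
        top_level = model_section.get("context_length")
        if isinstance(top_level, int) and top_level > 0:
            return top_level

    # Step 2: match a custom-provider entry by normalized base_url.
    target = (runtime_base_url or "").rstrip("/")
    if not target:
        return None
    for entry in compatible_custom_providers(cfg):
        entry_url = (entry.get("base_url") or "").rstrip("/")
        if entry_url != target:
            continue
        models = entry.get("models")
        model_cfg = models.get(model) if isinstance(models, dict) else None
        if isinstance(model_cfg, dict):
            value = model_cfg.get("context_length")
            if isinstance(value, int) and value > 0:
                return value
    return None
```

Returning `None` here leaves room for the caller to fall back to the probe chain, which is the behavior the message describes for configs that carry no override.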
Force-pushed from 4a581ac to 42e73e5.
Pull request overview
This PR centralizes and broadens resolution of custom-provider context_length overrides so that context windows shown in gateway session info and /api/model/info reflect implicit per-provider/per-model config overrides even when callers don’t pass model.context_length explicitly.
Changes:
- Extend `agent.model_metadata.get_model_context_length()` to consult `custom_providers`/`providers` config-derived overrides early in resolution.
- Expand `hermes_cli.config.get_custom_provider_context_length()` to match by provider identity (`name`/`provider_key`) and support provider-level fallback `context_length`.
- Update gateway session info and web `/api/model/info` to surface implicit config-derived overrides; add regression tests covering legacy `custom_providers` and v12 `providers` configs.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| `agent/model_metadata.py` | Adds config-aware custom-provider override lookup to the core context-length resolver. |
| `hermes_cli/config.py` | Enhances custom-provider context override lookup (provider-name matching + provider-level fallback). |
| `gateway/run.py` | Updates gateway /info session formatting to reuse implicit override resolution for displayed context length/source. |
| `hermes_cli/web_server.py` | Makes /api/model/info report implicit config override separately from auto-detected context length. |
| `tests/agent/test_model_metadata.py` | Adds regression tests ensuring overrides are honored even without explicit config_context_length. |
| `tests/gateway/test_session_info.py` | Adds tests asserting gateway session info shows correct config-derived context for both config schemas. |
| `tests/hermes_cli/test_custom_provider_context_length.py` | Adds unit tests for provider-key matching without base_url and provider-level fallback. |
| `tests/hermes_cli/test_web_server.py` | Adds endpoint test asserting implicit config override is reported and used as effective context. |
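
One way to picture the `/api/model/info` behavior summarized in the table is a small value object that keeps the two sources apart and derives the effective window from them. This is a sketch under assumptions: the class name `ContextInfo`, its field names, and the `source` labels are all hypothetical and are not taken from `hermes_cli/web_server.py`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ContextInfo:
    """Hypothetical container; field names are illustrative, not the real API."""

    auto_context_length: Optional[int]    # what probing/metadata detected, if anything
    config_context_length: Optional[int]  # implicit override read from config, if any

    @property
    def effective_context_length(self) -> Optional[int]:
        # The config-derived override takes precedence over the auto-detected value.
        if self.config_context_length is not None:
            return self.config_context_length
        return self.auto_context_length

    @property
    def source(self) -> str:
        # Label describing where the effective value came from.
        if self.config_context_length is not None:
            return "config"
        if self.auto_context_length is not None:
            return "auto"
        return "unknown"
```

Keeping both fields rather than overwriting one with the other lets a client show the detected value next to the override instead of hiding the difference.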

         target_url = (base_url or "").rstrip("/")
    -    if not target_url:
    +    target_provider = (provider or "").strip().lower()
    +    if not target_url and not target_provider:
             return None

         for entry in custom_providers:
             if not isinstance(entry, dict):
                 continue
             entry_url = (entry.get("base_url") or "").rstrip("/")
    -        if not entry_url or entry_url != target_url:
    +        entry_name = str(entry.get("name") or "").strip().lower()
    +        entry_provider_key = str(entry.get("provider_key") or "").strip().lower()
    +        url_matches = bool(target_url and entry_url and entry_url == target_url)
    +        provider_matches = bool(
    +            target_provider and target_provider in {entry_name, entry_provider_key}
    +        )
    +        if not (url_matches or provider_matches):
                 continue
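
For readers who want to try the matching rule from this hunk in isolation, here is a minimal self-contained sketch. The matching lines follow the hunk; everything after the match (checking the per-model entry first, then a provider-level `context_length`) is an assumption based on the PR description, and the names `lookup_context_length` and `_positive_int` are hypothetical.

```python
from typing import Any, Optional


def _positive_int(value: Any) -> Optional[int]:
    """Return value if it is a positive int (bools excluded), else None."""
    if isinstance(value, int) and not isinstance(value, bool) and value > 0:
        return value
    return None


def lookup_context_length(
    model: str,
    base_url: str,
    provider: str,
    custom_providers: list[Any],
) -> Optional[int]:
    """Hypothetical lookup: match an entry by URL or provider identity."""
    target_url = (base_url or "").rstrip("/")
    target_provider = (provider or "").strip().lower()
    if not target_url and not target_provider:
        return None

    for entry in custom_providers:
        if not isinstance(entry, dict):
            continue
        entry_url = (entry.get("base_url") or "").rstrip("/")
        entry_name = str(entry.get("name") or "").strip().lower()
        entry_provider_key = str(entry.get("provider_key") or "").strip().lower()
        url_matches = bool(target_url and entry_url and entry_url == target_url)
        provider_matches = bool(
            target_provider and target_provider in {entry_name, entry_provider_key}
        )
        if not (url_matches or provider_matches):
            continue

        # Assumed order: a per-model override beats the provider-level fallback.
        models = entry.get("models")
        model_cfg = models.get(model) if isinstance(models, dict) else None
        if isinstance(model_cfg, dict):
            per_model = _positive_int(model_cfg.get("context_length"))
            if per_model is not None:
                return per_model
        provider_level = _positive_int(entry.get("context_length"))
        if provider_level is not None:
            return provider_level
    return None
```

Because either match is sufficient, an entry with no `base_url` can still be found by its `name` or `provider_key`, which is the case the new unit tests are described as covering.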

    if config_context_length is None:
        try:
            from hermes_cli.config import get_custom_provider_context_length
            config_context_length = get_custom_provider_context_length(
                model=model,
                base_url=base_url or "",
                provider=provider or "",
                custom_providers=custom_provs,
                config=data,
            )

Summary
- `get_model_context_length()` now honors named custom-provider context overrides even when callers do not pass `model.context_length` explicitly
- `/reset` shows the real context window for legacy `custom_providers` and v12 `providers` configs
- `/api/model/info` semantics are preserved by continuing to report auto-detected context separately from implicit config-derived overrides

Why
The original `fix/session-reset-custom-provider-context` branch fixes the immediate gateway banner bug, but only inside `GatewayRunner._format_session_info()`.

That was too narrow and still left other call sites inconsistent with the main runtime path that already treats named custom-provider per-model `context_length` as authoritative.

This PR makes the fix more robust by centralizing the override lookup in `agent/model_metadata.py`, then using it from gateway and web UI code without breaking the distinction between:

Tests

    venv/bin/python -m pytest tests/agent/test_model_metadata.py tests/gateway/test_session_info.py tests/hermes_cli/test_web_server.py -q