fix(model_metadata): respect context_length from custom_providers config with scoping#8005
fix(model_metadata): respect context_length from custom_providers config with scoping#8005fancydirty wants to merge 2 commits into
Conversation
…fig with scoping get_model_context_length() now checks custom_providers[].models[] for a config-driven context_length override before falling back to probing or defaults. The lookup is scoped to the active provider (by name or base_url) to avoid collisions between identically-named models on different endpoints. Adds 5 unit tests covering provider-name matching, base-url matching, unmatched-provider exclusion, single-provider fallback, and ambiguous-multi-provider skipping.
There was a problem hiding this comment.
Pull request overview
Fixes get_model_context_length() so it honors per-model context_length configured under custom_providers in config.yaml, with scoping to avoid collisions across different custom endpoints.
Changes:
- Add a config-driven
custom_providers[].models[model].context_lengthlookup early inget_model_context_length(), before any network I/O or hardcoded defaults. - Scope the lookup by
base_url(preferred) orproviderto prevent same-model-name leakage across providers/endpoints. - Add unit tests covering scoped matching, unscoped single-provider fallback, and ambiguous multi-provider behavior.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
agent/model_metadata.py |
Adds early config-based resolution of model context length from custom_providers, with scoping and ambiguity handling. |
tests/agent/test_model_metadata.py |
Adds tests validating the new custom_providers context length resolution behavior. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| _normalized_base_url = base_url.rstrip("/") if isinstance(base_url, str) and base_url else None | ||
| _active_provider = str(provider).strip().lower() if isinstance(provider, str) and provider.strip() else None | ||
| _valid_entries = [cp for cp in _custom_providers if isinstance(cp, dict)] | ||
| for cp in _valid_entries: | ||
| _cp_base = str(cp.get("base_url", "")).strip().rstrip("/") | ||
| _cp_name = str(cp.get("name", "")).strip().lower() | ||
|
|
||
| if _normalized_base_url is not None and _cp_base: | ||
| if _cp_base != _normalized_base_url: | ||
| continue | ||
| elif _active_provider is not None and _cp_name: | ||
| if _cp_name != _active_provider: | ||
| continue |
There was a problem hiding this comment.
Provider-name scoping here compares provider (lowercased) directly to custom_providers[].name (lowercased). In practice, provider values elsewhere in the codebase can be menu keys like custom:<normalized-name> (see hermes_cli/runtime_provider.py’s _normalize_custom_provider_name) and/or normalize spaces to -, so this equality check will miss valid matches when base_url isn’t provided. Consider normalizing both sides consistently (e.g., strip a leading custom: from provider and apply the same space→dash normalization to cp['name']) before comparing.
| try: | ||
| from hermes_cli.config import load_config | ||
| _cfg = load_config() | ||
| _custom_providers = _cfg.get("custom_providers", []) if isinstance(_cfg, dict) else [] | ||
| if isinstance(_custom_providers, list): |
There was a problem hiding this comment.
This block calls hermes_cli.config.load_config() on every get_model_context_length() invocation, which does a deep-merge with defaults and reads from disk. Since this function can be called in hot paths (e.g., gateway request handling), consider using the lightweight read_raw_config() or caching the custom_providers lookup (e.g., via an internal module-level cache with a short TTL) to avoid repeated disk I/O and merges.
| _normalized_base_url = base_url.rstrip("/") if isinstance(base_url, str) and base_url else None | ||
| _active_provider = str(provider).strip().lower() if isinstance(provider, str) and provider.strip() else None | ||
| _valid_entries = [cp for cp in _custom_providers if isinstance(cp, dict)] | ||
| for cp in _valid_entries: | ||
| _cp_base = str(cp.get("base_url", "")).strip().rstrip("/") | ||
| _cp_name = str(cp.get("name", "")).strip().lower() |
There was a problem hiding this comment.
Base URL normalization is inconsistent: _normalized_base_url = base_url.rstrip('/') does not strip whitespace, while _cp_base uses .strip().rstrip('/'). Using .strip().rstrip('/') for base_url too would avoid surprising mismatches if base_url has leading/trailing spaces.
| # 0a. Config-driven overrides from custom_providers[].models[].context_length. | ||
| # When provider/base_url is known, scope the lookup to the matching custom | ||
| # provider entry to avoid collisions between identically-named models on | ||
| # different endpoints. When unknown, allow matching any custom provider. | ||
| try: | ||
| from hermes_cli.config import load_config | ||
| _cfg = load_config() | ||
| _custom_providers = _cfg.get("custom_providers", []) if isinstance(_cfg, dict) else [] | ||
| if isinstance(_custom_providers, list): | ||
| _normalized_base_url = base_url.rstrip("/") if isinstance(base_url, str) and base_url else None | ||
| _active_provider = str(provider).strip().lower() if isinstance(provider, str) and provider.strip() else None | ||
| _valid_entries = [cp for cp in _custom_providers if isinstance(cp, dict)] | ||
| for cp in _valid_entries: | ||
| _cp_base = str(cp.get("base_url", "")).strip().rstrip("/") | ||
| _cp_name = str(cp.get("name", "")).strip().lower() | ||
|
|
||
| if _normalized_base_url is not None and _cp_base: | ||
| if _cp_base != _normalized_base_url: | ||
| continue | ||
| elif _active_provider is not None and _cp_name: | ||
| if _cp_name != _active_provider: | ||
| continue | ||
| elif len(_valid_entries) > 1: | ||
| # Ambiguous: skip unscoped entries when multiple providers exist | ||
| continue |
There was a problem hiding this comment.
The comment says “When unknown, allow matching any custom provider,” but the implementation intentionally skips unscoped entries when multiple providers exist (and only effectively allows unscoped matching when there is a single custom provider). Update the comment to reflect the actual behavior to avoid future confusion.
| def test_ambiguous_multiple_providers_skipped_without_scope(self, mock_load_config): | ||
| """When multiple providers exist and no scope is given, skip to avoid collisions.""" | ||
| mock_load_config.return_value = { | ||
| "custom_providers": [ | ||
| { | ||
| "name": "zai-customize", | ||
| "models": {"glm-5.1": {"context_length": 200000}}, | ||
| }, | ||
| { | ||
| "name": "other-provider", | ||
| "models": {"glm-5.1": {"context_length": 128000}}, | ||
| }, | ||
| ] | ||
| } | ||
| # Must NOT pick either custom provider value because scope is ambiguous | ||
| result = get_model_context_length("glm-5.1") | ||
| assert result not in (200000, 128000) |
There was a problem hiding this comment.
test_ambiguous_multiple_providers_skipped_without_scope currently allows get_model_context_length() to fall through into fetch_model_metadata(), which can perform a real network call to OpenRouter (making this unit test slower/flaky depending on environment). Patch agent.model_metadata.fetch_model_metadata to return {} (or patch requests.get) so the test is fully deterministic and offline.
| result = get_model_context_length("glm-5.1", provider="zai-customize") | ||
| assert result == 200000 | ||
|
|
||
| @patch("hermes_cli.config.load_config") |
There was a problem hiding this comment.
Coverage gap: provider-name scoping is only tested with provider='zai-customize', but other code paths use provider keys like custom:<normalized-name> (e.g., custom:local-(127.0.0.1:4141)). Add a unit test that passes a provider string in that form and verifies the correct custom_providers entry is matched (or intentionally not matched) so the scoping behavior is validated against real runtime values.
| @patch("hermes_cli.config.load_config") | |
| @patch("hermes_cli.config.load_config") | |
| def test_match_by_runtime_custom_provider_key(self, mock_load_config): | |
| mock_load_config.return_value = { | |
| "custom_providers": [ | |
| { | |
| "name": "local-(127.0.0.1:4141)", | |
| "base_url": "http://127.0.0.1:4141/v1", | |
| "models": { | |
| "llama3.2": {"context_length": 8192}, | |
| }, | |
| } | |
| ] | |
| } | |
| result = get_model_context_length( | |
| "llama3.2", provider="custom:local-(127.0.0.1:4141)" | |
| ) | |
| assert result == 8192 | |
| @patch("hermes_cli.config.load_config") |
- Use read_raw_config() instead of load_config() to avoid deep-merge
overhead on every get_model_context_length() call.
- Normalize provider names consistently: strip custom: prefix and apply
the same space→dash normalization used by runtime_provider.py.
- Consistent base_url stripping: apply .strip() before .rstrip('/') to
avoid mismatches on leading/trailing whitespace.
- Update misleading comment to match actual behavior (unscoped matching
is only allowed when there is a single custom provider).
- Patch fetch_model_metadata in ambiguous-provider test so it stays
fully offline and deterministic.
- Add test for custom:<normalized-name> provider keys.
|
@copilot-pull-request-reviewer all review feedback addressed in b666fb0:
Ready for another pass. |
|
@copilot-pull-request-reviewer all review feedback addressed in
Ready for another pass. |
|
Hi @copilot-pull-request-reviewer, thanks for the review. These comments look like they reference an earlier revision. The latest commit (
Please re-review the latest commit. |
|
Thanks for the contribution! The problem you identified is real — custom_providers context_length does need to be respected — but the fix is already handled at the caller level. Both The approach of adding Appreciate you digging into the custom_providers context resolution! |
Problem
When a user declares a
context_lengthfor a model insidecustom_providers[].models[]inconfig.yaml(e.g.glm-5.1viazai-customize),get_model_context_length()ignored it and fell through to probing or the 128K default. This caused false "compression model context too small" warnings and disabled context compression.Fix
Teach
get_model_context_length()to checkcustom_providers[].models[].context_lengthimmediately after explicit argument overrides, before any network I/O or hardcoded defaults. The lookup is scoped to the active custom provider (matched byprovidername orbase_url) so identically-named models on different endpoints do not leak across providers.Changes
agent/model_metadata.py: add config-driven lookup with provider/base_url scoping.tests/agent/test_model_metadata.py: add 5 unit tests for provider-name matching, base-url matching, unmatched-provider exclusion, single-provider fallback, and ambiguous-multi-provider skipping.Verification