Skip to content

fix(model_metadata): respect context_length from custom_providers config with scoping#8005

Closed
fancydirty wants to merge 2 commits into
NousResearch:mainfrom
fancydirty:fix/model-metadata-custom-context-v2
Closed

fix(model_metadata): respect context_length from custom_providers config with scoping#8005
fancydirty wants to merge 2 commits into
NousResearch:mainfrom
fancydirty:fix/model-metadata-custom-context-v2

Conversation

@fancydirty

Copy link
Copy Markdown
Contributor

Problem

When a user declares a context_length for a model inside custom_providers[].models[] in config.yaml (e.g. glm-5.1 via zai-customize), get_model_context_length() ignored it and fell through to probing or the 128K default. This caused false "compression model context too small" warnings and disabled context compression.

Fix

Teach get_model_context_length() to check custom_providers[].models[].context_length immediately after explicit argument overrides, before any network I/O or hardcoded defaults. The lookup is scoped to the active custom provider (matched by provider name or base_url) so identically-named models on different endpoints do not leak across providers.

Changes

  • agent/model_metadata.py: add config-driven lookup with provider/base_url scoping.
  • tests/agent/test_model_metadata.py: add 5 unit tests for provider-name matching, base-url matching, unmatched-provider exclusion, single-provider fallback, and ambiguous-multi-provider skipping.

Verification

cd /Users/mac/.hermes/hermes-agent && source venv/bin/activate && python -m pytest tests/agent/test_model_metadata.py -q
# 85 passed

…fig with scoping

get_model_context_length() now checks custom_providers[].models[]
for a config-driven context_length override before falling back to
probing or defaults. The lookup is scoped to the active provider
(by name or base_url) to avoid collisions between identically-named
models on different endpoints.

Adds 5 unit tests covering provider-name matching, base-url matching,
unmatched-provider exclusion, single-provider fallback, and
ambiguous-multi-provider skipping.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes get_model_context_length() so it honors per-model context_length configured under custom_providers in config.yaml, with scoping to avoid collisions across different custom endpoints.

Changes:

  • Add a config-driven custom_providers[].models[model].context_length lookup early in get_model_context_length(), before any network I/O or hardcoded defaults.
  • Scope the lookup by base_url (preferred) or provider to prevent same-model-name leakage across providers/endpoints.
  • Add unit tests covering scoped matching, unscoped single-provider fallback, and ambiguous multi-provider behavior.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.

File Description
agent/model_metadata.py Adds early config-based resolution of model context length from custom_providers, with scoping and ambiguity handling.
tests/agent/test_model_metadata.py Adds tests validating the new custom_providers context length resolution behavior.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread agent/model_metadata.py Outdated
Comment on lines +951 to +963
_normalized_base_url = base_url.rstrip("/") if isinstance(base_url, str) and base_url else None
_active_provider = str(provider).strip().lower() if isinstance(provider, str) and provider.strip() else None
_valid_entries = [cp for cp in _custom_providers if isinstance(cp, dict)]
for cp in _valid_entries:
_cp_base = str(cp.get("base_url", "")).strip().rstrip("/")
_cp_name = str(cp.get("name", "")).strip().lower()

if _normalized_base_url is not None and _cp_base:
if _cp_base != _normalized_base_url:
continue
elif _active_provider is not None and _cp_name:
if _cp_name != _active_provider:
continue

Copilot AI Apr 12, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Provider-name scoping here compares provider (lowercased) directly to custom_providers[].name (lowercased). In practice, provider values elsewhere in the codebase can be menu keys like custom:<normalized-name> (see hermes_cli/runtime_provider.py’s _normalize_custom_provider_name) and/or normalize spaces to -, so this equality check will miss valid matches when base_url isn’t provided. Consider normalizing both sides consistently (e.g., strip a leading custom: from provider and apply the same space→dash normalization to cp['name']) before comparing.

Copilot uses AI. Check for mistakes.
Comment thread agent/model_metadata.py
Comment on lines +946 to +950
try:
from hermes_cli.config import load_config
_cfg = load_config()
_custom_providers = _cfg.get("custom_providers", []) if isinstance(_cfg, dict) else []
if isinstance(_custom_providers, list):

Copilot AI Apr 12, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This block calls hermes_cli.config.load_config() on every get_model_context_length() invocation, which does a deep-merge with defaults and reads from disk. Since this function can be called in hot paths (e.g., gateway request handling), consider using the lightweight read_raw_config() or caching the custom_providers lookup (e.g., via an internal module-level cache with a short TTL) to avoid repeated disk I/O and merges.

Copilot uses AI. Check for mistakes.
Comment thread agent/model_metadata.py Outdated
Comment on lines +951 to +956
_normalized_base_url = base_url.rstrip("/") if isinstance(base_url, str) and base_url else None
_active_provider = str(provider).strip().lower() if isinstance(provider, str) and provider.strip() else None
_valid_entries = [cp for cp in _custom_providers if isinstance(cp, dict)]
for cp in _valid_entries:
_cp_base = str(cp.get("base_url", "")).strip().rstrip("/")
_cp_name = str(cp.get("name", "")).strip().lower()

Copilot AI Apr 12, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Base URL normalization is inconsistent: _normalized_base_url = base_url.rstrip('/') does not strip whitespace, while _cp_base uses .strip().rstrip('/'). Using .strip().rstrip('/') for base_url too would avoid surprising mismatches if base_url has leading/trailing spaces.

Copilot uses AI. Check for mistakes.
Comment thread agent/model_metadata.py
Comment on lines +942 to +966
# 0a. Config-driven overrides from custom_providers[].models[].context_length.
# When provider/base_url is known, scope the lookup to the matching custom
# provider entry to avoid collisions between identically-named models on
# different endpoints. When unknown, allow matching any custom provider.
try:
from hermes_cli.config import load_config
_cfg = load_config()
_custom_providers = _cfg.get("custom_providers", []) if isinstance(_cfg, dict) else []
if isinstance(_custom_providers, list):
_normalized_base_url = base_url.rstrip("/") if isinstance(base_url, str) and base_url else None
_active_provider = str(provider).strip().lower() if isinstance(provider, str) and provider.strip() else None
_valid_entries = [cp for cp in _custom_providers if isinstance(cp, dict)]
for cp in _valid_entries:
_cp_base = str(cp.get("base_url", "")).strip().rstrip("/")
_cp_name = str(cp.get("name", "")).strip().lower()

if _normalized_base_url is not None and _cp_base:
if _cp_base != _normalized_base_url:
continue
elif _active_provider is not None and _cp_name:
if _cp_name != _active_provider:
continue
elif len(_valid_entries) > 1:
# Ambiguous: skip unscoped entries when multiple providers exist
continue

Copilot AI Apr 12, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment says “When unknown, allow matching any custom provider,” but the implementation intentionally skips unscoped entries when multiple providers exist (and only effectively allows unscoped matching when there is a single custom provider). Update the comment to reflect the actual behavior to avoid future confusion.

Copilot uses AI. Check for mistakes.
Comment thread tests/agent/test_model_metadata.py Outdated
Comment on lines +792 to +808
def test_ambiguous_multiple_providers_skipped_without_scope(self, mock_load_config):
"""When multiple providers exist and no scope is given, skip to avoid collisions."""
mock_load_config.return_value = {
"custom_providers": [
{
"name": "zai-customize",
"models": {"glm-5.1": {"context_length": 200000}},
},
{
"name": "other-provider",
"models": {"glm-5.1": {"context_length": 128000}},
},
]
}
# Must NOT pick either custom provider value because scope is ambiguous
result = get_model_context_length("glm-5.1")
assert result not in (200000, 128000)

Copilot AI Apr 12, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test_ambiguous_multiple_providers_skipped_without_scope currently allows get_model_context_length() to fall through into fetch_model_metadata(), which can perform a real network call to OpenRouter (making this unit test slower/flaky depending on environment). Patch agent.model_metadata.fetch_model_metadata to return {} (or patch requests.get) so the test is fully deterministic and offline.

Copilot uses AI. Check for mistakes.
Comment thread tests/agent/test_model_metadata.py Outdated
result = get_model_context_length("glm-5.1", provider="zai-customize")
assert result == 200000

@patch("hermes_cli.config.load_config")

Copilot AI Apr 12, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Coverage gap: provider-name scoping is only tested with provider='zai-customize', but other code paths use provider keys like custom:<normalized-name> (e.g., custom:local-(127.0.0.1:4141)). Add a unit test that passes a provider string in that form and verifies the correct custom_providers entry is matched (or intentionally not matched) so the scoping behavior is validated against real runtime values.

Suggested change
@patch("hermes_cli.config.load_config")
@patch("hermes_cli.config.load_config")
def test_match_by_runtime_custom_provider_key(self, mock_load_config):
mock_load_config.return_value = {
"custom_providers": [
{
"name": "local-(127.0.0.1:4141)",
"base_url": "http://127.0.0.1:4141/v1",
"models": {
"llama3.2": {"context_length": 8192},
},
}
]
}
result = get_model_context_length(
"llama3.2", provider="custom:local-(127.0.0.1:4141)"
)
assert result == 8192
@patch("hermes_cli.config.load_config")

Copilot uses AI. Check for mistakes.
- Use read_raw_config() instead of load_config() to avoid deep-merge
  overhead on every get_model_context_length() call.
- Normalize provider names consistently: strip custom: prefix and apply
  the same space→dash normalization used by runtime_provider.py.
- Consistent base_url stripping: apply .strip() before .rstrip('/') to
  avoid mismatches on leading/trailing whitespace.
- Update misleading comment to match actual behavior (unscoped matching
  is only allowed when there is a single custom provider).
- Patch fetch_model_metadata in ambiguous-provider test so it stays
  fully offline and deterministic.
- Add test for custom:<normalized-name> provider keys.
@fancydirty

Copy link
Copy Markdown
Contributor Author

@copilot-pull-request-reviewer all review feedback addressed in b666fb0:

  1. Provider-name normalization: Both sides now use from (strip prefix, spaces → dashes, lowercase). This matches how provider keys are built at runtime.
  2. Performance: Replaced with lightweight to avoid deep-merge + disk I/O on every call.
  3. Base URL consistency: now applies , same as .
  4. Comment accuracy: Updated the misleading comment to reflect that unscoped matching is only allowed when there is a single custom provider.
  5. Deterministic test: now patches to return , preventing any real network call.
  6. Coverage gap: Added for .

Ready for another pass.

@fancydirty

Copy link
Copy Markdown
Contributor Author

@copilot-pull-request-reviewer all review feedback addressed in b666fb0e:

  1. Provider-name normalization: Both sides now use _normalize_custom_provider_name() from runtime_provider.py (strip custom: prefix, spaces → dashes, lowercase). This matches how provider keys are built at runtime.
  2. Performance: Replaced load_config() with lightweight read_raw_config() to avoid deep-merge + disk I/O on every get_model_context_length() call.
  3. Base URL consistency: _normalized_base_url now applies .strip().rstrip('/'), same as _cp_base.
  4. Comment accuracy: Updated the misleading comment to reflect that unscoped matching is only allowed when there is a single custom provider.
  5. Deterministic test: test_ambiguous_multiple_providers_skipped_without_scope now patches fetch_model_metadata to return {}, preventing any real network call.
  6. Coverage gap: Added test_match_by_runtime_custom_provider_key for provider="custom:local-(127.0.0.1:4141)".

Ready for another pass.

@fancydirty

Copy link
Copy Markdown
Contributor Author

Hi @copilot-pull-request-reviewer, thanks for the review.

These comments look like they reference an earlier revision. The latest commit (b666fb0e) already addresses all six points:

  1. Provider-name scoping now uses _normalize_custom_provider_name() from runtime_provider.py.
  2. Switched from load_config() to read_raw_config() for performance.
  3. Base URL normalization is now consistently .strip().rstrip('/').
  4. The misleading comment about unscoped matching was corrected.
  5. fetch_model_metadata is patched in the ambiguous-provider test.
  6. Added test_match_by_runtime_custom_provider_key for the custom: prefix.

Please re-review the latest commit.

@teknium1

Copy link
Copy Markdown
Contributor

Thanks for the contribution! The problem you identified is real — custom_providers context_length does need to be respected — but the fix is already handled at the caller level.

Both run_agent.py (lines 1268-1292) and gateway/run.py (lines 3325-3351) already extract custom_providers[].models[].context_length and pass it as config_context_length to get_model_context_length(), which is step 0 in the resolution chain. The critical paths (agent init, gateway compression hygiene) work correctly for custom providers today.

The approach of adding read_raw_config() inside model_metadata.py introduces a layering concern (agent internals importing from CLI modules) and adds per-call disk I/O for something the callers already have in memory. There are a few secondary call sites (model switch display, @ references) that don't pass config_context_length, but those are informational — if they need fixing, the established pattern is to have the caller pass it through rather than reaching into the config system from inside the resolution function.

Appreciate you digging into the custom_providers context resolution!

@teknium1 teknium1 closed this Apr 13, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants