fix: preserve Ollama model:tag colons in context length detection#2147

Closed
kshitijk4poor wants to merge 3 commits into
NousResearch:mainfrom
kshitijk4poor:fix/ollama-model-tag-colon-parsing

Conversation

@kshitijk4poor kshitijk4poor commented Mar 20, 2026

Collaborator

Summary

  • The colon-split logic in get_model_context_length() and _query_local_context_length() (agent/model_metadata.py) assumed any colon in a model name meant provider:model format (e.g. local:my-model). Ollama uses model:tag format (e.g. qwen3.5:27b), so the split turned qwen3.5:27b into just 27b — which matches nothing in the defaults or cache, causing a fallback to the 2M token probe tier.
  • Added a _strip_provider_prefix() helper that only strips recognised provider names (matching the set already used by parse_model_input in hermes_cli/models.py). Ollama-style model:tag names now pass through intact.
  • Both affected call sites (get_model_context_length line ~693 and _query_local_context_length line ~584) now use the safe helper.
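For readers who want to see the difference concretely, here is a minimal sketch of the behaviour described in the summary. It is not the code from agent/model_metadata.py: the function names `naive_strip` and `strip_provider_prefix` are illustrative, the prefix set contains only the three providers named in this PR's text (the real set is larger and mirrors hermes_cli/models.py), and the case-insensitive match and the `://` URL check are assumptions.

```python
# Hypothetical sketch only. The real _PROVIDER_PREFIXES lives in
# agent/model_metadata.py; these three names are the ones this PR mentions.
_PROVIDER_PREFIXES = frozenset({"local", "openrouter", "anthropic"})


def naive_strip(model: str) -> str:
    """Old behaviour as described: treat any colon as provider:model."""
    return model.split(":", 1)[1] if ":" in model else model


def strip_provider_prefix(model: str) -> str:
    """Strip 'provider:' only when the prefix is a recognised provider."""
    # Assumption: anything that looks like a URL is returned untouched.
    if "://" in model:
        return model
    # partition() splits on the first colon only, so any later colon
    # (for example an Ollama tag) stays in `rest`.
    prefix, sep, rest = model.partition(":")
    # Assumption: provider matching is case-insensitive.
    if sep and rest and prefix.strip().lower() in _PROVIDER_PREFIXES:
        return rest
    return model


if __name__ == "__main__":
    for name in ("qwen3.5:27b", "local:my-model", "local:qwen3.5:27b",
                 "http://localhost:11434/v1", "gpt-4o"):
        print(f"{name!r}: naive={naive_strip(name)!r} "
              f"safe={strip_provider_prefix(name)!r}")
```

With the naive split, `qwen3.5:27b` collapses to `27b`; with the prefix check it passes through unchanged, while `local:my-model` still becomes `my-model`.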

Test plan

  • Added TestStripProviderPrefix with 5 test cases covering:
    • Known provider prefixes are stripped (local:my-model → my-model)
    • Ollama model:tag format is preserved (qwen3.5:27b, llama3.3:70b, gemma2:9b, codellama:13b-instruct-q4_0)
    • HTTP URLs are preserved
    • No-colon model names pass through unchanged
    • Integration test: qwen3.5:27b reaches endpoint metadata lookup with full name intact
  • All 94 model metadata tests pass
  • Full test suite: 5546 passed (8 pre-existing failures unrelated to this change)

The colon-split logic in get_model_context_length() and
_query_local_context_length() assumed any colon meant provider:model
format (e.g. "local:my-model"). But Ollama uses model:tag format
(e.g. "qwen3.5:27b"), so the split turned "qwen3.5:27b" into just
"27b" — which matches nothing, causing a fallback to the 2M token
probe tier.

Now only recognised provider prefixes (local, openrouter, anthropic,
etc.) are stripped. Ollama model:tag names pass through intact.
- Replace double split(":", 1) with single partition() call
- Add comment noting _PROVIDER_PREFIXES mirrors hermes_cli/models.py
  and must be kept in sync when adding new providers
@teknium1

Contributor

Merged via PR #2149 — your commit was cherry-picked onto current main with authorship preserved. Thanks for the fix @kshitijk4poor!
