fix: register Ollama Cloud as known provider for context length resolution #5490

Closed
LucidPaths wants to merge 1 commit into NousResearch:main from LucidPaths:fix/ollama-cloud-context-length

Conversation

@LucidPaths

Contributor

What does this PR do?

Registers Ollama Cloud (ollama.com/v1) as a recognized provider in the URL-to-provider and models.dev provider mappings. Without this, all Ollama Cloud models silently default to 128K context length instead of their actual capacity.

The existing get_model_context_length() resolution chain (step 5: provider-aware models.dev lookup) already handles this correctly for every other cloud provider — Ollama Cloud was simply missing from the two mapping dicts that feed it.

Related Issue

Type of Change

  • 🐛 Bug fix (non-breaking change that fixes an issue)

Changes Made

  • agent/model_metadata.py: Added "ollama.com": "ollama-cloud" to _URL_TO_PROVIDER dict — enables _infer_provider_from_url() to recognize Ollama Cloud base URLs.
  • agent/models_dev.py: Added "ollama-cloud": "ollama-cloud" to PROVIDER_TO_MODELS_DEV dict — enables lookup_models_dev_context() to resolve model context lengths from the models.dev cache.
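
The effect of the two entries can be sketched in isolation. This is a minimal illustration, not the project's code: the helper bodies, the cache shape, the sample context value, and the exact fallback number are assumptions, and only the two mapping entries come from this PR.

```python
from typing import Optional
from urllib.parse import urlparse

# Hypothetical stand-ins for the two mapping dicts named in this PR.
# Only the Ollama Cloud entries are taken from the PR; the real dicts hold more.
URL_TO_PROVIDER = {"ollama.com": "ollama-cloud"}
PROVIDER_TO_MODELS_DEV = {"ollama-cloud": "ollama-cloud"}

# Illustrative cache shaped as {models.dev provider id: {model: context tokens}}.
# The value is an example figure, not data fetched from models.dev.
MODELS_DEV_CACHE = {"ollama-cloud": {"qwen3.5:397b": 262_144}}

DEFAULT_CONTEXT = 128_000  # assumption: exact value of the 128K fallback


def infer_provider_from_url(base_url: str) -> Optional[str]:
    """Map a base URL to a provider id by matching its hostname."""
    host = urlparse(base_url).hostname or ""
    for domain, provider in URL_TO_PROVIDER.items():
        if host == domain or host.endswith("." + domain):
            return provider
    return None


def resolve_context_length(model: str, base_url: str) -> int:
    """Provider-aware lookup that falls back to the default when unmapped."""
    provider = infer_provider_from_url(base_url)
    models_dev_id = PROVIDER_TO_MODELS_DEV.get(provider) if provider else None
    if models_dev_id is not None:
        context = MODELS_DEV_CACHE.get(models_dev_id, {}).get(model)
        if context is not None:
            return context
    return DEFAULT_CONTEXT


if __name__ == "__main__":
    print(resolve_context_length("qwen3.5:397b", "https://ollama.com/v1"))
    print(resolve_context_length("qwen3.5:397b", "https://example.invalid/v1"))
```

If either mapping entry is missing, the lookup never reaches the cache and the default is returned, which matches the behavior described above.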

Impact

Affects all users on Ollama Cloud (Free, Pro, Max tiers) who use https://ollama.com/v1 as their base URL. Before this fix, every model on that endpoint reported 128K context regardless of actual capacity:

| Model | Actual context | Was reported |
| --- | --- | --- |
| qwen3.5:397b | 256K | 128K |
| glm-5 | 202K | 128K |
| kimi-k2.5 | 262K | 128K |
| deepseek-v3.2 | 65K | 128K (overcounted) |

This caused the status bar to display wrong values and, more critically, could mistime context compression: an underreported limit can trigger compression prematurely, while an overcounted one (as with deepseek-v3.2) can miss the trigger entirely.
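
To make the compression risk concrete, here is a small sketch. The 85% threshold and the `should_compress` helper are hypothetical and chosen only for illustration; the project's actual trigger logic is not shown in this PR. Context sizes use the rounded figures from the table, with K taken as 1,000 tokens.

```python
# Hypothetical trigger: compress once usage reaches 85% of the context window.
COMPRESSION_PERCENT = 85


def should_compress(used_tokens: int, context_length: int) -> bool:
    """Return True when used tokens reach the (assumed) compression threshold."""
    threshold = context_length * COMPRESSION_PERCENT // 100
    return used_tokens >= threshold


REPORTED = 128_000  # the fallback value every Ollama Cloud model received

# Underreported window (qwen3.5:397b, 256K actual): compression fires early.
premature = should_compress(120_000, REPORTED) and not should_compress(120_000, 256_000)

# Overcounted window (deepseek-v3.2, 65K actual): compression fires too late.
missed = should_compress(60_000, 65_000) and not should_compress(60_000, REPORTED)

if __name__ == "__main__":
    print("premature compression:", premature)
    print("missed compression:", missed)
```

Under these assumed numbers both failure modes occur: 120,000 tokens crosses the threshold for a 128K window but not for a 256K one, and 60,000 tokens crosses it for a 65K window but not for a 128K one.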

How to Test

  1. Configure Ollama Cloud: set OLLAMA_API_KEY and use base_url: https://ollama.com/v1
  2. Switch to any Ollama Cloud model (e.g. /model qwen3.5:397b --provider custom)
  3. Check status bar — should show ~256K context, not 128K
  4. Run existing tests: pytest tests/agent/test_model_metadata.py tests/agent/test_models_dev.py -v

Checklist

Code

  • I've read the Contributing Guide
  • My commit messages follow Conventional Commits (fix(scope):, feat(scope):, etc.)
  • I searched for existing PRs to make sure this isn't a duplicate
  • My PR contains only changes related to this fix/feature (no unrelated commits)
  • I've run pytest tests/ -q and all tests pass
  • I've added tests for my changes (required for bug fixes, strongly encouraged for features) — The fix adds dict entries consumed by existing tested code paths. Adding a test for a specific URL→provider mapping entry would test the data, not the logic.
  • I've tested on my platform: Ubuntu 24.04 (WSL2)

Documentation & Housekeeping

  • I've updated relevant documentation (README, docs/, docstrings) — N/A (no new config keys or user-facing API changes)
  • I've updated cli-config.yaml.example if I added/changed config keys — N/A
  • I've updated CONTRIBUTING.md or AGENTS.md if I changed architecture or workflows — N/A
  • I've considered cross-platform impact (Windows, macOS) per the compatibility guide — N/A (pure dict additions, no platform-specific code)
  • I've updated tool descriptions/schemas if I changed tool behavior — N/A

…olution

Ollama Cloud (ollama.com/v1) was not registered in the URL-to-provider
mapping or the models.dev provider mapping. This caused all Ollama Cloud
models to fall through to the 128K default context length, regardless of
actual model capability (e.g. qwen3.5:397b has 256K, glm-5 has 202K).

Two one-line additions:
- _URL_TO_PROVIDER: map ollama.com -> ollama-cloud
- PROVIDER_TO_MODELS_DEV: map ollama-cloud -> ollama-cloud

This enables the existing models.dev cache lookup (step 5 in
get_model_context_length) to resolve correct context lengths for all
Ollama Cloud models.
@teknium1

Contributor

Subsumed by PR #10782 which includes the same ollama.com URL-to-provider and models.dev mappings as part of the full Ollama Cloud provider integration (#6038). Your analysis of the context length resolution gap was spot on — thanks for the clear PR description.

@teknium1 teknium1 closed this Apr 16, 2026