feat: read context_length from providers.<name>.context_length (Step 0c) #20847

Open
qwjzl wants to merge 1 commit into NousResearch:main from qwjzl:feat/provider-context-length

qwjzl commented May 6, 2026

Summary

Add Step 0c to get_model_context_length() that reads per-provider context_length overrides from providers.<name>.context_length in config.yaml.

Problem

Custom endpoints (e.g. llama-server) rarely expose context_length via /models probing, causing Hermes to fall back to artificially low values (e.g. 32K). This blocks cron jobs and any scenario where the model's true context window (e.g. 64K) can't be auto-detected.

The existing Step 0b (custom_providers per-model override) requires base_url matching, which is unavailable for cron jobs — their base_url is resolved later in the AIAgent init sequence.

Solution

Step 0c reads providers.<name>.context_length directly from config.yaml, independently of the custom_providers list. It also handles the custom provider alias — when the full provider name (e.g. custom:qwj-local) gets normalized to bare custom during AIAgent init, it falls back to prefix-matching providers.custom:* keys.
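
The lookup described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the function name `provider_context_length`, its signature, and the "first matching `custom:*` key wins" rule are assumptions, and the `providers` mapping stands in for the already-parsed `providers` section of config.yaml.

```python
from typing import Any, Mapping, Optional


def provider_context_length(
    providers: Mapping[str, Any], provider: str
) -> Optional[int]:
    """Return a per-provider context_length override, or None if absent.

    Hypothetical helper: name and signature are illustrative only.
    """

    def read(entry: Any) -> Optional[int]:
        # Accept only positive integers; anything else counts as "not set".
        if not isinstance(entry, Mapping):
            return None
        value = entry.get("context_length")
        if isinstance(value, int) and not isinstance(value, bool) and value > 0:
            return value
        return None

    # 1. Exact match on the full provider name, e.g. "custom:qwj-local".
    exact = read(providers.get(provider))
    if exact is not None:
        return exact

    # 2. Alias fallback: the name was normalized to bare "custom", so scan
    #    the "custom:*" keys and take the first one that sets a value
    #    (assumed tie-breaking rule).
    if provider == "custom":
        for name, entry in providers.items():
            if isinstance(name, str) and name.startswith("custom:"):
                found = read(entry)
                if found is not None:
                    return found
    return None


providers = {
    "custom:qwj-local": {
        "base_url": "http://192.168.0.68:18081/v1",
        "api_key": "dummy",
        "context_length": 65536,
    }
}
print(provider_context_length(providers, "custom:qwj-local"))  # 65536
print(provider_context_length(providers, "custom"))            # 65536
print(provider_context_length(providers, "deepseek"))          # None
```

With more than one `custom:*` entry, a prefix match on bare `custom` is ambiguous. This sketch simply returns the first entry that sets a value, which may or may not match what the PR does.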

Test Plan

  • Direct function test: get_model_context_length() with provider=custom:qwj-local and config_context_length=None returns 65536 (from providers.custom:qwj-local.context_length: 65536)
  • Cron job test: previously failing with 32K error, now runs successfully with Qwen 27B (64K context)
  • Non-custom providers unaffected: DeepSeek continues to auto-detect 1M context
  • Fallback intact: if no provider-level config, falls through to existing probing chain
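
The fallthrough behaviour these checks exercise might look like the sketch below. Everything here is illustrative: `resolve_context_length`, the `probe` callback, the ordering of the steps, and the `ASSUMED_DEFAULT` value (standing in for the 32K fallback mentioned above) are assumptions rather than the actual signature of `get_model_context_length()`.

```python
from typing import Callable, Mapping, Optional

ASSUMED_DEFAULT = 32768  # hypothetical stand-in for the low "32K" fallback


def resolve_context_length(
    provider: str,
    providers: Mapping[str, Mapping[str, object]],
    probe: Callable[[], Optional[int]],
    config_context_length: Optional[int] = None,
) -> int:
    # An explicit value passed by the caller wins (assumed ordering).
    if config_context_length is not None:
        return config_context_length
    # Provider-level override; exact key match only in this sketch.
    override = providers.get(provider, {}).get("context_length")
    if isinstance(override, int) and not isinstance(override, bool) and override > 0:
        return override
    # No override: fall through to probing, then to the default.
    probed = probe()
    return probed if probed is not None else ASSUMED_DEFAULT


providers = {"custom:qwj-local": {"context_length": 65536}}
no_probe = lambda: None  # simulates an endpoint that reports nothing

assert resolve_context_length("custom:qwj-local", providers, no_probe) == 65536
# 1_000_000 is a placeholder for whatever auto-detection reports.
assert resolve_context_length("deepseek", providers, lambda: 1_000_000) == 1_000_000
assert resolve_context_length("other", providers, no_probe) == ASSUMED_DEFAULT
```

The point of the sketch is only that a missing provider-level entry leaves the existing detection path untouched.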

Configuration Example

providers:
  custom:qwj-local:
    base_url: http://192.168.0.68:18081/v1
    api_key: dummy
    context_length: 65536

Add Step 0c to get_model_context_length() that reads per-provider
context_length overrides from config.yaml.

Custom endpoints (e.g. llama-server) rarely expose context_length via
/models probing, leading to artificially low fallback values (e.g. 32K).
This reads the per-provider override from providers.<name>.context_length
independently of the custom_providers list (which requires base_url
matching — unavailable for cron jobs whose base_url is resolved later).

Also handles the 'custom' provider alias — when the full provider name
(e.g. 'custom:qwj-local') is normalized to bare 'custom' during AIAgent
init, fall back to prefix-matching providers.custom:* keys.
alt-glitch added labels on May 6, 2026: type/feature (New feature or request), P2 (Medium, degraded but workaround exists), comp/agent (Core agent loop, run_agent.py, prompt builder), area/config (Config system, migrations, profiles)