fix(agent): stop probe stepdown on context overflow without provider limit (salvage #33673) by teknium1 · Pull Request #33826 · NousResearch/hermes-agent

teknium1 · 2026-05-28T10:42:57Z

Summary

A configured context_length no longer silently shrinks when a provider returns a generic context-overflow error with no concrete max. Compression handles the long-conversation case; the configured window is preserved.

Root cause: when parse_context_limit_from_error() couldn't extract a number, the overflow handler fell back to get_next_probe_tier(old_ctx) — turning a 1M session into 256K → 128K → 64K on repeated overflows with no config change.
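To make the shrinkage concrete, here is a minimal sketch of the old fallback. The tier list and the body of `get_next_probe_tier` below are assumptions chosen only to reproduce the 1M → 256K → 128K → 64K sequence quoted above; the real implementation in hermes-agent may differ.

```python
from typing import List, Optional

# Hypothetical probe tiers, chosen only to reproduce the sequence quoted above.
ASSUMED_PROBE_TIERS = [1_000_000, 256_000, 128_000, 64_000]


def get_next_probe_tier(current: int) -> Optional[int]:
    """Return the largest assumed tier strictly below `current`, or None."""
    lower = [tier for tier in ASSUMED_PROBE_TIERS if tier < current]
    return max(lower) if lower else None


def simulate_old_fallback(configured: int, overflows: int) -> List[int]:
    """Replay the pre-fix behaviour: each generic overflow steps down one tier."""
    history = [configured]
    for _ in range(overflows):
        next_tier = get_next_probe_tier(history[-1])
        if next_tier is None:
            break  # already at the smallest assumed tier
        history.append(next_tier)
    return history
```

Under these assumptions, three generic overflows take a 1M window down to 64K even though the configuration never changed.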

Changes

agent/model_metadata.py: new get_context_length_from_provider_error() returning a provider-reported lower limit or None.
agent/conversation_loop.py: drop the get_next_probe_tier() fallback in the overflow recovery path. Keep context_length and compress when no provider limit is reported; still use the parsed limit when present; preserve the Minimax delta-only branch.
tests/test_ctx_halving_fix.py: TestContextOverflowLimitSelection covers generic overflow without a limit, explicit provider limit, and reported-limit ≥ current.
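
The selection rule described in the changes above can be sketched roughly as follows. This is an illustration, not the code in the PR: `lower_provider_limit` and `select_context_after_overflow` are hypothetical names, the `(context_length, window_changed)` return shape is an assumption, and treating a reported limit at or above the current window as "no limit" is inferred from the "provider-reported lower limit or None" wording. The Minimax delta-only branch is omitted.

```python
from typing import Optional, Tuple


def lower_provider_limit(reported_limit: Optional[int], current: int) -> Optional[int]:
    """Return the reported limit only if it is positive and below `current`."""
    if reported_limit is not None and 0 < reported_limit < current:
        return reported_limit
    return None


def select_context_after_overflow(
    current: int, reported_limit: Optional[int]
) -> Tuple[int, bool]:
    """Pick the context length to use after a context-overflow error.

    Returns (context_length, window_changed). In this sketch the caller
    is expected to compress the conversation in either case.
    """
    lower = lower_provider_limit(reported_limit, current)
    if lower is None:
        # Generic overflow, or a limit that is not lower: keep the window.
        return current, False
    # The provider reported a concrete, lower maximum: adopt it.
    return lower, True
```

The three cases covered by `TestContextOverflowLimitSelection` map onto this sketch as `(1_000_000, None)`, `(1_000_000, 200_000)`, and `(200_000, 400_000)`; only the second changes the window.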

Validation

| Test file | Result |
| --- | --- |
| tests/test_ctx_halving_fix.py | 27 passed |
| tests/agent/test_model_metadata.py | 102 passed |
| tests/agent/test_context_compressor.py | 83 passed |
| tests/run_agent/test_1630_context_overflow_loop.py | 16 passed |
| tests/run_agent/test_long_context_tier_429.py | 15 passed |

Salvage of #33673 by @yangguangjin onto current main. Authorship preserved via rebase-merge. Fixes #33669, supersedes the narrower #14953 by @atmigtnca (credit retained).

Infographic

github-actions · 2026-05-28T10:43:45Z

🔎 Lint report: `hermes/hermes-cdfce00c` vs `origin/main`

ruff

Total: 0 on HEAD, 0 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 0 pre-existing issues carried over.

ty (type checker)

Total: 9544 on HEAD, 9544 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 5029 pre-existing issues carried over.

Diagnostics are surfaced as warnings — this check never fails the build.

xg-gh-25 · 2026-05-28T10:49:23Z

Re: context overflow without provider limit

This salvages #33673's approach — appreciated. The probe stepdown logic needs to know when to stop, not just how. Without a provider-reported limit, you're flying blind.

SwarmAI hits similar boundary cases with scheduled agents that load variable context (GitHub history + code scans). Our guard:

# Rough estimate before expensive API call
if estimated_tokens > KNOWN_LIMIT * 0.9:
    trigger_compression()

But "KNOWN_LIMIT" is hardcoded per-provider. If Hermes can gracefully degrade (salvage what fits, defer the rest), that's a better pattern than hard failure.
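
A self-contained version of that guard might look like the sketch below. Everything in it is an assumption made for illustration: the `KNOWN_LIMITS` table and its single entry, the 4-characters-per-token estimate, and the function names. It is not SwarmAI's or Hermes's actual code.

```python
from typing import List

# Hypothetical per-provider limits; real values depend on the provider.
KNOWN_LIMITS = {"example-provider": 200_000}
SAFETY_FRACTION = 0.9  # compress once the estimate passes 90% of the limit


def estimate_tokens(messages: List[str]) -> int:
    """Rough estimate: about 4 characters per token (a heuristic, not a tokenizer)."""
    return sum(len(message) for message in messages) // 4


def should_compress(messages: List[str], provider: str) -> bool:
    """Return True when the estimated size exceeds the safety threshold."""
    limit = KNOWN_LIMITS[provider]
    return estimate_tokens(messages) > limit * SAFETY_FRACTION
```

For example, `should_compress(["a" * 800_000], "example-provider")` estimates 200,000 tokens against a 180,000-token threshold and returns `True`.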

Key question: Does this PR preserve ordering when truncating? If probes are pruned LIFO, you lose the earliest (often most important) context. FIFO or priority-weighted would be safer for long-running agents.

Contributed by SwarmAI. Discussion: T-MEM

…vior

The old test asserted that a non-MiniMax provider returning a generic overflow (no provider-reported max) would step down to the 128K probe tier. The salvaged fix from #33673 deliberately removes that step-down because guessed tiers cause configured 1M sessions to silently shrink. Update the test to assert the new contract: keep the configured 200K window and rely on compression instead.

alt-glitch added labels on May 28, 2026: type/bug (Something isn't working), comp/agent (Core agent loop, run_agent.py, prompt builder), P2 (Medium: degraded but workaround exists)

yanghd and others added 3 commits May 28, 2026 11:58

fix: stop probe stepdown without provider context limit

f353e6e

chore: map yanghongda@jackyun.com -> yangguangjin in AUTHOR_MAP

1c6cfff

teknium1 force-pushed the hermes/hermes-cdfce00c branch from 6e4278c to 4139c41 Compare May 28, 2026 19:01

teknium1 merged commit 321ce94 into main May 28, 2026
34 of 40 checks passed

teknium1 deleted the hermes/hermes-cdfce00c branch May 28, 2026 19:26

This was referenced May 28, 2026

fix: stop probe stepdown without provider context limit #33673

Closed

fix: preserve explicit context window on generic overflow #14953

Closed

BrewTestBot mentioned this pull request May 29, 2026

hermes-agent 2026.5.29 Homebrew/homebrew-core#285204

Merged

1 task

github-actions Bot mentioned this pull request May 29, 2026

chore: bump NousResearch/hermes-agent version from v2026.5.16 to v2026.5.29 Docker-Hub-sirmark/docker-hermes-agent#7

Merged
