fix(agent): stop probe stepdown on context overflow without provider limit (salvage #33673) #33826

Merged: teknium1 merged 3 commits into main from hermes/hermes-cdfce00c on May 28, 2026

@teknium1 teknium1 commented May 28, 2026

Contributor

Summary

A configured context_length no longer silently shrinks when a provider returns a generic context-overflow error with no concrete max. Compression handles the long-conversation case; the configured window is preserved.

Root cause: when parse_context_limit_from_error() couldn't extract a number, the overflow handler fell back to get_next_probe_tier(old_ctx) — turning a 1M session into 256K → 128K → 64K on repeated overflows with no config change.
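
To make the root cause concrete, here is a minimal sketch of the old fallback. Everything in it is an assumption for illustration: the tier table, the error-message format, and the function names `parse_limit_sketch`, `next_probe_tier_sketch`, and `old_overflow_handler` are hypothetical stand-ins, chosen only to reproduce the 1M → 256K → 128K → 64K sequence described above. They are not the actual Hermes implementations.

```python
import re
from typing import Optional

# Hypothetical probe tiers in descending order (assumed values, not Hermes's real table).
PROBE_TIERS = [1_000_000, 256_000, 128_000, 64_000]


def parse_limit_sketch(message: str) -> Optional[int]:
    """Extract a concrete limit from an error message, if one is present.

    The message format matched here is an assumption for this example.
    """
    match = re.search(r"maximum context length is (\d+)", message)
    return int(match.group(1)) if match else None


def next_probe_tier_sketch(current: int) -> Optional[int]:
    """Return the largest tier strictly below `current`, or None at the floor."""
    for tier in PROBE_TIERS:
        if tier < current:
            return tier
    return None


def old_overflow_handler(context_length: int, error_message: str) -> int:
    """Old behavior: guess a lower tier when the provider gives no number."""
    parsed = parse_limit_sketch(error_message)
    if parsed is not None:
        return parsed
    # No number in the error: fall back to a guessed tier (the bug).
    guessed = next_probe_tier_sketch(context_length)
    return guessed if guessed is not None else context_length


# Three generic overflows in a row shrink a 1M session with no config change.
ctx = 1_000_000
history = [ctx]
for _ in range(3):
    ctx = old_overflow_handler(ctx, "context window exceeded")
    history.append(ctx)
```

In this sketch, `history` records the window after each overflow, which shows why repeated generic errors were enough to erode the configured value.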

Changes

  • agent/model_metadata.py: new get_context_length_from_provider_error() returning a provider-reported lower limit or None.
  • agent/conversation_loop.py: drop the get_next_probe_tier() fallback in the overflow recovery path. Keep context_length and compress when no provider limit is reported; still use the parsed limit when present; preserve the Minimax delta-only branch.
  • tests/test_ctx_halving_fix.py: TestContextOverflowLimitSelection covers generic overflow without a limit, explicit provider limit, and reported-limit ≥ current.
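
The selection rule described in the changes above can be sketched as follows. This is an illustration under stated assumptions, not the code in this PR: `provider_lower_limit` is a hypothetical stand-in for `get_context_length_from_provider_error()` (whose real signature is not shown here), the error-message format is assumed, and `context_length_after_overflow` is an invented name for the recovery step. The Minimax delta-only branch is not modeled.

```python
import re
from typing import Optional


def provider_lower_limit(error_message: str, current: int) -> Optional[int]:
    """Return a provider-reported limit only when it is lower than `current`.

    Hypothetical stand-in; the message format matched here is an assumption.
    """
    match = re.search(r"maximum context length is (\d+)", error_message)
    if match is None:
        return None  # generic overflow, no concrete number reported
    reported = int(match.group(1))
    return reported if reported < current else None


def context_length_after_overflow(error_message: str, current: int) -> int:
    """Pick the context length to use after an overflow error.

    With no usable provider limit, the configured window is kept and the
    caller is expected to compress the conversation instead.
    """
    limit = provider_lower_limit(error_message, current)
    if limit is None:
        return current  # keep the configured window; no guessed tier
    return limit  # trust the provider's explicit, lower limit


# The three cases named for TestContextOverflowLimitSelection, in sketch form.
generic = context_length_after_overflow("context window exceeded", 1_000_000)
explicit = context_length_after_overflow(
    "maximum context length is 200000 tokens", 1_000_000
)
not_lower = context_length_after_overflow(
    "maximum context length is 2000000 tokens", 1_000_000
)
```

The design point is that the window only ever moves on evidence from the provider; a missing number is treated as "no information" rather than as a reason to shrink.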

Validation

| Test file | Result |
| --- | --- |
| tests/test_ctx_halving_fix.py | 27 passed |
| tests/agent/test_model_metadata.py | 102 passed |
| tests/agent/test_context_compressor.py | 83 passed |
| tests/run_agent/test_1630_context_overflow_loop.py | 16 passed |
| tests/run_agent/test_long_context_tier_429.py | 15 passed |

Salvage of #33673 by @yangguangjin onto current main. Authorship preserved via rebase-merge. Fixes #33669, supersedes the narrower #14953 by @atmigtnca (credit retained).

Infographic: context-overflow-trust-provider

github-actions Bot commented May 28, 2026

Contributor

🔎 Lint report: hermes/hermes-cdfce00c vs origin/main

ruff

Total: 0 on HEAD, 0 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 0 pre-existing issues carried over.

ty (type checker)

Total: 9544 on HEAD, 9544 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 5029 pre-existing issues carried over.

Diagnostics are surfaced as warnings — this check never fails the build.

@xg-gh-25

Re: context overflow without provider limit

This salvages #33673's approach — appreciated. The probe stepdown logic needs to know when to stop, not just how. Without a provider-reported limit, you're flying blind.

SwarmAI hits similar boundary cases with scheduled agents that load variable context (GitHub history + code scans). Our guard:

# Rough estimate before expensive API call
if estimated_tokens > KNOWN_LIMIT * 0.9:
    trigger_compression()

But "KNOWN_LIMIT" is hardcoded per-provider. If Hermes can gracefully degrade (salvage what fits, defer the rest), that's a better pattern than hard failure.
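
A runnable version of that guard might look like the sketch below. The four-characters-per-token estimate, the `KNOWN_LIMIT` value, and the names `estimate_tokens` and `should_compress` are all assumptions for illustration; they do not come from SwarmAI or Hermes.

```python
from typing import Iterable

# Hypothetical hardcoded per-provider limit (assumed value for this example).
KNOWN_LIMIT = 200_000


def estimate_tokens(text: str) -> int:
    """Crude token estimate: roughly four characters per token (an assumption)."""
    return len(text) // 4


def should_compress(
    messages: Iterable[str],
    limit: int = KNOWN_LIMIT,
    threshold: float = 0.9,
) -> bool:
    """Return True when the estimated size exceeds `threshold` of the limit."""
    estimated = sum(estimate_tokens(m) for m in messages)
    return estimated > limit * threshold


# Small illustrative inputs against a 100-token limit.
small = should_compress(["a" * 4] * 10, limit=100)  # about 10 tokens
large = should_compress(["a" * 400], limit=100)  # about 100 tokens
```

A caller would run this check before the API request and compress when it returns True, which only helps if the hardcoded limit matches what the provider actually enforces.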

Key question: Does this PR preserve ordering when truncating? If context is pruned oldest-first (FIFO), you lose the earliest (often most important) context. Priority-weighted pruning would be safer for long-running agents.


Contributed by SwarmAI. Discussion: T-MEM

alt-glitch added the labels type/bug (Something isn't working), comp/agent (Core agent loop, run_agent.py, prompt builder), and P2 (Medium: degraded but workaround exists) on May 28, 2026
yanghd and others added 3 commits May 28, 2026 11:58
…vior

The old test asserted that a non-MiniMax provider returning a generic
overflow (no provider-reported max) would step down to the 128K probe
tier. The salvaged fix from #33673 deliberately removes that step-down
because guessed tiers cause configured 1M sessions to silently shrink.

Update the test to assert the new contract: keep the configured 200K
window and rely on compression instead.
teknium1 force-pushed the hermes/hermes-cdfce00c branch from 6e4278c to 4139c41 on May 28, 2026 19:01
teknium1 merged commit 321ce94 into main on May 28, 2026 (34 of 40 checks passed)
teknium1 deleted the hermes/hermes-cdfce00c branch on May 28, 2026 19:26
Linked issue: Context overflow without provider max should not probe-step context length
