fix(agent): preserve MiniMax context length on delta-only overflow errors #9170

Closed
maelrx wants to merge 1 commit into NousResearch:main from maelrx:fix/minimax-delta-overflow-context

maelrx (Contributor) commented Apr 13, 2026

Summary

This fixes a MiniMax-specific context overflow recovery bug in AIAgent.

MiniMax's Anthropic-compatible endpoint returns overflow errors like:

context window exceeds limit (2013)

The number in parentheses is only the overflow delta, not the actual context window. Hermes already knows MiniMax's real context length, but the recovery path treated this error as "no limit available" and incorrectly probed down to the next generic tier.

This PR keeps the known MiniMax context length intact for that provider-specific error format and compresses the conversation without demoting the model to a smaller inferred window.

Problem

Hermes handles context overflow in run_agent.py by:

  1. trying to parse the real limit from the provider error message
  2. if no real limit can be parsed, probing down via get_next_probe_tier(old_ctx)

That fallback is correct for genuinely unknown providers or endpoints, but it is wrong for MiniMax's delta-only overflow message.
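The two-step fallback described above can be sketched roughly as follows. This is an illustrative stand-in, not the code in run_agent.py: `PROBE_TIERS`, `next_probe_tier`, and `recover_context_length` are hypothetical names, and the tier values are assumptions chosen only so that 204800 falls to 128000 as reported in this PR.

```python
from typing import Optional

# Hypothetical tier ladder (assumption); Hermes's real tiers may differ.
PROBE_TIERS = (128_000, 64_000, 32_000, 16_000, 8_000)


def next_probe_tier(current: int) -> Optional[int]:
    """Stand-in for get_next_probe_tier: largest tier strictly below current."""
    for tier in PROBE_TIERS:
        if tier < current:
            return tier
    return None  # already at or below the smallest tier


def recover_context_length(old_ctx: int, parsed_limit: Optional[int]) -> int:
    """Generic recovery: trust a parsed limit, otherwise probe down one tier."""
    if parsed_limit is not None:
        return parsed_limit  # step 1: the provider told us the real limit
    fallback = next_probe_tier(old_ctx)  # step 2: no limit available
    return fallback if fallback is not None else old_ctx


# A delta-only error yields no parsed limit, so the generic path demotes
# a known 204800-token window to the next tier.
print(recover_context_length(204_800, None))
print(recover_context_length(204_800, 32_768))
```

Under these assumptions, the first call shows the unwanted demotion and the second shows the case where a real limit was reported and should be honored.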

In practice, this caused:

  • known MiniMax context length: 204800
  • fallback probe tier: 128000
  • earlier compression than intended
  • unnecessary reduction of the working context window for the rest of the turn

Because Hermes's in-loop ContextCompressor defaults to a 50% threshold, this effectively moved the compression trigger from 102400 tokens to 64000 tokens for the affected path.
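The arithmetic behind those two trigger points can be checked with a short sketch. It assumes the trigger is simply the threshold multiplied by the context length; `compression_trigger` is a hypothetical helper, not a Hermes function.

```python
COMPRESSION_THRESHOLD = 0.5  # the 50% default mentioned above


def compression_trigger(context_length: int,
                        threshold: float = COMPRESSION_THRESHOLD) -> int:
    """Token count at which compression would start (illustrative only)."""
    return int(context_length * threshold)


known = compression_trigger(204_800)    # MiniMax's known context length
demoted = compression_trigger(128_000)  # after the erroneous probe-down
print(known, demoted, known - demoted)  # 102400 64000 38400
```

In other words, the affected path started compressing 38400 tokens earlier than intended.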

Root Cause

The root cause was not the generic parser.

parse_context_limit_from_error() correctly returns None for:

context window exceeds limit (2013)

because the message does not contain the actual context window.
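To illustrate why returning None is the right answer here, the sketch below uses a deliberately narrow, hypothetical parser. It is not Hermes's parse_context_limit_from_error(); the pattern and the sample "maximum context length" message are assumptions used only to contrast a message that states a limit with one that states a delta.

```python
import re
from typing import Optional

# Hypothetical pattern (assumption): accept a number only when the message
# explicitly ties it to the maximum context length.
_LIMIT_RE = re.compile(r"maximum context length (?:is|of) (\d+)", re.IGNORECASE)


def parse_limit(message: str) -> Optional[int]:
    """Return the context limit stated in the message, or None if absent."""
    match = _LIMIT_RE.search(message)
    return int(match.group(1)) if match else None


# States the real window, so the limit can be extracted.
print(parse_limit("This model's maximum context length is 4096 tokens"))
# States only how far the request overflowed, so there is nothing to extract.
print(parse_limit("context window exceeds limit (2013)"))
```

Treating 2013 as a limit would be wrong, so a parser that declines to guess is behaving correctly; the decision about what to do next belongs to the caller.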

The real bug was in the recovery logic: AIAgent treated "no parsed limit" as a reason to probe down even when the provider was MiniMax and the model context was already known.

What Changed

1. MiniMax-specific recovery guard in run_agent.py

For MiniMax and MiniMax China provider resolution, and for MiniMax Anthropic-compatible base URLs, Hermes now detects the delta-only overflow format and:

  • keeps context_length unchanged
  • skips get_next_probe_tier()
  • compresses and retries normally
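A minimal sketch of such a guard is shown below, under stated assumptions. The helper names, the regular expression, and the substring check on the base URL are all hypothetical; the actual detection in run_agent.py may be stricter. Only the provider ids `minimax` and `minimax-cn` and the error text come from this PR thread.

```python
import re

MINIMAX_PROVIDERS = {"minimax", "minimax-cn"}

# Matches the delta-only format, e.g. "context window exceeds limit (2013)".
_DELTA_ONLY_RE = re.compile(r"context window exceeds limit \(\d+\)", re.IGNORECASE)


def is_minimax(provider: str, base_url: str = "") -> bool:
    """Assumption: provider id match, or 'minimax' appearing in the base URL."""
    return provider.lower() in MINIMAX_PROVIDERS or "minimax" in base_url.lower()


def should_keep_context_length(provider: str, base_url: str,
                               error_message: str, parsed_limit) -> bool:
    """True when the known context length should be kept and probing skipped."""
    return (
        parsed_limit is None
        and is_minimax(provider, base_url)
        and _DELTA_ONLY_RE.search(error_message) is not None
    )


msg = "context window exceeds limit (2013)"
print(should_keep_context_length("minimax", "", msg, None))   # True
print(should_keep_context_length("openai", "", msg, None))    # False
```

When the guard returns True, the caller would compress and retry with the unchanged context length; when it returns False, the generic probe-down path still applies.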

2. Parser behavior remains generic

I did not change the generic parser semantics for small valid context windows.

This is important because Hermes already supports valid provider error formats that may report context limits such as:

  • 32768
  • 4096

Those should continue to work as before.

3. Regression coverage added

Added tests to cover:

  • MiniMax delta-only overflow message returns None from the parser
  • MiniMax keeps 204800 and compresses instead of probing down
  • non-MiniMax providers still keep the generic probe-down behavior for the same text pattern

Why This Approach

This repo's architecture separates:

  • generic metadata parsing
  • provider runtime resolution
  • provider-specific recovery inside the main agent loop

Given that provider runtime selection is shared across CLI, gateway, cron, ACP, and auxiliary tasks, scoping this behavior to MiniMax is safer than weakening the generic overflow logic globally.

This keeps the PR focused and avoids regressions for other providers that may still need probe-down behavior when the real limit is unknown.

Files Changed

  • run_agent.py
  • tests/agent/test_model_metadata.py
  • tests/run_agent/test_run_agent.py

Testing

Targeted tests run locally on Windows with Python 3.12:

.venv\Scripts\python -m pytest -q tests/agent/test_model_metadata.py tests/run_agent/test_run_agent.py -k "minimax_delta_only_message_returns_none or minimax_delta_overflow_keeps_known_context_length or non_minimax_delta_overflow_still_probes_down or glm_prompt_exceeds_max_length_triggers_compression"
.venv\Scripts\python -m pytest -q tests/agent/test_error_classifier.py tests/agent/test_minimax_provider.py tests/run_agent/test_1630_context_overflow_loop.py tests/run_agent/test_413_compression.py

Results:

  • targeted MiniMax/parser/recovery coverage passed
  • related overflow/classifier/MiniMax suites passed

Scope / Non-Goals

This PR does not:

  • change generic context limit parsing rules
  • change gateway hygiene thresholds
  • change session storage behavior
  • change provider runtime resolution outside the MiniMax-specific recovery path

Rationale

This follows Hermes's existing pattern of making provider and error-class-specific recovery decisions inside AIAgent rather than broadening a generic parser in ways that could regress other providers.

@RuckVibeCodes

[gus-first-pass] This PR tackles a specific context overflow issue for the MiniMax provider, maintaining potential model performance levels. Here are points for your consideration:

  1. Provider-Specific Handling: Ensure the safeguards introduced for MiniMax do not inadvertently affect other providers during similar overflow scenarios. Conduct thorough regression testing across various contexts.
  2. Documentation: Clear documentation around the parsing logic variations between providers will enhance maintainability and future updates. Providing context here is essential for understanding different operational environments.

Great job addressing a critical area. Overall, I endorse these changes, but clarity and safeguarding are key.

maelrx (Contributor, Author) commented Apr 13, 2026

Thanks, agreed on both points.

This change is intentionally scoped to MiniMax only:

  • provider in {minimax, minimax-cn}
  • or MiniMax Anthropic-compatible base URLs

I also added regression coverage for both sides:

  • MiniMax delta-only overflow keeps the known context length
  • non-MiniMax providers with the same message pattern still follow the generic probe-down path

On documentation/maintainability: agreed as well. I opened a follow-up architecture issue to track the broader cleanup around provider-specific overflow semantics and base-vs-effective context handling: #9181.

teknium1 added a commit that referenced this pull request Apr 23, 2026
For release-notes attribution of PR #9170 (MiniMax context preservation).

@teknium1 (Contributor)

Merged via #14743: your commit was cherry-picked onto current main with authorship preserved via rebase-merge (the branch was 1,508 commits behind). Thanks for the fix; it resolves the MiniMax M2.7 context-halving bug reported in Discord. I also added you to AUTHOR_MAP in scripts/release.py for release-notes attribution.

teknium1 closed this Apr 23, 2026

alt-glitch added labels Apr 23, 2026:

  • type/bug: Something isn't working
  • P2: Medium, degraded but workaround exists
  • comp/agent: Core agent loop, run_agent.py, prompt builder
  • provider/minimax: MiniMax (Anthropic transport)

@alt-glitch (Collaborator)

Superseded by #14743, which salvaged this fix onto current main.
