fix(context-compression): fallback to main model when summary_model_override is None and provider returns 413 by vominh1919 · Pull Request #18603 · NousResearch/hermes-agent

vominh1919 · 2026-05-02T01:29:13Z

Problem

When summary_model_override is not configured (the default, None), self.summary_model is set to an empty string ("") which is falsy. The fallback conditions in _generate_summary() both check self.summary_model:

# Line 906 (model-not-found fast path)
if (
    _is_model_not_found
    and self.summary_model          # ← falsy when ""
    and self.summary_model != self.model
    ...
):

# Line 939 (unknown-error best-effort retry)
if (
    self.summary_model              # ← falsy when ""
    and self.summary_model != self.model
    ...
):

When self.summary_model is empty, both conditions short-circuit, so no fallback to the main model ever happens. Users with no explicit summary_model_override therefore get no compression at all once the default provider hits its rate limit (413, TPM exhausted), because the summary is never retried on the main model.
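The short-circuit can be reproduced in isolation. The snippet below is a minimal sketch, not the project's code: `should_fallback_old` is a hypothetical helper that only mirrors the shape of the two guards quoted above (a truthiness check followed by an inequality check), with the elided conditions left out.

```python
def should_fallback_old(summary_model: str, main_model: str) -> bool:
    """Hypothetical stand-in for the original guard shape.

    The truthiness check comes first, so an empty summary_model
    ends the evaluation before the models are even compared.
    """
    return bool(summary_model and summary_model != main_model)


# Empty string (the value used when summary_model_override is None): no fallback.
print(should_fallback_old("", "main-model"))            # False
# Explicit override that differs from the main model: fallback allowed.
print(should_fallback_old("other", "main-model"))       # True
# Override identical to the main model: nothing to fall back to.
print(should_fallback_old("main-model", "main-model"))  # False
```

The first call is the case this PR targets: the guard rejects the fallback purely because the string is empty, regardless of what error the provider returned.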

Fix

- Add _is_rate_limited check: detects 413 status, "rate limit", "TPM", "tokens per minute" error strings
- Broaden fallback conditions: allow fallback when summary_model is empty AND the error is rate-limit or model-not-found (not generic errors, to avoid pointless retries when the default provider IS the main model)
- Explicit model on retry: when falling back from empty summary_model, set it to self.model so the retry explicitly uses the main model instead of the default provider that may differ
- Add regression test: test_empty_summary_model_413_falls_back_to_main verifies the 413 → fallback → success flow
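The first three items of the fix can be sketched roughly as follows. This is an illustration under assumptions, not the code in the diff: the function names, signatures, and the `RATE_LIMIT_MARKERS` tuple are hypothetical, and the real `_is_rate_limited` check may inspect the error differently (for example by matching "413" inside the error string).

```python
from typing import Optional

# Assumed marker list, taken from the strings named in the PR description.
RATE_LIMIT_MARKERS = ("rate limit", "tpm", "tokens per minute")


def is_rate_limited(status_code: Optional[int], message: str) -> bool:
    """Return True for a 413 status or a rate-limit-looking error message."""
    if status_code == 413:
        return True
    text = message.lower()
    return any(marker in text for marker in RATE_LIMIT_MARKERS)


def should_fallback(summary_model: str, main_model: str, *,
                    rate_limited: bool, model_not_found: bool) -> bool:
    """Broadened guard: an empty summary_model no longer blocks the fallback."""
    if summary_model:
        # Explicit override: retry on main whenever the two models differ.
        return summary_model != main_model
    # No override: only retry for errors where the main model may still work.
    return rate_limited or model_not_found


def model_for_retry(summary_model: str, main_model: str) -> str:
    """On fallback, always name the main model explicitly."""
    return main_model


print(is_rate_limited(413, ""))  # True
print(should_fallback("", "main-model", rate_limited=True, model_not_found=False))   # True
print(should_fallback("", "main-model", rate_limited=False, model_not_found=False))  # False
```

Restricting the empty-override branch to rate-limit and model-not-found errors follows the reasoning in the list above: for a generic error the default provider may well be the main model, so a retry would just repeat the failure.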

Before vs After

| Scenario | Before | After |
| --- | --- | --- |
| `summary_model_override=None`, provider returns 413 | ❌ No retry, returns None | ✅ Retries on main model |
| `summary_model_override=None`, provider returns 404 | ❌ No retry, returns None | ✅ Retries on main model |
| `summary_model_override=None`, generic error | ❌ No retry, returns None | ❌ No retry (correct: same provider) |
| `summary_model_override="other"`, any error | ✅ Retries on main | ✅ Retries on main (unchanged) |

Tests

All 67 existing tests pass, and 1 new regression test was added.
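For readers who want to see the shape of the 413 → fallback → success flow, here is a self-contained sketch. It does not reproduce the actual regression test or the real compressor class: `ToyCompressor`, `FakeProviderError`, and the `call_llm` callable are hypothetical stand-ins, and passing `None` as the model is just this sketch's way of saying "use the default provider".

```python
class FakeProviderError(Exception):
    """Hypothetical provider error carrying an HTTP-like status code."""

    def __init__(self, status_code, message):
        super().__init__(message)
        self.status_code = status_code


class ToyCompressor:
    """Toy model of the summary path; not the project's real class."""

    def __init__(self, model, call_llm, summary_model_override=None):
        self.model = model
        self.summary_model = summary_model_override or ""
        self._call_llm = call_llm

    def generate_summary(self, text):
        try:
            # Empty summary_model means "let the default provider decide".
            return self._call_llm(self.summary_model or None, text)
        except FakeProviderError as exc:
            if not self.summary_model and exc.status_code == 413:
                # Fall back and name the main model explicitly for the retry.
                self.summary_model = self.model
                return self._call_llm(self.summary_model, text)
            return None


def test_empty_summary_model_413_falls_back_to_main():
    calls = []

    def fake_llm(model, text):
        calls.append(model)
        if model is None:
            raise FakeProviderError(413, "tokens per minute exceeded")
        return f"summary from {model}"

    compressor = ToyCompressor("main-model", fake_llm)
    assert compressor.generate_summary("long history") == "summary from main-model"
    assert calls == [None, "main-model"]


test_empty_summary_model_413_falls_back_to_main()
```

Recording the models passed to the fake LLM makes the test check both halves of the behavior: that a retry happened, and that the retry named the main model rather than hitting the default provider again.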

Fixes #18588

…rovider returns 413

When summary_model_override is not configured (None), self.summary_model is set to empty string which is falsy. The fallback conditions at lines 906-911 and 939-943 both check self.summary_model, so when it is empty, no fallback happens, even for rate-limit (413) or model-not-found errors.

This means users with no explicit summary_model_override configured get zero compression once the default provider hits rate limits, because there is no retry on the main model.

Fix:
- Add _is_rate_limited check (413, rate limit, TPM, tokens per minute)
- Allow fallback when summary_model is empty AND error is rate-limit or model-not-found (not generic errors, to avoid pointless retries)
- When falling back from empty summary_model, explicitly set it to self.model so the retry uses the main model instead of the default provider that may differ
- Add test_empty_summary_model_413_falls_back_to_main regression test

Fixes NousResearch#18588

Community PRs applied:
- NousResearch#18596: Enable secret redaction by default (SECURITY)
- NousResearch#18650: Sanitize malformed tool messages + auto-recover on API 400
- NousResearch#18607: Emergency compression before max_iterations exhaustion
- NousResearch#18603: Compression fallback to main model on 413 rate limit
- NousResearch#18638: Pass threshold_percent on model switch
- NousResearch#18663: Strip extra_content from tool_calls for strict APIs
- NousResearch#18618: Forward explicit_api_key to OpenRouter
- NousResearch#18632: Show cache tokens in /insights breakdown
- NousResearch#18614: Add idempotency guard for patch duplicate loops
- NousResearch#18600: Raise ValueError when HERMES_HOME unset in profile mode
- NousResearch#18616: Allow ZWJ emoji in context files
- NousResearch#18582: Reload .env on /restart
- NousResearch#18547: Stabilize system prompt prefix for KV cache reuse
- NousResearch#18692: Strip FTS5 operators from session search truncation terms

Fix: Add order_by_last_active=True to list_sessions_rich call (pre-existing commit 142b4bf code sync)

vominh1919 mentioned this pull request May 2, 2026

fix(context-compression): no fallback to main model when summary_model_override is None and Groq returns 413 #18588

Open

alt-glitch added the labels type/bug (Something isn't working), P2 (Medium: degraded but workaround exists), and comp/agent (Core agent loop, run_agent.py, prompt builder) on May 2, 2026

vominh1919 wants to merge 1 commit into NousResearch:main from vominh1919:fix/context-compression-fallback-empty-model
