fix(compressor): pass threshold_percent through update_model on model switch #18638

Open
liuhao1024 wants to merge 1 commit into NousResearch:main from liuhao1024:fix/issue-18617-threshold-percent-model-switch
Conversation

@liuhao1024 liuhao1024 commented May 2, 2026

Contributor

What does this PR do?

When the model changes at runtime (via /model command, fallback activation, or primary restore), ContextCompressor.update_model() recalculates threshold_tokens from context_length × threshold_percent. But threshold_percent was never passed through — it retained the value from whichever model the compressor was initialized with.

For example, switching from a 200K-context model (70% threshold = 140K tokens) to a 1M-context model still applies 70%, giving a 700K-token threshold, even if a different percentage was configured or expected for the new model.
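The arithmetic above can be reproduced with a minimal sketch. `CompressorSketch` and `update_model_old` are hypothetical stand-ins written for illustration, not the real `ContextCompressor`; the use of `round()` is also an assumption.

```python
class CompressorSketch:
    """Hypothetical stand-in for ContextCompressor (illustration only)."""

    def __init__(self, context_length: int, threshold_percent: float) -> None:
        self.context_length = context_length
        self.threshold_percent = threshold_percent
        self.threshold_tokens = round(context_length * threshold_percent)

    def update_model_old(self, context_length: int) -> None:
        # Pre-fix behaviour as described: the context length changes, but
        # threshold_percent keeps whatever value it was initialized with.
        self.context_length = context_length
        self.threshold_tokens = round(context_length * self.threshold_percent)


compressor = CompressorSketch(context_length=200_000, threshold_percent=0.70)
before = compressor.threshold_tokens      # 140000

compressor.update_model_old(context_length=1_000_000)
after = compressor.threshold_tokens       # 700000, still based on 70%

print(before, after)
```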

Related Issue

Closes NousResearch#18617

Type of Change

  • 🐛 Bug fix (non-breaking change that fixes an issue)

Changes Made

  • agent/context_compressor.py: Add optional threshold_percent parameter to update_model(). When provided, updates self.threshold_percent before recalculating threshold_tokens.
  • run_agent.py, switch_model(): Pass the compressor's current threshold_percent through update_model() so it's preserved across model switches.
  • run_agent.py, _try_activate_fallback(): Same change; pass threshold_percent so fallback doesn't lose the configured value.
  • run_agent.py, _restore_primary_runtime(): Read compressor_threshold_percent from the saved _primary_runtime dict and pass it to update_model(). Also save threshold_percent in _primary_runtime during switch_model().
  • tests/agent/test_update_model_threshold_percent.py: 5 tests covering the new parameter, backward compatibility (no-op when not passed), threshold floor preservation, and token recalculation.
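A rough sketch of the optional-parameter change and the backward-compatibility behaviour the tests describe. The class below is a simplified, hypothetical model of the compressor; the real `update_model()` signature, any other arguments it takes, and the threshold floor are not shown in this PR text and are left out here.

```python
from typing import Optional


class CompressorSketch:
    """Hypothetical, simplified model of the compressor (not the real class)."""

    def __init__(self, context_length: int, threshold_percent: float) -> None:
        self.context_length = context_length
        self.threshold_percent = threshold_percent
        self.threshold_tokens = self._compute_threshold()

    def _compute_threshold(self) -> int:
        # Assumed helper: context_length x threshold_percent, rounded.
        return round(self.context_length * self.threshold_percent)

    def update_model(
        self,
        context_length: int,
        threshold_percent: Optional[float] = None,
    ) -> None:
        self.context_length = context_length
        # Only overwrite the stored percentage when the caller provides one,
        # so existing call sites that omit it behave as before.
        if threshold_percent is not None:
            self.threshold_percent = threshold_percent
        self.threshold_tokens = self._compute_threshold()


# New parameter: percentage is updated before the token recalculation.
with_percent = CompressorSketch(200_000, 0.70)
with_percent.update_model(1_000_000, threshold_percent=0.50)

# Backward compatibility: omitting the parameter keeps the old percentage.
without_percent = CompressorSketch(200_000, 0.70)
without_percent.update_model(1_000_000)
```

Defaulting the parameter to `None` rather than a number lets the method tell "not passed" apart from an explicit value.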

How to Test

  1. Run pytest tests/ -q — all tests should pass
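  2. Verify the scenario described above: switch models at runtime (for example with /model) and confirm that threshold_tokens is recalculated from the new context length and the threshold_percent passed to update_model()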

Checklist

Code

  • I've read the Contributing Guide
  • My commit messages follow Conventional Commits (fix(scope):, feat(scope):, etc.)
  • I searched for existing PRs to make sure this isn't a duplicate
  • My PR contains only changes related to this fix/feature (no unrelated commits)
  • I've run pytest tests/ -q and all tests pass
  • I've added tests for my changes (required for bug fixes, strongly encouraged for features)
  • I've tested on my platform: macOS 26.4.1

Documentation & Housekeeping

  • I've updated relevant documentation (README, docs/, docstrings) — or N/A
  • I've updated cli-config.yaml.example if I added/changed config keys — or N/A
  • I've updated CONTRIBUTING.md or AGENTS.md if I changed architecture and workflows — or N/A
  • I've considered cross-platform impact (Windows, macOS) per the compatibility guide — or N/A

… switch

When the model changes at runtime (via /model command, fallback activation,
or primary restore), ContextCompressor.update_model() recalculates
threshold_tokens from context_length × threshold_percent. But
threshold_percent was never updated — it retained the old model's value.

This commit:
- Adds threshold_percent parameter to update_model()
- Passes the current threshold_percent at all three model-switch call sites
- Saves threshold_percent in _primary_runtime for restore
- Uses the saved value when restoring primary runtime

Closes NousResearch#18617
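The save-and-restore flow in the commit message might look roughly like the sketch below. Only the `compressor_threshold_percent` key and the `_primary_runtime` name come from the PR text; `AgentSketch`, `CompressorSketch`, the `context_length` key, the public method names, and the method signatures are assumptions made for illustration.

```python
from typing import Any, Dict, Optional


class CompressorSketch:
    """Hypothetical stand-in for ContextCompressor."""

    def __init__(self, context_length: int, threshold_percent: float) -> None:
        self.context_length = context_length
        self.threshold_percent = threshold_percent
        self.threshold_tokens = round(context_length * threshold_percent)

    def update_model(self, context_length: int,
                     threshold_percent: Optional[float] = None) -> None:
        self.context_length = context_length
        if threshold_percent is not None:
            self.threshold_percent = threshold_percent
        self.threshold_tokens = round(context_length * self.threshold_percent)


class AgentSketch:
    """Hypothetical agent; only the threshold bookkeeping is modelled."""

    def __init__(self, compressor: CompressorSketch) -> None:
        self.compressor = compressor
        self._primary_runtime: Dict[str, Any] = {}

    def switch_model(self, context_length: int) -> None:
        percent = self.compressor.threshold_percent
        # Save the percentage alongside the primary runtime for later restore.
        self._primary_runtime = {
            "context_length": context_length,  # assumed key
            "compressor_threshold_percent": percent,
        }
        self.compressor.update_model(context_length, threshold_percent=percent)

    def try_activate_fallback(self, context_length: int) -> None:
        # Pass the current percentage through so the fallback keeps it.
        self.compressor.update_model(
            context_length, threshold_percent=self.compressor.threshold_percent
        )

    def restore_primary_runtime(self) -> None:
        saved = self._primary_runtime
        self.compressor.update_model(
            saved["context_length"],
            threshold_percent=saved.get("compressor_threshold_percent"),
        )


agent = AgentSketch(CompressorSketch(200_000, 0.70))
agent.switch_model(1_000_000)
agent.try_activate_fallback(128_000)
fallback_tokens = agent.compressor.threshold_tokens
agent.restore_primary_runtime()
restored_tokens = agent.compressor.threshold_tokens
```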
@alt-glitch alt-glitch added type/bug Something isn't working P2 Medium — degraded but workaround exists comp/agent Core agent loop, run_agent.py, prompt builder labels May 2, 2026
Cyrene963 pushed a commit to Cyrene963/hermes-agent that referenced this pull request May 3, 2026
Community PRs applied:
- NousResearch#18596: Enable secret redaction by default (SECURITY)
- NousResearch#18650: Sanitize malformed tool messages + auto-recover on API 400
- NousResearch#18607: Emergency compression before max_iterations exhaustion
- NousResearch#18603: Compression fallback to main model on 413 rate limit
- NousResearch#18638: Pass threshold_percent on model switch
- NousResearch#18663: Strip extra_content from tool_calls for strict APIs
- NousResearch#18618: Forward explicit_api_key to OpenRouter
- NousResearch#18632: Show cache tokens in /insights breakdown
- NousResearch#18614: Add idempotency guard for patch duplicate loops
- NousResearch#18600: Raise ValueError when HERMES_HOME unset in profile mode
- NousResearch#18616: Allow ZWJ emoji in context files
- NousResearch#18582: Reload .env on /restart
- NousResearch#18547: Stabilize system prompt prefix for KV cache reuse
- NousResearch#18692: Strip FTS5 operators from session search truncation terms

Fix: Add order_by_last_active=True to list_sessions_rich call
(pre-existing commit 142b4bf code sync)