fix: align model context length across display and runtime #11437

Closed
leavrcn wants to merge 1 commit into NousResearch:main from leavrcn:fix/model-context-alignment

leavrcn commented Apr 17, 2026

Summary

  • align /model display context resolution with runtime precedence for custom providers and global overrides
  • refresh runtime context-length overrides safely across model switches and preserve the active override if config reload fails
  • unify gateway session-info and hygiene paths with the same resolver, plus add regression tests for negative values, bools, and provider-name collisions
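
The shared-resolver idea in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the function names `coerce_context_length` and `resolve_context_length`, the config shape, the `other-model` entry, the 128000 default, and the precedence order (per-model custom-provider value, then a global override, then a detected default) are all assumptions made for the example.

```python
from typing import Any, Optional


def coerce_context_length(value: Any) -> Optional[int]:
    """Return a positive int, or None if the value is unusable."""
    # bool is a subclass of int in Python, so reject it explicitly;
    # otherwise True would be accepted as a context length of 1.
    if isinstance(value, bool) or not isinstance(value, int):
        return None
    return value if value > 0 else None


def resolve_context_length(
    provider: str,
    model: str,
    custom_providers: list[dict[str, Any]],
    global_override: Any,
    detected_default: int,
) -> int:
    """Hypothetical resolver meant to be shared by display and runtime paths."""
    for entry in custom_providers:
        # Match on provider name AND model so that two entries sharing a
        # provider name cannot leak a value onto the wrong model.
        if entry.get("name") == provider and entry.get("model") == model:
            length = coerce_context_length(entry.get("context_length"))
            if length is not None:
                return length
    # Assumed precedence: fall back to the global override, then the default.
    length = coerce_context_length(global_override)
    if length is not None:
        return length
    return detected_default


# Hypothetical config: one valid entry and one invalid (negative) entry.
providers = [
    {"name": "yunfeiplus", "model": "gpt-5.4", "context_length": 524288},
    {"name": "yunfeiplus", "model": "other-model", "context_length": -1},
]
valid = resolve_context_length("yunfeiplus", "gpt-5.4", providers, None, 128000)
rejected = resolve_context_length("yunfeiplus", "other-model", providers, True, 128000)
print(valid, rejected)
```

If every call site (the /model display, the runtime, and the gateway paths) goes through one function like this, the displayed and applied values cannot drift apart, which is the property the PR is after.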

Verification

  • pytest -q tests/hermes_cli/test_model_switch_context_alignment.py tests/hermes_cli/test_model_switch_opencode_anthropic.py tests/gateway/test_model_command_custom_providers.py
  • python -m py_compile cli.py gateway/run.py hermes_cli/model_switch.py run_agent.py

Notes

  • leaves unrelated local modifications (for example gateway/config.py, package-lock.json, and unrelated docs drafts) out of this PR

leavrcn commented Apr 17, 2026

Author

Verification Result

After restarting hermes-gateway, I re-tested the LCM/context compression runtime behavior for the active gpt-5.4 session.

Expected

  • context_length = 524288
  • threshold_tokens = 393216

Runtime Check

Using lcm_status(), the active session reports:

  • context_length: 524288
  • threshold_tokens: 393216

This confirms the runtime values are applied correctly, not just present in config.

Real Compression Test

I then executed a direct ContextCompressor.compress() test in the Hermes project virtualenv with:

  • model: gpt-5.4
  • provider: yunfeiplus
  • threshold_percent = 0.75
  • config_context_length = 524288
  • current_tokens = threshold_tokens + 5000
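
The expected values follow directly from these settings. Here is a quick arithmetic check, assuming the threshold is simply the context length multiplied by threshold_percent and truncated to an integer. The variable names mirror the list above, but the formula itself is an assumption about how the compressor derives the threshold.

```python
context_length = 524288      # config_context_length from the test setup
threshold_percent = 0.75

# Assumed formula: the threshold is a fixed fraction of the context window.
threshold_tokens = int(context_length * threshold_percent)

# The test deliberately overshoots the threshold by 5000 tokens.
current_tokens = threshold_tokens + 5000

print(threshold_tokens)                   # 393216
print(current_tokens)                     # 398216
print(current_tokens > threshold_tokens)  # True
```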

Observed result:

  • context_length=524288
  • threshold_tokens=393216
  • input_messages=39
  • output_messages=33
  • summary_count=1
  • summary_prefix_ok=True

A valid compaction summary was generated with the expected prefix:

  • [CONTEXT COMPACTION — REFERENCE ONLY]

Conclusion

Verification passed.

Confirmed:

  1. LCM context_length is now 524288
  2. threshold_tokens is now 393216
  3. Real compression is functioning correctly when the token count exceeds threshold
  4. This is a runtime-confirmed result, not only a static config change

alt-glitch added the type/bug (Something isn't working), P2 (Medium, degraded but workaround exists), comp/cli (CLI entry point, hermes_cli/, setup wizard), comp/gateway (Gateway runner, session dispatch, delivery), and comp/agent (Core agent loop, run_agent.py, prompt builder) labels Apr 24, 2026
@alt-glitch

Collaborator

Overlaps with #12316 — both unify context-length resolution across runtime and /model display. Also related to #12380, #14382, and #8785.

@teknium1

Contributor

Thanks for the careful work on this, @leavrcn! The context-length alignment bug you identified was independently fixed on main after this PR was filed.

This is an automated hermes-sweeper review.

  • Commit 125de0205 (fix(context): honor custom_providers context_length on /model switch + bump probe tier to 256K, #15844) landed on main ~April 25 and addresses the same display/runtime divergence:
    • New hermes_cli.config.get_custom_provider_context_length() helper — single source of truth for per-model overrides
    • resolve_display_context_length() in hermes_cli/model_switch.py gained a custom_providers= kwarg (line 530)
    • AIAgent.switch_model() in run_agent.py re-reads custom_providers from live config on every /model switch (line 2149)
    • Two gateway/run.py call sites wired up (lines 5529-5530, 5676-5677)
    • Regression tests added: tests/hermes_cli/test_model_switch_context_display.py and tests/hermes_cli/test_custom_provider_context_length.py
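
The re-read-on-switch behaviour, combined with this PR's goal of keeping the active override when a config reload fails, might look something like the sketch below. The class `AgentSketch`, the `load_config` callable, and the config shape are hypothetical stand-ins for illustration. They are not the actual AIAgent.switch_model() or hermes_cli.config code.

```python
from typing import Any, Callable, Optional


class AgentSketch:
    """Hypothetical stand-in for an agent that tracks a context-length override."""

    def __init__(self, load_config: Callable[[], dict[str, Any]]) -> None:
        self._load_config = load_config
        self.model: Optional[str] = None
        self.context_length_override: Optional[int] = None

    def switch_model(self, provider: str, model: str) -> None:
        self.model = model
        try:
            # Re-read live config on every switch so edits made since
            # startup are picked up.
            config = self._load_config()
        except Exception:
            # Reload failed: keep whatever override is currently active
            # rather than silently dropping it.
            return
        self.context_length_override = self._lookup(config, provider, model)

    @staticmethod
    def _lookup(config: dict[str, Any], provider: str, model: str) -> Optional[int]:
        for entry in config.get("custom_providers", []):
            if entry.get("name") == provider and entry.get("model") == model:
                value = entry.get("context_length")
                # Reject bools (a subclass of int) and non-positive values.
                if isinstance(value, int) and not isinstance(value, bool) and value > 0:
                    return value
        return None


# Usage with an in-memory config standing in for the real config file.
config = {
    "custom_providers": [
        {"name": "yunfeiplus", "model": "gpt-5.4", "context_length": 524288},
    ]
}
agent = AgentSketch(lambda: config)
agent.switch_model("yunfeiplus", "gpt-5.4")
print(agent.context_length_override)
```

Injecting the loader as a callable keeps the sketch testable without touching the filesystem. The real code presumably reads the config file directly.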

Your test_model_switch_context_alignment.py (384 lines) isn't on main yet — if it covers additional edge cases (negative values, bool context_length, provider-name collisions), maintainers may want to cherry-pick those tests separately.
