fix: align model context length across display and runtime by leavrcn · Pull Request #11437 · NousResearch/hermes-agent

leavrcn · 2026-04-17T06:02:21Z

Summary

- Align /model display context resolution with runtime precedence for custom providers and global overrides
- Refresh runtime context-length overrides safely across model switches, and preserve the active override if config reload fails
- Unify the gateway session-info and hygiene paths on the same resolver, and add regression tests for negative values, bools, and provider-name collisions
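The summary above can be illustrated with a minimal sketch. This is not the code from this PR: the function names, the argument names, and the precedence order (global override first, then the custom-provider value, then a detected default) are all assumptions made for illustration. It shows why bools and negative values need explicit rejection, and one way to keep the active override when a reload fails.

```python
from typing import Any, Callable, Optional


def coerce_context_length(value: Any) -> Optional[int]:
    """Return a positive int, or None when the value is unusable (hypothetical helper)."""
    # bool is a subclass of int in Python, so reject it explicitly;
    # otherwise `context_length: true` would be read as 1.
    if isinstance(value, bool) or not isinstance(value, int):
        return None
    return value if value > 0 else None


def resolve_context_length(
    global_override: Any,
    provider_override: Any,
    detected_default: int,
) -> int:
    """Pick the first usable value. The order here is an assumption, not the PR's."""
    for candidate in (global_override, provider_override):
        coerced = coerce_context_length(candidate)
        if coerced is not None:
            return coerced
    return detected_default


def refresh_override(
    active: Optional[int],
    load_override: Callable[[], Any],
) -> Optional[int]:
    """Re-read the override on a model switch; keep the active one if loading fails."""
    try:
        fresh = load_override()
    except Exception:
        # Config reload failed: do not silently drop the value in use.
        return active
    return coerce_context_length(fresh)
```

Calling the same resolver from both the display path and the runtime path is what keeps the two from reporting different numbers.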

Verification

pytest -q tests/hermes_cli/test_model_switch_context_alignment.py tests/hermes_cli/test_model_switch_opencode_anthropic.py tests/gateway/test_model_command_custom_providers.py
python -m py_compile cli.py gateway/run.py hermes_cli/model_switch.py run_agent.py

Notes

- Leaves unrelated local modifications (for example gateway/config.py, package-lock.json, and unrelated docs drafts) out of this PR

leavrcn · 2026-04-17T07:45:20Z

Verification Result

After restarting hermes-gateway, I re-tested the LCM/context compression runtime behavior for the active gpt-5.4 session.

Expected

context_length = 524288
threshold_tokens = 393216

Runtime Check

Using lcm_status(), the active session reports:

context_length: 524288
threshold_tokens: 393216

This confirms the runtime values are applied correctly, not just present in config.

Real Compression Test

I then executed a direct ContextCompressor.compress() test in the Hermes project virtualenv with:

model: gpt-5.4
provider: yunfeiplus
threshold_percent = 0.75
config_context_length = 524288
current_tokens = threshold_tokens + 5000

Observed result:

context_length=524288
threshold_tokens=393216
input_messages=39
output_messages=33
summary_count=1
summary_prefix_ok=True

A valid compaction summary was generated with the expected prefix:

[CONTEXT COMPACTION — REFERENCE ONLY]
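The reported numbers are consistent with each other, which a short arithmetic sketch can confirm. The helper names below are hypothetical, and the strict `>` comparison is an assumption about how the threshold is applied; only the input values come from the test described above.

```python
def threshold_for(context_length: int, threshold_percent: float) -> int:
    """Token count at which compression is expected to start (hypothetical helper)."""
    return int(context_length * threshold_percent)


def needs_compression(current_tokens: int, threshold_tokens: int) -> bool:
    """Assumed trigger: compress once the session exceeds the threshold."""
    return current_tokens > threshold_tokens


threshold_tokens = threshold_for(524288, 0.75)
current_tokens = threshold_tokens + 5000

print(threshold_tokens)                                     # 393216
print(needs_compression(current_tokens, threshold_tokens))  # True
```

With threshold_percent = 0.75, a 524288-token window gives exactly 393216, matching the value reported by lcm_status().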

Conclusion

Verification passed.

Confirmed:

- LCM context_length is now 524288
- threshold_tokens is now 393216
- Real compression works correctly when the token count exceeds the threshold
- This is a runtime-confirmed result, not only a static config change

alt-glitch · 2026-04-24T23:08:30Z

Overlaps with #12316 — both unify context-length resolution across runtime and /model display. Also related to #12380, #14382, and #8785.

teknium1 · 2026-04-27T04:34:07Z

Thanks for the careful work on this, @leavrcn! The context-length alignment bug you identified was independently fixed on main after this PR was filed.

This is an automated hermes-sweeper review.

Commit 125de0205 (fix(context): honor custom_providers context_length on /model switch + bump probe tier to 256K, #15844) landed on main ~April 25 and addresses the same display/runtime divergence:
- New hermes_cli.config.get_custom_provider_context_length() helper — single source of truth for per-model overrides
- resolve_display_context_length() in hermes_cli/model_switch.py gained a custom_providers= kwarg (line 530)
- AIAgent.switch_model() in run_agent.py re-reads custom_providers from live config on every /model switch (line 2149)
- Two gateway/run.py call sites wired up (lines 5529-5530, 5676-5677)
- Regression tests added: tests/hermes_cli/test_model_switch_context_display.py and tests/hermes_cli/test_custom_provider_context_length.py
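The idea of a single lookup shared by every call site can be sketched as follows. This is not the helper that landed on main: the function name, its signature, and the shape of the custom_providers config (a list of dicts with a "name" and a "models" mapping) are assumptions chosen for illustration.

```python
from typing import Any, Optional


def lookup_custom_context_length(
    custom_providers: Any,
    provider: str,
    model: str,
) -> Optional[int]:
    """Return the per-model override for `provider`, or None (hypothetical sketch)."""
    if not isinstance(custom_providers, list):
        return None
    for entry in custom_providers:
        # Match on the exact provider name so a similarly named entry is not picked up.
        if not isinstance(entry, dict) or entry.get("name") != provider:
            continue
        models = entry.get("models")
        model_cfg = models.get(model) if isinstance(models, dict) else None
        value = model_cfg.get("context_length") if isinstance(model_cfg, dict) else None
        # Accept only positive ints; bool is excluded because it subclasses int.
        if isinstance(value, int) and not isinstance(value, bool) and value > 0:
            return value
    return None


# Assumed config shape, reusing the provider and model named earlier in this thread.
providers = [
    {"name": "yunfeiplus", "models": {"gpt-5.4": {"context_length": 524288}}},
]
print(lookup_custom_context_length(providers, "yunfeiplus", "gpt-5.4"))  # 524288
```

If the display path and the model-switch path both call one function like this with freshly loaded config, a stale or divergent value has nowhere to hide.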

Your test_model_switch_context_alignment.py (384 lines) isn't on main yet — if it covers additional edge cases (negative values, bool context_length, provider-name collisions), maintainers may want to cherry-pick those tests separately.

fix: align model context length across display and runtime

eb8d967

This was referenced Apr 22, 2026

fix: preserve configured context lengths across runtime paths #14008

Open

fix(model): unify context-length resolution across runtime and /model display #12316

Closed

alt-glitch added labels Apr 24, 2026: type/bug (Something isn't working), P2 (Medium: degraded but workaround exists), comp/cli (CLI entry point, hermes_cli/, setup wizard), comp/gateway (Gateway runner, session dispatch, delivery), comp/agent (Core agent loop, run_agent.py, prompt builder)

teknium1 closed this Apr 27, 2026

alt-glitch mentioned this pull request Apr 29, 2026

fix: sync model.context_length on --global model switch #17245

Closed

leavrcn wants to merge 1 commit into NousResearch:main from leavrcn:fix/model-context-alignment
