Context window changes to 256K after interrupted compaction and resume #32423

@yangguangjin

Bug Description

After a Hermes CLI session is interrupted during context compaction and later resumed/continued, the effective context window shown in the status bar can change from the configured 1M tokens to 256K. In the same failure path, compression summary generation can fail with HTTP 502 and Hermes inserts a fallback context marker, which may degrade the resumed session's accuracy.

Steps to Reproduce

  1. Configure a custom OpenAI-compatible provider with a 1M context length:
model:
  default: gpt-5.5
  provider: custom
  base_url: https://redacted.example/openai/v1
  context_length: 1000000

custom_providers:
  - name: private_codex
    base_url: https://redacted.example/openai/v1
    model: gpt-5.5
    models:
      gpt-5.5:
        context_length: 1000000

compression:
  enabled: true
  threshold: 0.8
  target_ratio: 0.2
  2. Start a long Hermes CLI session and let it approach/trigger context compaction.
  3. Have the custom endpoint return HTTP 502 during compression summary generation.
  4. Continue or resume the interrupted session.
  5. Observe the CLI status bar and compaction warnings.
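
For the step that requires the endpoint to return HTTP 502, one way to force the failure locally is a small stub server. The following is a minimal sketch, not part of Hermes: the names `AlwaysBadGateway`, `start_stub`, and `post_status` are hypothetical, and the stub fails every POST rather than only the summary request, so base_url would have to be pointed at it shortly before compaction triggers.

```python
import json
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class AlwaysBadGateway(BaseHTTPRequestHandler):
    """Hypothetical stub: answers every POST with HTTP 502."""

    def do_POST(self):
        # Drain the request body so the client receives a clean response.
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)
        body = json.dumps({"error": {"message": "Bad Gateway"}}).encode()
        self.send_response(502)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the terminal quiet


def start_stub(port=0):
    """Start the stub on 127.0.0.1; port=0 lets the OS pick a free port."""
    server = HTTPServer(("127.0.0.1", port), AlwaysBadGateway)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def post_status(url):
    """POST an empty JSON body and return the HTTP status code."""
    req = urllib.request.Request(
        url,
        data=b"{}",
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # An empty ProxyHandler bypasses any proxy set in the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(req, timeout=5) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code
```

Calling `start_stub()` and setting base_url to `http://127.0.0.1:<port>/openai/v1` (the port is available as `server.server_address[1]`) should make any request Hermes sends there fail with 502.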

Expected Behavior

  • The resumed/continued session should preserve the configured context length (1000000, displayed as roughly 1M).
  • If summary generation fails, Hermes should not silently degrade context by inserting a fallback marker that loses important middle turns, or it should at least make the session state and recovery options explicit.
  • The configured context length should remain stable across interruption, compaction, resume, and continue flows.

Actual Behavior

  • The status bar showed 97.9K/256K even though the active config has model.context_length: 1000000 and the custom provider entry also sets context_length: 1000000.
  • The CLI reported:
compression summary failed: Error code: 502. Inserted a fallback context marker.
Session compressed 3 times — accuracy may degrade. Consider /new to start fresh.
API call failed (attempt 1/3): InternalServerError [HTTP 502]
Provider: custom  Model: gpt-5.5
Endpoint: https://redacted.example/openai/v1
Error: HTTP 502: Error code: 502
...
Max retries (3) exhausted — trying fallback...
API failed after 3 retries — HTTP 502: Error code: 502
Final error: HTTP 502: Error code: 502
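
To illustrate why the shrunken window matters, the sketch below works out where compaction would trigger under each context length. It assumes that compression.threshold is applied as a fraction of the effective context length and that "256K" means 256,000 tokens; neither is confirmed here, and `compaction_trigger` is a hypothetical helper, not a Hermes function.

```python
def compaction_trigger(context_length, threshold):
    """Token count at which compaction would start, assuming the threshold
    is a simple fraction of the context length (an assumption)."""
    return round(context_length * threshold)


configured = compaction_trigger(1_000_000, 0.8)  # configured 1M window
observed = compaction_trigger(256_000, 0.8)      # window shown after resume

print(f"configured window: compaction at {configured:,} tokens")
print(f"observed window:   compaction at {observed:,} tokens")
```

Under these assumptions, a session at 97.9K tokens is about 38% of the way through a 256K window but under 10% of the way through the configured 1M window, so the resumed session would reach compaction far sooner than intended.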

Additional Observations

  • Running hermes config after the incident confirms the config still contains context_length: 1000000.
  • Directly calling Hermes' model metadata resolver with config_context_length=1000000 returns 1000000, so a fresh initialization should resolve to 1M.
  • The 256K value appears to come from the running/resumed agent's ContextCompressor.context_length, not from the current config file.
  • This suggests a resume/continue or compression failure path may restore or retain a stale/default context length instead of the configured value.

Environment

  • OS: Windows 10
  • Shell: Git Bash/MSYS via Hermes terminal backend
  • Hermes profile: default
  • Provider: custom OpenAI-compatible endpoint
  • Model: gpt-5.5
  • Configured context length: 1000000
  • Observed status bar context length after interruption/continue: 256K

Possible Fix Direction

  • Ensure resumed/continued sessions rehydrate ContextCompressor.context_length from the active model/custom provider config, not from stale runtime state or fallback metadata.
  • Consider making compression.abort_on_summary_failure default to safer behavior for transient provider failures, or avoid inserting a fallback marker that drops middle context when the summary request fails.
  • Surface a clearer warning when the runtime context length differs from the configured model.context_length for the active provider/model.
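
The first and third points could be combined roughly as follows. This is a minimal sketch under stated assumptions, not Hermes code: `CompressorStub` is a hypothetical stand-in for ContextCompressor, the config is treated as a plain dict shaped like the YAML in the reproduction steps, and the precedence (per-model custom provider entry first, then model.context_length) is a guess.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger("context_rehydrate_sketch")


@dataclass
class CompressorStub:
    """Hypothetical stand-in for the agent's ContextCompressor."""
    context_length: int


def configured_context_length(config):
    """Return the context length the config asks for, or None if unset."""
    model_cfg = config.get("model", {})
    model_name = model_cfg.get("default")
    # Assumed precedence: a matching per-model entry under custom_providers
    # wins over the top-level model.context_length.
    for provider in config.get("custom_providers", []):
        per_model = provider.get("models", {}).get(model_name, {})
        if "context_length" in per_model:
            return per_model["context_length"]
    return model_cfg.get("context_length")


def rehydrate(compressor, config):
    """Reset compressor.context_length from config; return True if it changed."""
    wanted = configured_context_length(config)
    if wanted is None or wanted == compressor.context_length:
        return False
    logger.warning(
        "runtime context length %s differs from configured %s; "
        "using the configured value",
        compressor.context_length,
        wanted,
    )
    compressor.context_length = wanted
    return True
```

Calling something like `rehydrate` at the start of a resume or continue flow would both correct the stale value and leave a visible trace that a mismatch occurred.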

Metadata

    Labels

    • P2 (Medium: degraded but workaround exists)
    • area/config (Config system, migrations, profiles)
    • comp/agent (Core agent loop, run_agent.py, prompt builder)
    • type/bug (Something isn't working)
