fix(state): resume compression continuation tip #15742

Open
stephenschoettler wants to merge 2 commits into NousResearch:main from stephenschoettler:fix/resume-compression-tip

@stephenschoettler stephenschoettler commented Apr 25, 2026

Contributor

What does this PR do?

Fixes resume behavior for compressed sessions by redirecting resume targets to the live compression continuation tip.

After context compression, post-compression turns live in a child continuation session, but users may still resume the original root session id they saw before compression. SessionDB.resolve_resume_session_id() previously returned that root unchanged whenever the root already had pre-compression message rows, so the resumed context could omit post-compression turns.

This PR makes the helper follow the compression chain tip first, while preserving the existing fallback for empty placeholder sessions that have child messages.
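The change in resolution order can be illustrated with a minimal in-memory sketch. The `ToySessionDB` class, its `sessions` dict, the `parent`/`messages` fields, and the `resolve_old` name are hypothetical stand-ins, not the real `SessionDB` schema in hermes_state.py; only the method names `get_compression_tip()` and `resolve_resume_session_id()` come from this PR. The sketch assumes every child link is a compression continuation and leaves out the empty-placeholder fallback.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToySession:
    """Hypothetical session row: a parent link plus message rows."""
    parent: Optional[str] = None
    messages: list = field(default_factory=list)


class ToySessionDB:
    """Illustrative stand-in for SessionDB; not the real schema."""

    def __init__(self):
        self.sessions = {}

    def add(self, sid, parent=None, messages=()):
        self.sessions[sid] = ToySession(parent, list(messages))

    def _child_of(self, sid):
        # Assumption: each session has at most one compression continuation.
        for cid, session in self.sessions.items():
            if session.parent == sid:
                return cid
        return None

    def get_compression_tip(self, sid, max_depth=100):
        # Walk child links to the end of the chain, bounded by max_depth hops.
        current = sid
        for _ in range(max_depth):
            child = self._child_of(current)
            if child is None:
                break
            current = child
        return current

    def resolve_old(self, sid):
        # Pre-PR behavior as described: a session that already has
        # message rows is returned unchanged, even if it was compressed.
        if self.sessions[sid].messages:
            return sid
        # Simplification: the real empty-placeholder branch is not modeled.
        return self.get_compression_tip(sid)

    def resolve_resume_session_id(self, sid):
        # Behavior described by this PR: follow the compression chain tip first.
        return self.get_compression_tip(sid)


db = ToySessionDB()
db.add("root1", messages=["pre-compression turn"])
db.add("tip1", parent="root1", messages=["post-compression turn"])
db.add("solo", messages=["only turn"])

print(db.resolve_old("root1"))
print(db.resolve_resume_session_id("root1"))
print(db.resolve_resume_session_id("solo"))
```

An uncompressed session such as `solo` has no continuation, so following the tip leaves it unchanged; only compressed roots resolve differently.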

Related Issue

Related: #10373

Related PRs:

Type of Change

  • 🐛 Bug fix (non-breaking change that fixes an issue)
  • ✨ New feature (non-breaking change that adds functionality)
  • 🔒 Security fix
  • 📝 Documentation update
  • ✅ Tests (adding or improving test coverage)
  • ♻️ Refactor (no behavior change)
  • 🎯 New skill (bundled or hub)

Changes Made

  • hermes_state.py
    • Update SessionDB.resolve_resume_session_id() to call get_compression_tip() before returning a session with existing message rows.
    • Preserve the legacy fallback for empty placeholder sessions by walking descendants until the first child with messages is found.
    • Keep the malformed-chain depth guard behavior.
  • tests/test_hermes_state.py
    • Add regression coverage for compressed roots that already have pre-compression messages.
    • Add coverage that uncompressed sessions with messages still resolve to themselves.
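The three hermes_state.py behaviors listed above can be sketched as a single function over plain dicts. Everything here is an assumption made for illustration: the `children` and `message_counts` mappings, the `MAX_DEPTH` value, and the choice to return the input id when the guard trips are hypothetical and not taken from the real implementation, which may also distinguish compression children from other child sessions.

```python
MAX_DEPTH = 100  # hypothetical bound; the real guard value is not stated in this PR


def compression_tip(sid, children):
    """Follow child links to the end of the chain, stopping after MAX_DEPTH hops."""
    current = sid
    for _ in range(MAX_DEPTH):
        nxt = children.get(current)
        if nxt is None:
            return current
        current = nxt
    # Guard tripped (for example a cycle): assumed fallback for this sketch.
    return current


def resolve_resume_session_id(sid, children, message_counts):
    # 1. A session with message rows resolves to its chain tip
    #    (itself when there is no continuation).
    if message_counts.get(sid, 0) > 0:
        return compression_tip(sid, children)
    # 2. Legacy fallback: for an empty placeholder, walk descendants
    #    until the first one with messages is found.
    current = sid
    for _ in range(MAX_DEPTH):
        current = children.get(current)
        if current is None:
            break
        if message_counts.get(current, 0) > 0:
            return current
    # 3. No descendant with messages, or the depth guard tripped:
    #    return the input id unchanged (assumed behavior).
    return sid


# Hypothetical fixture: a compressed root, an uncompressed session,
# an empty placeholder chain, and a malformed two-node cycle.
children = {"root1": "tip1", "empty": "mid", "mid": "leaf", "a": "b", "b": "a"}
message_counts = {"root1": 3, "tip1": 2, "solo": 1, "leaf": 4}

for session_id in ("root1", "solo", "empty", "a"):
    print(session_id, "->", resolve_resume_session_id(session_id, children, message_counts))
```

Bounding both walks with the same hop limit keeps a malformed chain from looping forever, at the cost of an arbitrary cutoff for very long legitimate chains.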

How to Test

  1. Run the focused regression suite:

    python -m pytest \
      tests/test_hermes_state.py::TestCompressionChainProjection::test_resolve_resume_session_id_returns_compression_tip_even_when_parent_has_messages \
      tests/test_hermes_state.py::TestCompressionChainProjection::test_resolve_resume_session_id_returns_self_for_uncompressed_session_with_messages \
      tests/test_hermes_state.py::TestCompressionChainProjection::test_get_compression_tip_walks_full_chain \
      tests/test_hermes_state.py::TestCompressionChainProjection::test_list_surfaces_tip_for_compressed_root \
      -q -o addopts=''
  2. Run the full hermes_state test module:

    python -m pytest tests/test_hermes_state.py -q -o addopts=''
  3. Run syntax and whitespace checks:

    python -m compileall -q hermes_state.py tests/test_hermes_state.py
    git diff --check

Validation Status

  • Focused regression suite from fresh recheck: 4 passed.
  • python -m pytest tests/test_hermes_state.py -q -o addopts='': passed locally in the original PR validation.
  • python -m compileall -q hermes_state.py tests/test_hermes_state.py: passed locally in the original PR validation.
  • git diff --check: passed locally in the original PR validation.
  • GitHub checks are not currently all green.
  • Failed checks: test, e2e.
  • Current merge state: mergeable but unstable.
  • Full-suite checklist is not marked complete because this branch needs rebasing/refreshing after the base-CI stabilizer lands.

Checklist

Code

Documentation and Housekeeping

  • I've updated relevant documentation (README, docs/, docstrings) or N/A
  • I've updated cli-config.yaml.example if I added/changed config keys or N/A
  • I've updated CONTRIBUTING.md or AGENTS.md if I changed architecture or workflows or N/A
  • I've considered cross-platform impact (Windows, macOS) per the compatibility guide or N/A
  • I've updated tool descriptions/schemas if I changed tool behavior or N/A

For New Skills

N/A. This PR does not add a skill.

Screenshots / Logs

Fresh helper-level repro from a previous PR comment against origin/main:

get_compression_tip(root1)= tip1
resolve_resume_session_id(root1)= root1

Expected behavior after this PR:

resolve_resume_session_id(root1)= tip1

Fresh focused validation from the previous PR comment:

4 passed, 4 warnings in 0.23s

@stephenschoettler

Contributor Author

CI note: the red Tests job appears to be pre-existing main instability, not from this two-file session-resume change.

Local validation passed:

  • python -m pytest tests/test_hermes_state.py -q -o addopts=''
  • python -m compileall -q hermes_state.py tests/test_hermes_state.py
  • git diff --check

Recent main Tests runs are also failing before this PR, including main run 24936215121 for 648b899. The PR's Nix, lockfile, attribution, and supply-chain checks passed.

alt-glitch added the labels type/bug (Something isn't working), P2 (Medium: degraded but workaround exists), and comp/agent (Core agent loop, run_agent.py, prompt builder) on Apr 25, 2026
@alt-glitch

Collaborator

Competes with open PR #13374 — both fix resolve_resume_session_id() to follow compression chain tips. Fixes #10373.

@stephenschoettler

Contributor Author

Thanks, good catch. I should have checked open overlapping PRs before opening this. That was an agent workflow miss, not intentional duplication.

I looked at #13374 now. My read is that #13374 fixes the TUI/gateway session.resume path by projecting the selected target through get_compression_tip(). This PR is narrower at the state helper layer: resolve_resume_session_id() currently returns the root unchanged when the root already has pre-compression message rows, which can still miss post-compression continuation turns for callers that rely on that helper.

If maintainers prefer consolidating this into #13374, I am happy to close/rebase. Otherwise I think the regression in this PR is still relevant because it covers the "compressed root already has messages" edge case in SessionDB.resolve_resume_session_id() directly.

@stephenschoettler

Contributor Author

Fresh recheck after #10373 was closed and #16306 merged: I do not think this PR should be closed as a duplicate yet.

Current main still has the helper-level edge case this PR targets. Repro against origin/main:

get_compression_tip(root1)= tip1
resolve_resume_session_id(root1)= root1

That means SessionDB.get_compression_tip() can see the live continuation tip, but SessionDB.resolve_resume_session_id() still returns the compressed root unchanged when the root already has pre-compression message rows.

Why I think this PR is still relevant:

Fresh validation on this PR branch:

python -m pytest \
  tests/test_hermes_state.py::TestCompressionChainProjection::test_resolve_resume_session_id_returns_compression_tip_even_when_parent_has_messages \
  tests/test_hermes_state.py::TestCompressionChainProjection::test_resolve_resume_session_id_returns_self_for_uncompressed_session_with_messages \
  tests/test_hermes_state.py::TestCompressionChainProjection::test_get_compression_tip_walks_full_chain \
  tests/test_hermes_state.py::TestCompressionChainProjection::test_list_surfaces_tip_for_compressed_root \
  -q -o addopts=''

Result:

4 passed, 4 warnings in 0.23s

If maintainers prefer folding this helper fix into #13374, that is fine. Otherwise I think this PR should stay open because it covers the helper-level regression directly.

stephenschoettler changed the title from "fix: resume compression continuation tip" to "fix(state): resume compression continuation tip" on Apr 30, 2026
@stephenschoettler

Contributor Author

Rechecked this against current origin/main: the helper-level resume edge still reproduces.

Local repro on current main:

get_compression_tip(root1)= tip1
resolve_resume_session_id(root1)= root1

I refreshed this branch by merging current origin/main. Targeted validation passed:

  • scripts/run_tests.sh tests/test_hermes_state.py -- -q --tb=short (218 passed)
  • python -m compileall -q hermes_state.py tests/test_hermes_state.py

This still looks distinct from the TUI/gateway tip PRs (#13374, #26631): those update gateway/TUI resume projection, while this fixes SessionDB.resolve_resume_session_id() itself. Leaving it open, but now current enough to evaluate on its own merits.
