fix(context): live mid-turn context bar + visible auto-compression status#126
Merged
Conversation
… desktop statusbar The context bar froze during long tool batches: tool.start/complete events carried context_compressor.last_prompt_tokens, a snapshot from the previous API call that never moves while tool results pile up mid-turn. - agent/context_compressor.py: track last_prompt_messages_len alongside last_prompt_tokens; new live_context_tokens(messages) prices the tail appended since the snapshot on top of the provider-exact base, falls back to max(base, rough) and prefers the rough estimate right after compression - agent/conversation_loop.py: record the snapshot at every last_prompt_tokens write (real usage + both preflight seeds); emit a 'Compression complete' lifecycle status after preflight compaction - agent/tool_executor.py: extract the four copy-pasted usage blocks into _context_usage_for_tool_events() which uses the live estimate - apps/desktop: handle status.update gateway events (previously dropped on the floor) in a new $sessionActivityStatus atom, cleared by the next stream activity; statusbar shows the transient status (auto-compression progress finally visible mid-session); GatewayEventPayload gains 'kind' - cli.py: stale standalone-CLI compression threshold default 0.50 -> 0.85, matching hermes_cli/config.py - tests: 9 new Python tests (live estimate math, snapshot bookkeeping, tool-event payloads), 3 new desktop tests (status.update lifecycle) Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
🔎 Lint report:
|
| Rule | Count |
|---|---|
unresolved-import |
1 |
First entries
tests/agent/test_live_context_estimate.py:11: [unresolved-import] unresolved-import: Cannot resolve imported module `pytest`
✅ Fixed issues: none
Unchanged: 5536 pre-existing issues carried over.
Diagnostics are surfaced as warnings — this check never fails the build.
…s.io in AUTHOR_MAP Commit 4351acb raised the default compression threshold to 0.85 but left the two constructor-default tests asserting 0.50, red on main since June 8. Update them to the live default and add a dedicated small-context case so the 64K floor stays exercised. Also map omar@kostudios.io -> OmarB97 so check-attribution passes on direct (non-merge) commits. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
OmarB97
pushed a commit
that referenced
this pull request
Jun 10, 2026
… fork consolidation; finish fork-feature ports Per-cluster restoration with the test suite as the oracle, after comparing the merged tree's failures against a pristine-upstream run in the same environment (14 file-level deltas, now zero): - gateway/run.py: upstream wholesale (fork's monolith had undone the mixin decomposition; both real fork deltas re-applied — voice_ack_callback **kwargs; the custom-providers context-length fix exists upstream). - agent/conversation_loop.py + turn_context.py: upstream structure with the fork features regrafted at their new homes — sender_device attribution (#131), preflight token-usage emission + compression-complete status and live-estimate snapshots (#126). - agent/chat_completion_helpers.py: upstream wholesale (brings the second partial-stream-stub routing site and the NousResearch#6600 cancellation fix). - agent/tool_executor.py: usage= kwarg on tool start/complete callbacks now falls back to the bare 3-arg form for legacy receivers. - tools/approval.py: upstream's resolved-HERMES_HOME rewrite + normalize steps restored alongside the fork's self-host kill guard (#128). - hermes_cli/main.py: desktop install-identity stale-build cluster and the post-subcommand global-flag hoister ported from fork main. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Why
Two halves of the "live compression / streaming updates mid-session" bug. (1) The desktop context bar froze during long tool batches: tool.start/complete events carried
context_compressor.last_prompt_tokens, a snapshot from the previous API call that never moves while tool results accumulate, so the bar only jumped at the next API call. (2) Auto-compression was invisible in the desktop app: the agent and gateway already emitstatus.updatelifecycle events for preflight compaction, but the desktop event handler dropped them on the floor — sessions just looked hung for the duration.What changed
agent/context_compressor.py: tracklast_prompt_messages_lenalongsidelast_prompt_tokens; newlive_context_tokens(messages)returns the provider-exact base plus a rough estimate of only the messages appended since the snapshot, withmax(base, rough)fallback and a post-compression guard that prefers the fresh rough estimate whileawaiting_real_usage_after_compression.agent/conversation_loop.py: record the messages-length snapshot at everylast_prompt_tokenswrite site (real-usage update + both preflight seeds); emit a "📦 Compression complete: ~X → ~Y tokens." lifecycle status after the preflight compaction loop.agent/tool_executor.py: the four copy-pasted usage-dict blocks collapse into one_context_usage_for_tool_events()helper that uses the live estimate (net −60 lines).apps/desktop: new$sessionActivityStatusatom +status.updatehandling inuse-message-stream.ts(kinds lifecycle/compressing/process/status;readyand any stream activity clear it); statusbar renders the transient status with a spinner;GatewayEventPayloadgainskind.cli.py: stale standalone-CLI compression threshold default 0.50 → 0.85, matchinghermes_cli/config.py:1125.How to review
agent/context_compressor.py—live_context_tokens(the core math) and theupdate_from_responsesignature change.agent/tool_executor.py— confirm the helper is behavior-identical to the four removed blocks apart from the live estimate.agent/conversation_loop.py— the three snapshot writes and the compression-complete emit.store/session.ts→use-message-stream.ts→use-statusbar-items.tsxin that order; the new tests inuse-message-stream.test.tsxdocument the lifecycle.Evidence
tests/agent/test_live_context_estimate.py::TestLiveContextTokens::test_prices_tail_appended_since_snapshotandtest_grows_monotonically_as_results_appendfail on main's frozen-snapshot behavior by construction.use-message-stream.test.tsx"status.update events" suite (3 tests) covers surface/ignore/clear.Verification
tests/agent/test_live_context_estimate.py— 9 passed.tests/agent/test_turn_context.py test_compression_logging_session_context.py test_context_engine.py— 26 passed;tests/tui_gateway/test_protocol.py— 59 passed;tests/agent/test_tool_dispatch_helpers.py test_tool_guardrails.py test_display_tool_failure.py— 64 passed.vitestonuse-message-stream.test.tsx— 5 passed (3 new). Full desktop suite: 444 passed / 12 failed — the 12 failures and 22tsc -berrors are byte-identical on cleanorigin/main(verified via stash baseline; pre-existingyolo-sessionstore exports missing, tracked separately).tests/agent/test_context_compressor.pyhangs in this sandbox on a network-bound OpenRouter live-fetch test — reproduced identically on cleanorigin/mainwith changes stashed, environmental not regression.Risks / gaps
update_from_responsecorrects it at the next API call.$desktopYoloDefaultstore exports → 22 tsc errors, 12 vitest failures) is untouched here; task filed/linked in the mesh backlog (hermes-desktop-yolo-store-exports).codex_runtime.pycallsupdate_from_responsewithoutmessages_lenand keeps fallback behavior — accepted scope, the codex app-server runtime tracks context via its own turn metadata.Collaborators