Skip to content

fix(compaction): restore session invariants across compaction and reset#69270

Open
de1tydev wants to merge 16 commits intoopenclaw:mainfrom
de1tydev:fix/hook-message-provider-context
Open

fix(compaction): restore session invariants across compaction and reset#69270
de1tydev wants to merge 16 commits intoopenclaw:mainfrom
de1tydev:fix/hook-message-provider-context

Conversation

@de1tydev
Copy link
Copy Markdown

@de1tydev de1tydev commented Apr 20, 2026

Summary

Restore session invariants across compaction and reset instead of patching each symptom in isolation.

This coordinated fix now covers four connected paths:

  • subscribe-time before_compaction / after_compaction hook context
  • engine-owned compaction hook context in compact.queued.ts
  • command /new / /reset hook context in emitResetCommandHooks
  • gateway sessions.reset hook context in emitGatewayBeforeResetPluginHook

It also fixes compaction bookkeeping so zero-token outcomes reset cached usage instead of leaving the session in a repeated safeguard loop.

What changed

  • introduced a shared hook-message-provider resolver so compaction/reset hooks all use the same rules
  • normalized explicit providers before emitting hook context, so values like Telegram become telegram
  • inferred providers from session keys only for real deliverable channel-scoped keys
  • kept non-channel keys such as agent:main:direct:peer_123 unresolved instead of emitting fake providers
  • reset totalTokens / totalTokensFresh / estimatedCostUsd from compaction results even when tokensAfter === 0
  • extended regression coverage for mixed-case reset channels, normalized compaction hook providers, zero-token compactions, and gateway/manual compaction bookkeeping

Why

These reports all describe the same broader failure mode: compaction/reset code paths were not preserving the session invariants downstream systems rely on.

That surfaced as different user-visible bugs:

  • hook consumers could see missing or non-canonical messageProvider values and split one real conversation into multiple logical sessions
  • compaction could clear the transcript but leave stale token/cost accounting behind, causing repeated safeguard loops

Issues

Fixes #69269
Fixes #69286

Also addresses the duplicate token-accounting report in #69287.
Related cluster: #69202.

Tests

  • pnpm build
  • pnpm check
  • pnpm exec vitest run src/auto-reply/reply/commands-core.test.ts src/agents/pi-embedded-subscribe.handlers.compaction.test.ts src/auto-reply/reply/reply-state.test.ts src/gateway/server.sessions.gateway-server-sessions-a.test.ts

Real behavior proof

  • Behavior or issue addressed: reset/compaction hook provider context now resolves real channel-scoped Feishu session keys, does not invent a provider for non-channel direct keys, and normalizes explicit mixed-case providers.
  • Real environment tested: local OpenClaw source checkout at commit c59b756 on Debian 12, Node v25.9.0, using the patched production resolver module via node --import tsx.
  • Exact steps or command run after this patch:
node --import tsx <<'EOF'
import { resolveHookMessageProvider } from './src/utils/hook-message-provider.ts';
const cases = [
  { label: 'channel scoped Feishu reset/compaction session', sessionKey: 'agent:main:feishu:default:direct:ou_testuser' },
  { label: 'legacy non-channel direct session', sessionKey: 'agent:main:direct:peer_123' },
  { label: 'explicit mixed-case provider from hook caller', sessionKey: 'agent:main:main', provider: 'Telegram' },
];
for (const item of cases) {
  console.log(`${item.label}: ${JSON.stringify({ sessionKey: item.sessionKey, provider: item.provider, resolved: resolveHookMessageProvider(item) })}`);
}
EOF
  • Evidence after fix: terminal output from the patched checkout:
channel scoped Feishu reset/compaction session: {"sessionKey":"agent:main:feishu:default:direct:ou_testuser","resolved":"feishu"}
legacy non-channel direct session: {"sessionKey":"agent:main:direct:peer_123"}
explicit mixed-case provider from hook caller: {"sessionKey":"agent:main:main","provider":"Telegram","resolved":"telegram"}
  • Observed result after fix: live output shows channel-scoped reset/compaction keys carry messageProvider: "feishu", legacy non-channel direct keys leave the provider unset, and explicit Telegram is canonicalized to telegram before hook context emission.
  • What was not tested: no browser/mobile UI recording; targeted regression tests cover the gateway reset and compaction RPC paths.

@openclaw-barnacle openclaw-barnacle Bot added gateway Gateway runtime agents Agent runtime and tooling size: S labels Apr 20, 2026
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Apr 20, 2026

Greptile Summary

This PR propagates messageProvider into before_compaction, after_compaction, and before_reset hook contexts across three code paths: the embedded-subscribe compaction handler, the command reset flow, and the gateway sessions.reset path. For the gateway path it adds inferMessageProviderFromSessionKey, which parses canonical session keys and extracts the channel segment.

The fix is correct for the targeted channel-scoped session keys (agent:{id}:feishu:…, agent:{id}:discord:…, etc.). One edge case: per-peer DM keys of the form agent:{id}:direct:{peerId} would yield \"direct\" (a chat-type label, not a channel name) from inferMessageProviderFromSessionKey because \"direct\" is not in the sentinel list — consider adding it alongside \"main\", \"cron\", \"subagent\", and \"acp\".

Confidence Score: 5/5

Safe to merge; the edge case with "direct" keys is P2 and unlikely to affect real deployments materially.

All findings are P2. The core fix is correct and well-tested. The inferMessageProviderFromSessionKey gap is a quality improvement rather than a blocking defect.

src/gateway/session-reset-service.ts — the inferMessageProviderFromSessionKey sentinel list.

Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/gateway/session-reset-service.ts
Line: 53-63

Comment:
**`inferMessageProviderFromSessionKey` misses non-channel `rest` prefixes**

Session keys built with `dmScope === "per-peer"` use the format `agent:{agentId}:direct:{peerId}`, so `rest.split(":")[0]` yields `"direct"` — a chat-type segment, not a channel name. Since `"direct"` is not in the sentinel list, `normalizeMessageChannel("direct")` is returned as `messageProvider`, giving downstream plugins an incorrect, non-channel value. Similarly, dashboard-scoped keys (`agent:ops:dashboard:…`) would surface `"dashboard"`. For those key shapes the function should return `undefined` to stay consistent with the "unknown provider" fallback the PR is trying to eliminate.

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "fix(hooks): preserve messageProvider in ..." | Re-trigger Greptile

Comment thread src/gateway/session-reset-service.ts Outdated
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2f5b0e0e52

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/gateway/session-reset-service.ts Outdated
@de1tydev
Copy link
Copy Markdown
Author

Addressed the follow-up issues in a6e53dfd81.

  • threaded messageProvider through the subscribe-time compaction hook path instead of only reading it from SubscribeEmbeddedPiSessionParams without plumbing
  • tightened gateway inference so only deliverable channels are returned, leaving keys like agent:main:direct:peer_123 without an inferred provider
  • added a regression covering the non-channel direct session-key case

Local verification:

  • pnpm build
  • pnpm check
  • node node_modules/vitest/vitest.mjs run src/agents/pi-embedded-subscribe.handlers.compaction.test.ts src/auto-reply/reply/commands-core.test.ts src/gateway/server.sessions.gateway-server-sessions-a.test.ts

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a6e53dfd81

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/auto-reply/reply/commands-reset-hooks.ts Outdated
@de1tydev de1tydev changed the title fix(hooks): preserve messageProvider in compaction and reset contexts fix(compaction): restore session invariants across compaction and reset Apr 20, 2026
@de1tydev
Copy link
Copy Markdown
Author

Expanded this into a coordinated compaction/reset invariant fix, per the review suggestion.

Included in this update:

  • shared hook messageProvider resolution across subscribe compaction, engine-owned compaction, command reset, and gateway sessions.reset
  • normalization of explicit providers before emitting hook context
  • strict inference only for real deliverable channel-scoped session keys
  • zero-token compaction results now reset cached token/cost accounting instead of leaving safeguard-loop state behind

I also updated the issue/PR title and description to reflect the broader scope.

Local verification:

  • pnpm build
  • pnpm check
  • pnpm exec vitest run src/auto-reply/reply/commands-core.test.ts src/agents/pi-embedded-subscribe.handlers.compaction.test.ts src/auto-reply/reply/reply-state.test.ts src/gateway/server.sessions.gateway-server-sessions-a.test.ts

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2171c3b21f

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/utils/hook-message-provider.ts Outdated
Comment thread src/utils/hook-message-provider.ts Outdated
@de1tydev de1tydev requested a review from a team as a code owner April 21, 2026 05:20
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ce4db4ebec

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/gateway/server-methods/sessions.ts
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 046100e827

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/agents/pi-embedded-runner/run/attempt.ts Outdated
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 53dd9aec43

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/utils/hook-message-provider.ts Outdated
@clawsweeper
Copy link
Copy Markdown
Contributor

clawsweeper Bot commented Apr 28, 2026

Codex review: needs real behavior proof before merge.

Summary
The branch adds shared hook message-provider resolution, threads provider context through reset and compaction hooks, handles zero-token compaction accounting/cache cleanup, passes doctor env context, and adds tests plus a changelog entry.

Reproducibility: yes. by source inspection. Current main omits messageProvider on reset and subscribe-time compaction hook contexts, and its positive-only token guards skip zero-token compaction results.

Real behavior proof
Needs stronger real behavior proof before merge: The PR includes copied terminal output from the patched resolver, but it does not show the changed reset/compaction hook paths or zero-token accounting behavior running after the fix.

Next step before merge
Human merge gate only: the branch has no narrow automation repair blocker, but contributor proof is still too helper-level for the changed reset/compaction/accounting behavior.

Security
Cleared: Cleared: the diff adds no dependencies, workflow/package execution changes, downloads, lockfile changes, or secret-output path; the doctor env context stays within existing health contribution plumbing.

Review details

Best possible solution:

Land this PR or an equivalent focused fix after real path-level proof and maintainer validation so every reset/compaction hook path uses stable provider context and zero-token compactions clear stale accounting.

Do we have a high-confidence way to reproduce the issue?

Yes, by source inspection. Current main omits messageProvider on reset and subscribe-time compaction hook contexts, and its positive-only token guards skip zero-token compaction results.

Is this the best way to solve the issue?

Yes, the shared resolver plus nonnegative token accounting is the narrow maintainable fix shape for this bug cluster. The remaining blocker is proof quality, not a concrete code defect found in this review.

What I checked:

  • Current main reset hook context lacks provider context: emitResetCommandHooks sends agentId, sessionKey, sessionId, and workspaceDir to before_reset without messageProvider. (src/auto-reply/reply/commands-reset-hooks.ts:151, fb2056750004)
  • Current main gateway reset path lacks provider inference: emitGatewayBeforeResetPluginHook builds before_reset context without messageProvider, so channel-scoped session keys cannot identify the provider on this path. (src/gateway/session-reset-service.ts:468, fb2056750004)
  • Current main subscribe compaction hooks only pass sessionKey: handleCompactionStart and handleCompactionEnd call compaction hooks with sessionKey only, omitting messageProvider from subscribe-time hook contexts. (src/agents/pi-embedded-subscribe.handlers.compaction.ts:32, fb2056750004)
  • Current main ignores zero-token compaction results: resolvePositiveTokenCount and noteCompactionTokensAfter require values greater than zero, so a valid tokensAfter/compactionTokensAfter value of 0 is skipped instead of refreshing cached usage to zero. (src/auto-reply/reply/session-run-accounting.ts:21, fb2056750004)
  • PR diff centralizes provider resolution: The PR adds src/utils/hook-message-provider.ts and wires it into command reset, gateway reset, subscribe-time compaction, direct compaction, and queued compaction paths, with tests for channel-scoped, direct-key, mixed-case, internal, and synthetic providers. (src/utils/hook-message-provider.ts:1, c59b75610531)
  • Prior review follow-ups were addressed in branch discussion: The PR discussion shows review comments for non-channel direct keys, mixed-case normalization, internal/synthetic providers, cache field cleanup, and channel fallback; follow-up comments from the contributor and vincentkoc state those repairs were pushed and validated. (c59b75610531)

Likely related people:

  • steipete: Current blame on the affected reset, gateway reset, subscribe compaction, and token-accounting lines points to Peter Steinberger, and prior ClawSweeper context links repeated main-branch work in these same files. (role: recent maintainer / compaction and gateway owner; confidence: medium; commits: 7f27c42ebdb3, 59fb9e5ca7fe, 3ee5490c6001; files: src/auto-reply/reply/commands-reset-hooks.ts, src/gateway/session-reset-service.ts, src/agents/pi-embedded-subscribe.handlers.compaction.ts)
  • vincentkoc: vincentkoc pushed the ProjectClownfish repair commits to this PR branch, provided validation context, and prior ClawSweeper context links recent gateway/refactor history near the affected reset surface. (role: maintainer repair contributor / adjacent gateway owner; confidence: medium; commits: bfd0c001fc18, fed80eb309a2, 1497425b8dbb; files: src/gateway/server-methods/sessions.ts, src/gateway/session-reset-service.ts, src/utils/hook-message-provider.ts)

Remaining risk / open question:

  • The submitted runtime proof only covers the resolver helper, not actual reset/compaction hook emission or zero-token accounting/cache cleanup.
  • The diff spans several gateway, command, and embedded-runner paths, so exact-head CI/Testbox validation should remain a merge gate.

Codex review notes: model gpt-5.5, reasoning high; reviewed against fb2056750004.

@vincentkoc vincentkoc force-pushed the fix/hook-message-provider-context branch from 9619013 to 054a6a8 Compare April 28, 2026 07:27
@vincentkoc
Copy link
Copy Markdown
Member

ProjectClownfish pushed a narrow repair to this branch so the original contributor path can stay canonical.

Source PR: #69270
Validation: pnpm exec vitest run src/utils/hook-message-provider.test.ts src/agents/pi-embedded-subscribe.handlers.compaction.test.ts src/auto-reply/reply/commands-core.test.ts src/auto-reply/reply/reply-state.test.ts src/gateway/server.sessions.gateway-server-sessions-a.test.ts; pnpm check:changed
Contributor credit is preserved in the branch history and PR context.

@openclaw-barnacle openclaw-barnacle Bot removed the channel: slack Channel integration: slack label Apr 28, 2026
@vincentkoc vincentkoc force-pushed the fix/hook-message-provider-context branch from 054a6a8 to fed80eb Compare April 28, 2026 18:38
@vincentkoc
Copy link
Copy Markdown
Member

ProjectClownfish pushed a narrow repair to this branch so the original contributor path can stay canonical.

Source PR: #69270
Validation: pnpm exec vitest run src/utils/hook-message-provider.test.ts src/agents/pi-embedded-subscribe.handlers.compaction.test.ts src/auto-reply/reply/commands-core.test.ts src/auto-reply/reply/reply-state.test.ts src/gateway/server.sessions.gateway-server-sessions-a.test.ts; pnpm check:changed
Contributor credit is preserved in the branch history and PR context.

@vincentkoc vincentkoc added clawsweeper Tracked by ClawSweeper automation and removed clownfish:human-review labels Apr 28, 2026
…ovider-context

# Conflicts:
#	CHANGELOG.md
#	src/agents/pi-embedded-subscribe.handlers.compaction.test.ts
#	src/auto-reply/reply/commands-reset-hooks.ts
#	src/gateway/server.sessions.gateway-server-sessions-a.test.ts
@openclaw-barnacle openclaw-barnacle Bot added triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. and removed triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. labels May 5, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Agent runtime and tooling clawsweeper Tracked by ClawSweeper automation gateway Gateway runtime size: L

Projects

None yet

2 participants