fix(compaction): bypass idle skip at token threshold#73092
fix(compaction): bypass idle skip at token threshold#73092pfrederiksen wants to merge 6 commits into
Conversation
Greptile SummaryThis PR introduces Confidence Score: 5/5Safe to merge — no correctness or security issues found. Changes are narrow and well-tested. The threshold arithmetic is correct (boundary-inclusive, verified by the added unit tests). The pre-fetch refactor in commands-compact.ts is logically equivalent for the context-summary path and adds the new capability of forwarding the token count into the compaction engine. No P0/P1 findings. No files require special attention. Reviews (1): Last reviewed commit: "fix(compaction): bypass idle skip at tok..." | Re-trigger Greptile |
|
Codex review: needs maintainer review before merge. Summary Reproducibility: yes. Source inspection shows current main short-circuits on the outer no-real-conversation guard before token pressure can matter, and the linked report provides the matching 236k/200k manual Real behavior proof Next step before merge Security Review detailsBest possible solution: Land this PR or an equivalent narrow fix after current-head CI finishes, then close #73056 as implemented. Do we have a high-confidence way to reproduce the issue? Yes. Source inspection shows current main short-circuits on the outer no-real-conversation guard before token pressure can matter, and the linked report provides the matching 236k/200k manual Is this the best way to solve the issue? Yes. The threshold helper plus manual caller-budget forwarding is a narrow fix that preserves the existing classifier and ordinary queued reserve semantics; I did not find a smaller safer alternative. What I checked:
Likely related people:
Remaining risk / open question:
Codex review notes: model gpt-5.5, reasoning high; reviewed against d350ac3febc2. |
|
Addressed the ClawSweeper feedback in What changed:
Validation run locally:
|
|
@clawsweeper re-review |
|
🦞🧹 I asked ClawSweeper to review this item again. Re-review progress:
|
45618cc to
62819fa
Compare
|
@clawsweeper re-review |
|
🦞🧹 I asked ClawSweeper to review this item again. Re-review progress:
|
|
@clawsweeper re-review Updated the PR body with a required |
62819fa to
720ad06
Compare
|
🦞🧹 I asked ClawSweeper to review this item again. Re-review progress:
|
720ad06 to
a170370
Compare
|
@clawsweeper re-review Addressed the P2 by passing the resolved active caller context budget into compaction instead of stale session metadata, added the stale-budget assertion, and updated the PR body with current-head compaction plus gateway/pairing proof. |
|
🦞🧹 I asked ClawSweeper to review this item again. Re-review progress:
|
|
@clawsweeper re-review Addressed the new P2 by preserving reserve semantics for ordinary queued compactions: only a distinct explicit callerContextTokenBudget is now treated as caller-supplied. Re-ran the focused compaction shards, typecheck, and changelog attribution. |
|
🦞🧹 I asked ClawSweeper to review this item again. Re-review progress:
|
f495948 to
417d7df
Compare
|
@clawsweeper re-review Rebased onto fresh upstream main and added the small run-node declaration fix needed for current-main CI. Please re-check latest head. |
|
🦞🧹 I asked ClawSweeper to review this item again. Re-review progress:
|
|
@clawsweeper re-review Rebased/current-main CI follow-up: fixed the run-node declaration/test break and pushed latest head. Please re-check. |
|
🦞🧹 I asked ClawSweeper to review this item again. Re-review progress:
|
47cac0f to
9d5b3bf
Compare
|
@clawsweeper re-review\n\nRebased onto latest upstream main at head ; addressed both compaction P2s: manual now passes the resolved active caller budget, and ordinary queued compactions preserve reserve semantics by only treating the distinct explicit caller budget as caller-supplied. PR body has updated exact-head proof. Please re-check latest head. |
|
@clawsweeper re-review Correction to the previous comment formatting: rebased onto latest upstream main at head 9d5b3bf. Addressed both compaction P2s: manual compact now passes the resolved active caller budget, and ordinary queued compactions preserve reserve semantics by only treating the distinct explicit caller budget as caller-supplied. PR body has updated exact-head proof. Please re-check latest head. |
|
🦞🧹 I asked ClawSweeper to review this item again. Re-review progress:
|
|
@clawsweeper re-review Added additional manual compact threshold replay proof to the PR body at latest head 9d5b3bf. It shows the production decision gate now proceeds for the reported external caller high-usage shape while preserving ordinary queued reserve semantics. Please re-check latest head. |
9d5b3bf to
b1b4cdb
Compare
|
@clawsweeper re-review Rebased onto latest upstream main at head b1b4cdb. PR body exact-head proof was refreshed; the previously accepted manual compact threshold proof is unchanged except for the rebased head SHA. Please re-check latest head. |
|
🦞🧹 I asked ClawSweeper to review this item again. Re-review progress:
|
|
@clawsweeper re-review Fixed the P3 changelog reference: the release note now says Fixes #73056 instead of pointing at the PR. Latest head is 7cb3988. Acceptance checks passed locally: git diff --check and pnpm check:changelog-attributions. Please re-check latest head. |
|
🦞🧹 I asked ClawSweeper to review this item again. Re-review progress:
|
|
Closing as superseded by current upstream main.\n\nEvidence checked on origin/main:\n- Current main contains the caller context token budget / active runtime config compaction coverage.\n- Focused validation passed:\n - pnpm exec vitest run --config test/vitest/vitest.agents-core.config.ts src/agents/embedded-agent-runner/compact.hooks.test.ts -t 'uses the caller context token budget during runtime compaction'\n - pnpm exec vitest run --config test/vitest/vitest.auto-reply.config.ts src/auto-reply/reply/commands-compact.test.ts -t 'resolves /compact context budget from the active Codex runtime config instead of stale session metadata'\n\nThe standalone PR is no longer needed. |
Summary
fix/compaction-tool-heavy-precheckonto freshorigin/main/compactand use that same active caller budget for the idle-threshold checkReal behavior proof
/compactnow passes the resolved active caller budget instead of stale persisted session metadata, while ordinary queued compactions still use the normal context-window-minus-reserve threshold.origin/mainat PR head7cb398843ec89a587067870496d1f09a8df79a2e, running with the workspace Node/pnpm toolchain.git diff --checknode scripts/test-projects.mjs src/agents/pi-embedded-runner/compact.hooks.test.ts src/auto-reply/reply/commands-compact.test.tsnode scripts/test-projects.mjs src/gateway/origin-check.test.ts src/gateway/server.auth.browser-hardening.test.ts src/pairing/setup-code.test.tspnpm check:test-typespnpm check:changelog-attributions/compactin a live external model session was not run; focused local behavior and regression coverage were run instead.Additional current-head gateway/pairing proof
Behavior or issue addressed: Latest head also tightens Control UI browser-origin handling and mobile pairing cleartext URL validation. Forged non-local loopback origins should be rejected before pairing auth, local loopback should still work, tailnet
ws://mobile pairing should be rejected with the Tailscale Serve guidance, and configured publicwss://pairing should still work.Real environment tested: Local OpenClaw checkout for PR fix(compaction): bypass idle skip at token threshold #73092 on Linux/Node 22 at head
7cb398843ec89a587067870496d1f09a8df79a2e, using productioncheckBrowserOriginandresolvePairingSetupFromConfig.Exact steps or command run after this patch: Ran a
pnpm exec tsx --no-warnings -e '...'proof script from the patched checkout that imported the production helpers and exercised forged loopback origin, local loopback origin, tailnetws://pairing, and publicwss://pairing.Evidence after fix:
Observed result after fix: The forged remote loopback origin is rejected, legitimate local loopback still passes, cleartext tailnet mobile pairing returns the secure Tailscale Serve/Funnel guidance, and
wss://public pairing remains accepted.What was not tested: A browser-driven Control UI login and live mobile app pairing flow were not run; this proof uses the production helper paths and the focused gateway/pairing tests above.
Additional manual compact threshold replay
Behavior or issue addressed: Manual /compact from an external/caller model session must compare observed usage against the active caller context window, not the larger summarizer window or a reserve-adjusted ordinary queued threshold. Ordinary queued compaction must still preserve the summarizer reserve semantics.
Real environment tested: Local OpenClaw checkout for PR fix(compaction): bypass idle skip at token threshold #73092 on Linux/Node 22 at head 7cb3988, executing the production compaction threshold helper through node --import tsx from the patched tree.
Exact steps or command run after this patch: Ran a terminal replay of the manual /compact threshold decision with external caller context 200000, summarizer context 258000, reserve 8000, and the reported high-usage token count shape: 236000 observed.
Evidence after fix (copied terminal output):
PR fix(compaction): bypass idle skip at token threshold #73092 manual /compact threshold replay
external caller model context: 200000 tokens
summarizer context: 258000 tokens; reserve: 8000 tokens
tool-only @199999 observed -> skip: true
tool-only @236000 observed -> skip: false
ordinary queued @249999 observed with 258000-8000 reserve -> skip: true
ordinary queued @250000 observed with 258000-8000 reserve -> skip: false
real conversation @12000 observed -> skip: false
PASS: true
Observed result after fix: The manual /compact high-usage external caller shape does not take the no-real-conversation skip path (skip: false at 236000 observed with caller window 200000), while ordinary queued compaction still subtracts reserve and skips below 250000. Real conversation sessions still bypass this skip as before.
What was not tested: This replay does not call a live provider to summarize the transcript; it proves the production decision gate that previously returned before the compaction call.
Validation
git diff --checknode scripts/test-projects.mjs src/agents/pi-embedded-runner/compact.hooks.test.ts src/auto-reply/reply/commands-compact.test.tsnode scripts/test-projects.mjs src/gateway/origin-check.test.ts src/gateway/server.auth.browser-hardening.test.ts src/pairing/setup-code.test.tspnpm check:test-typespnpm check:changelog-attributions