
fix(agents): log pre-prompt compaction fits decisions #84676

Merged
giodl73-repo merged 1 commit into main from fix/68609-precheck-budget-log
May 21, 2026

@giodl73-repo giodl73-repo commented May 20, 2026

Contributor

Summary

  • Problem: the pre-prompt auto-compaction precheck only logged recovery routes, so a normal route=fits decision looked the same as the precheck not running.
  • Solution: format and emit one debug-level pre-prompt precheck line for every decision route, including fits.
  • What changed: the log now includes route, estimated prompt tokens, prompt budget before reserve, overflow tokens, reducible tool-result chars, reserve/effective reserve, context budget, message counts, provider/model, and session identity.
  • What did NOT change: no compaction thresholds, pruning behavior, cache-ttl policy, model routing, or payload policy changed.
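A minimal sketch of how such a line could be assembled, assuming the fields arrive already computed. `PrecheckLogFields` and `sketchPrecheckLine` are hypothetical names for illustration and are not the PR's `formatPrePromptPrecheckLog`; only the key names and their order are taken from the excerpt in this PR body.

```typescript
// Hypothetical input shape: field names mirror the keys in the logged line,
// not necessarily the real types in preemptive-compaction.ts.
interface PrecheckLogFields {
  sessionKey: string;
  provider: string;
  modelId: string;
  route: string;
  estimatedPromptTokens: number;
  promptBudgetBeforeReserve: number;
  overflowTokens: number;
  toolResultReducibleChars: number;
  reserveTokens: number;
  effectiveReserveTokens: number;
  contextTokenBudget: number;
  messageCount: number;
  unwindowedMessageCount: number;
  sessionFile: string;
}

// Renders one key=value line under the shared precheck namespace.
function sketchPrecheckLine(f: PrecheckLogFields): string {
  const pairs: Array<[string, string | number]> = [
    ["sessionKey", f.sessionKey],
    ["provider", `${f.provider}/${f.modelId}`],
    ["route", f.route],
    ["estimatedPromptTokens", f.estimatedPromptTokens],
    ["promptBudgetBeforeReserve", f.promptBudgetBeforeReserve],
    ["overflowTokens", f.overflowTokens],
    ["toolResultReducibleChars", f.toolResultReducibleChars],
    ["reserveTokens", f.reserveTokens],
    ["effectiveReserveTokens", f.effectiveReserveTokens],
    ["contextTokenBudget", f.contextTokenBudget],
    ["messages", f.messageCount],
    ["unwindowedMessages", f.unwindowedMessageCount],
    ["sessionFile", f.sessionFile],
  ];
  const body = pairs.map(([key, value]) => `${key}=${value}`).join(" ");
  return `[context-overflow-precheck] pre-prompt check ${body}`;
}

// Values copied from the redacted excerpt in this PR.
const fitsLine = sketchPrecheckLine({
  sessionKey: "agent:main:main",
  provider: "anthropic",
  modelId: "claude-opus-4-6",
  route: "fits",
  estimatedPromptTokens: 8,
  promptBudgetBeforeReserve: 9000,
  overflowTokens: 0,
  toolResultReducibleChars: 0,
  reserveTokens: 1000,
  effectiveReserveTokens: 1000,
  contextTokenBudget: 10000,
  messageCount: 1,
  unwindowedMessageCount: 1,
  sessionFile: "redacted-session.jsonl",
});
console.log(fitsLine);
```

Keeping the pairs in an ordered array makes the field order explicit, which matters if anyone greps or diffs these lines.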

Motivation

#68609 is an observability gap: the budget decision already exists at the pre-prompt call site, but the normal fits path previously fell through silently. This is budget stack PR 1: it makes the decision visible before later PRs persist/report/classify budget state.

Linked Issue/PR

Real behavior proof (required for external PRs)

Behavior addressed: pre-prompt compaction precheck decisions now produce a debug-formatted diagnostic line even when the route is fits, with the token fields requested in #68609.

Real environment tested: local Windows source worktree on Node v24.14.0, rebased on openclaw/main at c0312748c4.

Exact steps or command run after this patch:

  • node --version
  • git diff --check openclaw/main...HEAD
  • node node_modules\vitest\vitest.mjs run src/agents/pi-embedded-runner/run/preemptive-compaction.test.ts --reporter=verbose
  • node_modules\.bin\oxfmt.CMD --check src/agents/pi-embedded-runner/run/attempt.ts src/agents/pi-embedded-runner/run/preemptive-compaction.ts src/agents/pi-embedded-runner/run/preemptive-compaction.test.ts
  • node scripts/run-tsgo.mjs -p tsconfig.core.json --incremental --tsBuildInfoFile .artifacts/tsgo-cache/pr84676.tsbuildinfo
  • node --import tsx -e "const { shouldPreemptivelyCompactBeforePrompt, formatPrePromptPrecheckLog } = await import('./src/agents/pi-embedded-runner/run/preemptive-compaction.ts'); const result = shouldPreemptivelyCompactBeforePrompt({ messages: [{ role: 'assistant', content: [{ type: 'text', text: 'short history' }], timestamp: 1 }], systemPrompt: 'sys', prompt: 'hello', contextTokenBudget: 10000, reserveTokens: 1000 }); console.log(formatPrePromptPrecheckLog({ result, sessionKey: 'agent:main:main', sessionId: 'redacted-session', provider: 'anthropic', modelId: 'claude-opus-4-6', messageCount: 1, unwindowedMessageCount: 1, contextTokenBudget: 10000, reserveTokens: 1000, sessionFile: 'redacted-session.jsonl' }));"

Evidence after fix:

node --version
v24.14.0

git diff --check openclaw/main...HEAD
# passed

node node_modules\vitest\vitest.mjs run src/agents/pi-embedded-runner/run/preemptive-compaction.test.ts --reporter=verbose
# passed: Test Files 2 passed (2), Tests 22 passed (22)

node_modules\.bin\oxfmt.CMD --check src/agents/pi-embedded-runner/run/attempt.ts src/agents/pi-embedded-runner/run/preemptive-compaction.ts src/agents/pi-embedded-runner/run/preemptive-compaction.test.ts
# All matched files use the correct format.

node scripts/run-tsgo.mjs -p tsconfig.core.json --incremental --tsBuildInfoFile .artifacts/tsgo-cache/pr84676.tsbuildinfo
# passed

[context-overflow-precheck] pre-prompt check sessionKey=agent:main:main provider=anthropic/claude-opus-4-6 route=fits estimatedPromptTokens=8 promptBudgetBeforeReserve=9000 overflowTokens=0 toolResultReducibleChars=0 reserveTokens=1000 effectiveReserveTokens=1000 contextTokenBudget=10000 messages=1 unwindowedMessages=1 sessionFile=redacted-session.jsonl

Observed result after fix: route=fits now produces the same diagnostic namespace as recovery routes, with the fields needed to distinguish “precheck ran and fits” from “precheck did not run”. The call site invokes the formatter for every non-skipped precheck result before the existing recovery branches.
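One way to use the line when triaging a session log is to parse it and count decisions per route. The sketch below is not part of the patch: `parsePrecheckLine` and `countRoutes` are hypothetical helpers, and they assume values contain no spaces, as in the excerpt above.

```typescript
// Matches the namespace and prefix shown in the excerpt; captures the key=value tail.
const PRECHECK_PATTERN = /^\[context-overflow-precheck\] pre-prompt check (.*)$/;

// Returns the key=value fields of a precheck line, or null for any other line.
function parsePrecheckLine(line: string): Record<string, string> | null {
  const match = PRECHECK_PATTERN.exec(line);
  if (match === null) return null;
  const fields: Record<string, string> = {};
  for (const token of (match[1] ?? "").split(" ")) {
    const eq = token.indexOf("=");
    if (eq > 0) fields[token.slice(0, eq)] = token.slice(eq + 1);
  }
  return fields;
}

// Counts precheck decisions per route. An empty result means no precheck line was logged.
function countRoutes(lines: string[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const line of lines) {
    const fields = parsePrecheckLine(line);
    if (fields === null) continue;
    const route = fields["route"] ?? "unknown";
    counts[route] = (counts[route] ?? 0) + 1;
  }
  return counts;
}

const sampleLog = [
  "some unrelated debug line",
  "[context-overflow-precheck] pre-prompt check sessionKey=agent:main:main provider=anthropic/claude-opus-4-6 route=fits estimatedPromptTokens=8 promptBudgetBeforeReserve=9000 overflowTokens=0 toolResultReducibleChars=0 reserveTokens=1000 effectiveReserveTokens=1000 contextTokenBudget=10000 messages=1 unwindowedMessages=1 sessionFile=redacted-session.jsonl",
];
const routeCounts = countRoutes(sampleLog);
console.log(routeCounts);
```

With this, "precheck ran and fits" shows up as a non-zero `fits` count, while "precheck did not run" shows up as an empty object.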

What was not tested: a long-running live Anthropic Opus/cache-ttl session was not reproduced; this PR intentionally adds diagnostics before changing heuristic or pruning policy. Broad CI is rerunning on the rebased SHA; the previous red shard was checks-node-core-runtime-infra-state in unrelated src/infra/secret-file.test.ts.

Root Cause

The pre-prompt call site only logged truncation and compaction branches. A fits result continued directly to prompt submission without a diagnostic, unlike the mid-turn precheck path where all routes were already logged.
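The shape of the fix can be sketched as logging once before the branch rather than inside each recovery branch. Everything here is illustrative: the route names `truncate` and `compact`, the `NextStep` values, and `routePrecheck` are assumptions, since the PR text only confirms a `fits` route plus truncation and compaction branches.

```typescript
// Hypothetical route names; only "fits" is confirmed by the PR text.
type PrecheckRoute = "fits" | "truncate" | "compact";

interface PrecheckResult {
  route: PrecheckRoute;
  estimatedPromptTokens: number;
}

type NextStep = "submit-prompt" | "truncate-tool-results" | "compact-session";

function routePrecheck(result: PrecheckResult, debug: (line: string) => void): NextStep {
  // Log before branching so every route, including "fits", leaves a trace.
  debug(
    `[context-overflow-precheck] pre-prompt check route=${result.route} ` +
      `estimatedPromptTokens=${result.estimatedPromptTokens}`,
  );
  switch (result.route) {
    case "truncate":
      return "truncate-tool-results";
    case "compact":
      return "compact-session";
    case "fits":
      return "submit-prompt";
  }
}

// Collect lines in memory instead of using a real logger.
const seen: string[] = [];
const collect = (line: string): void => {
  seen.push(line);
};
const steps: NextStep[] = [
  routePrecheck({ route: "fits", estimatedPromptTokens: 8 }, collect),
  routePrecheck({ route: "truncate", estimatedPromptTokens: 9500 }, collect),
  routePrecheck({ route: "compact", estimatedPromptTokens: 12000 }, collect),
];
console.log(steps.join(", "), seen.length);
```

Because the log call sits above the `switch`, adding a new route later cannot reintroduce a silent path.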

Regression Test Plan

  • Coverage level: unit/helper test
  • Target test: src/agents/pi-embedded-runner/run/preemptive-compaction.test.ts
  • Scenario: formatting a fits pre-prompt precheck diagnostic includes route and budget fields.
  • Why this guardrail is scoped: the runner path is expensive; the formatter is production code and the call site now invokes it for every precheck result.
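A helper-level check of this kind could assert on individual `key=value` tokens rather than the whole string, so the test does not break if field order changes. This is a sketch under that assumption, not the test in `preemptive-compaction.test.ts`; `missingFields` is a hypothetical helper and the sample line is the redacted excerpt from this PR.

```typescript
// Returns the expected key=value tokens that are absent from the line.
function missingFields(line: string, expected: Record<string, string | number>): string[] {
  const tokens = line.split(" ");
  const missing: string[] = [];
  for (const key of Object.keys(expected)) {
    const token = `${key}=${expected[key]}`;
    if (tokens.indexOf(token) === -1) missing.push(token);
  }
  return missing;
}

const sampleFitsLine =
  "[context-overflow-precheck] pre-prompt check sessionKey=agent:main:main " +
  "provider=anthropic/claude-opus-4-6 route=fits estimatedPromptTokens=8 " +
  "promptBudgetBeforeReserve=9000 overflowTokens=0 toolResultReducibleChars=0 " +
  "reserveTokens=1000 effectiveReserveTokens=1000 contextTokenBudget=10000 " +
  "messages=1 unwindowedMessages=1 sessionFile=redacted-session.jsonl";

// Route and budget fields named in the scenario above.
const absent = missingFields(sampleFitsLine, {
  route: "fits",
  estimatedPromptTokens: 8,
  promptBudgetBeforeReserve: 9000,
  overflowTokens: 0,
  contextTokenBudget: 10000,
});
console.log(absent.length === 0 ? "all fields present" : `missing: ${absent.join(", ")}`);
```

Matching whole tokens avoids false positives such as `reserveTokens=1000` matching inside `effectiveReserveTokens=1000`.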

User-visible / Behavior Changes

Debug logs include an additional [context-overflow-precheck] pre-prompt check ... route=fits line when debug logging is enabled. Runtime behavior is unchanged.

Security Impact

  • New permissions/capabilities? No
  • Secrets/tokens handling changed? No
  • New/changed network calls? No
  • Command/tool execution surface changed? No
  • Data access scope changed? No

Risks and Mitigations

  • Risk: extra debug line could be noisy in debug mode.
    • Mitigation: it is debug-level only and reuses the existing context-overflow precheck log namespace.

@openclaw-barnacle openclaw-barnacle Bot added the labels agents (Agent runtime and tooling), size: S, and maintainer (Maintainer-authored PR) on May 20, 2026
@clawsweeper

clawsweeper Bot commented May 20, 2026

Contributor

Codex review: needs maintainer review before merge.

Workflow note: Future ClawSweeper reviews update this same comment in place.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

Summary
The PR adds a production formatter and debug call-site log for every pre-prompt compaction precheck route, plus focused coverage for the route=fits diagnostic.

Reproducibility: yes, by source inspection. Current main runs the pre-prompt precheck but only logs truncation and compaction recovery branches, so a fits result has no diagnostic path. I did not run a live agent session in this read-only review.

PR rating
Overall: 🐚 platinum hermit
Proof: 🐚 platinum hermit
Patch quality: 🦞 diamond lobster
Summary: Small, well-scoped diagnostic patch with focused coverage, green public checks, and adequate terminal proof; remaining work is maintainer review.

Rank-up moves:

  • none
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

Real behavior proof
Sufficient (terminal): The PR body and follow-up comment include a redacted terminal excerpt from the patched production formatter showing the route=fits diagnostic line, and the public Real behavior proof check is green.

Risk before merge

  • The PR has the protected maintainer label, so it needs explicit maintainer handling even though no blocking code finding was found.
  • The new line adds debug noise and includes session identity/sessionFile for fits decisions; this matches the requested diagnostic surface but should remain a maintainer-accepted observability choice.

Maintainer options:

  1. Decide the mitigation before merge
    Land this diagnostic-only change after maintainer review, keeping compaction thresholds, pruning behavior, and budget policy changes out of this PR.
  2. Pause or close
    Do not merge this PR until maintainers decide whether the risk is worth taking.

Next step before merge
No narrow automated repair is needed; the remaining action is explicit maintainer review/merge handling for a protected-label PR.

Security
Cleared: The diff only adds debug-string formatting, a debug log call, and unit coverage; it adds no dependency, workflow, permission, network, secret, or code-execution surface.

Review details

Best possible solution:

Land this diagnostic-only change after maintainer review, keeping compaction thresholds, pruning behavior, and budget policy changes out of this PR.

Do we have a high-confidence way to reproduce the issue?

Yes, by source inspection: current main runs the pre-prompt precheck but only logs truncation and compaction recovery branches, so a fits result has no diagnostic path. I did not run a live agent session in this read-only review.

Is this the best way to solve the issue?

Yes: adding a production formatter plus one call-site debug log is the narrow maintainable fix, and the scoped helper test follows the embedded-runner guidance for this expensive path.

Label changes:

  • add proof: sufficient: Contributor real behavior proof is sufficient. The PR body and follow-up comment include a redacted terminal excerpt from the patched production formatter showing the route=fits diagnostic line, and the public Real behavior proof check is green.
  • add rating: 🐚 platinum hermit: Current PR rating is 🐚 platinum hermit because proof is 🐚 platinum hermit and patch quality is 🦞 diamond lobster. Summary: small, well-scoped diagnostic patch with focused coverage, green public checks, and adequate terminal proof; remaining work is maintainer review.
  • add status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (terminal): The PR body and follow-up comment include a redacted terminal excerpt from the patched production formatter showing the route=fits diagnostic line, and the public Real behavior proof check is green.
  • remove rating: 🦪 silver shellfish: Current PR rating is rating: 🐚 platinum hermit, so this older rating label is no longer current.
  • remove status: 📣 needs proof: Current PR status label is status: 👀 ready for maintainer look.

Label justifications:

  • P2: This is a normal-priority agent observability fix with limited blast radius and no config, migration, or compaction-policy change.
  • rating: 🐚 platinum hermit: Current PR rating is 🐚 platinum hermit because proof is 🐚 platinum hermit and patch quality is 🦞 diamond lobster. Summary: small, well-scoped diagnostic patch with focused coverage, green public checks, and adequate terminal proof; remaining work is maintainer review.
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (terminal): The PR body and follow-up comment include a redacted terminal excerpt from the patched production formatter showing the route=fits diagnostic line, and the public Real behavior proof check is green.
  • proof: sufficient: Contributor real behavior proof is sufficient. The PR body and follow-up comment include a redacted terminal excerpt from the patched production formatter showing the route=fits diagnostic line, and the public Real behavior proof check is green.

What I checked:

Likely related people:

  • Takhoffman: Commits by Tak Hoffman restored the reserve-based overflow precheck and refined cause-aware routing across the same runner, helper, and tests. (role: introduced behavior; confidence: high; commits: 3e2a05f4251f, 66daafccae09; files: src/agents/pi-embedded-runner/run/attempt.ts, src/agents/pi-embedded-runner/run/preemptive-compaction.ts, src/agents/pi-embedded-runner/run/preemptive-compaction.test.ts)
  • Dallin Romney: Current line blame in this checkout points at a recent commit that carried the current precheck call site and helper files forward on main. (role: recent area contributor; confidence: medium; commits: 9a6744baba6e; files: src/agents/pi-embedded-runner/run/attempt.ts, src/agents/pi-embedded-runner/run/preemptive-compaction.ts)

Codex review notes: model gpt-5.5, reasoning high; reviewed against 46030f548986.

@clawsweeper clawsweeper Bot added the labels rating: 🦪 silver shellfish (Thin PR readiness signal; proof, validation, or implementation needs work.), status: 📣 needs proof (The PR needs real behavior proof before ClawSweeper can clear the contributor ask.), and P2 (Normal backlog priority with limited blast radius.) on May 20, 2026
@clawsweeper

clawsweeper Bot commented May 20, 2026

Contributor

ClawSweeper PR egg

✨ Hatched: 🥚 common Brave Signal Puff

Hatch command

Comment @clawsweeper hatch when this PR is hatchable.

Hatchability rules:

  • Merged PRs are hatchable.
  • Open PRs are hatchable when they are status: 👀 ready for maintainer look, status: 🚀 automerge armed, or labeled clawsweeper:automerge.
  • Closed unmerged PRs are hatchable only when one of those hatchable labels is still present in the durable record.

Rarity: 🥚 common.
Trait: collects tiny proofs.
Image traits: location branch lighthouse; accessory tiny test log scroll; palette coral, mint, and warm cream; mood sleepy but ready; pose holding its accessory up for inspection; shell smooth pearl shell; lighting golden review-room light; background miniature CI buoys.

What is this egg doing here?
  • Eggs appear after the PR passes real-behavior proof. It is here for vibes, not verdicts: it does not change labels, ratings, merge decisions, or automation.
  • The shell reacts to review momentum: open follow-up work warms it up, re-review makes it wobble, and a clean final review lets it hatch.
  • Hatchability usually comes from sufficient real-behavior proof, no blocking P0/P1/P2 findings, no security attention needed, and clean correctness. A merged PR is already final, so merge makes the egg hatchable independently.
  • The hatch is seeded from this repository and PR number, so the same PR keeps the same creature; the reviewed head SHA can only change safe visual details.
  • Rarity is just collectible sparkle: 🥚 common, 🌱 uncommon, 💎 rare, ✨ glimmer, and 🌈 legendary.

@giodl73-repo giodl73-repo force-pushed the fix/68609-precheck-budget-log branch from 5551732 to d003bc2 on May 20, 2026 20:52
@giodl73-repo

Contributor Author

Proof refresh for #84676 at d003bc28a9c7d9e1a46a79cb5f8fef2255e55a15:

  • node --version: v24.14.0
  • git diff --check HEAD^..HEAD: passed
  • node node_modules\vitest\vitest.mjs run src/agents/pi-embedded-runner/run/preemptive-compaction.test.ts: passed, 2 files / 22 tests
  • pnpm exec oxfmt --check --threads=1 src/agents/pi-embedded-runner/run/attempt.ts src/agents/pi-embedded-runner/run/preemptive-compaction.ts src/agents/pi-embedded-runner/run/preemptive-compaction.test.ts: passed, 3 files
  • node scripts/run-tsgo.mjs -p tsconfig.core.json --incremental --tsBuildInfoFile .artifacts/tsgo-cache/core.tsbuildinfo: passed

CI status note: Real behavior proof is passing. I reran the failed workflow jobs; the remaining red checks-node-core-runtime-infra-state shard is still failing in unrelated src/infra/secret-file.test.ts (throws from the try helper for rejected files). This PR only touches src/agents/pi-embedded-runner/run/attempt.ts, src/agents/pi-embedded-runner/run/preemptive-compaction.ts, and src/agents/pi-embedded-runner/run/preemptive-compaction.test.ts.

@giodl73-repo giodl73-repo force-pushed the fix/68609-precheck-budget-log branch from d003bc2 to fc5969f on May 21, 2026 04:19
@giodl73-repo

Contributor Author

Updated PR 1 on the actual openclaw/openclaw head branch.

  • Rebased fix/68609-precheck-budget-log onto current openclaw/main (c0312748c4), new head fc5969ff44821fe722c51f925ffe9665baf19c38.
  • Refreshed the PR body with concrete terminal proof of the new redacted route=fits diagnostic line.
  • Focused proof on the rebased SHA:
    • node --version: v24.14.0
    • git diff --check openclaw/main...HEAD: passed
    • node node_modules\vitest\vitest.mjs run src/agents/pi-embedded-runner/run/preemptive-compaction.test.ts --reporter=verbose: passed, 2 files / 22 tests
    • node_modules\.bin\oxfmt.CMD --check src/agents/pi-embedded-runner/run/attempt.ts src/agents/pi-embedded-runner/run/preemptive-compaction.ts src/agents/pi-embedded-runner/run/preemptive-compaction.test.ts: passed
    • node scripts/run-tsgo.mjs -p tsconfig.core.json --incremental --tsBuildInfoFile .artifacts/tsgo-cache/pr84676.tsbuildinfo: passed

Redacted diagnostic excerpt from the production formatter after this patch:

[context-overflow-precheck] pre-prompt check sessionKey=agent:main:main provider=anthropic/claude-opus-4-6 route=fits estimatedPromptTokens=8 promptBudgetBeforeReserve=9000 overflowTokens=0 toolResultReducibleChars=0 reserveTokens=1000 effectiveReserveTokens=1000 contextTokenBudget=10000 messages=1 unwindowedMessages=1 sessionFile=redacted-session.jsonl

@clawsweeper re-review

@clawsweeper

clawsweeper Bot commented May 21, 2026

Contributor

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

@clawsweeper clawsweeper Bot added the labels proof: sufficient (ClawSweeper judged the real behavior proof convincing.), rating: 🐚 platinum hermit (Good normal PR readiness with ordinary maintainer review expected.), and status: 👀 ready for maintainer look (ClawSweeper has no concrete contributor-facing blocker left for this PR.), and removed the labels rating: 🦪 silver shellfish (Thin PR readiness signal; proof, validation, or implementation needs work.) and status: 📣 needs proof (The PR needs real behavior proof before ClawSweeper can clear the contributor ask.) on May 21, 2026
@giodl73-repo giodl73-repo merged commit 79be940 into main May 21, 2026
131 of 135 checks passed
@giodl73-repo giodl73-repo deleted the fix/68609-precheck-budget-log branch May 21, 2026 04:53
SebTardif pushed a commit to SebTardif/openclaw that referenced this pull request May 24, 2026
Co-authored-by: Gio Della-Libera <giodl@microsoft.com>
github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 24, 2026
Co-authored-by: Gio Della-Libera <giodl@microsoft.com>
galiniliev pushed a commit to galiniliev/openclaw that referenced this pull request May 25, 2026
Co-authored-by: Gio Della-Libera <giodl@microsoft.com>
SebTardif pushed a commit to SebTardif/openclaw that referenced this pull request May 26, 2026
Co-authored-by: Gio Della-Libera <giodl@microsoft.com>
jameslcowan pushed a commit to jameslcowan/openclaw that referenced this pull request Jun 2, 2026
Co-authored-by: Gio Della-Libera <giodl@microsoft.com>
SYU8384 pushed a commit to SYU8384/openclaw that referenced this pull request Jun 3, 2026
Co-authored-by: Gio Della-Libera <giodl@microsoft.com>
sablehead pushed a commit to sablehead/openclaw that referenced this pull request Jun 10, 2026
Co-authored-by: Gio Della-Libera <giodl@microsoft.com>

Labels

  • agents: Agent runtime and tooling
  • maintainer: Maintainer-authored PR
  • P2: Normal backlog priority with limited blast radius.
  • proof: sufficient: ClawSweeper judged the real behavior proof convincing.
  • rating: 🐚 platinum hermit: Good normal PR readiness with ordinary maintainer review expected.
  • size: S
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR.

Development

Successfully merging this pull request may close these issues.

Auto-compaction preemptive check: no debug logging when route='fits'; cache-ttl pruning may mask estimate for large-context models

1 participant