fix(agents): log pre-prompt compaction fits decisions #84676
Codex review: needs maintainer review before merge.
Summary
Reproducibility: yes, by source inspection: current main runs the pre-prompt precheck but only logs truncation and compaction recovery branches.
Review details
Best possible solution: Land this diagnostic-only change after maintainer review, keeping compaction thresholds, pruning behavior, and budget policy changes out of this PR.
Do we have a high-confidence way to reproduce the issue? Yes, by source inspection: current main runs the pre-prompt precheck but only logs truncation and compaction recovery branches.
Is this the best way to solve the issue? Yes: adding a production formatter plus one call-site debug log is the narrow maintainable fix, and the scoped helper test follows the embedded-runner guidance for this expensive path.
Codex review notes: model gpt-5.5, reasoning high; reviewed against 46030f548986.
Updated PR 1 on the actual
Redacted diagnostic excerpt from the production formatter after this patch:
@clawsweeper re-review
Co-authored-by: Gio Della-Libera <giodl@microsoft.com>
Summary
A route=fits decision looked the same as the precheck not running.
Motivation
#68609 is an observability gap: the budget decision already exists at the pre-prompt call site, but the normal fits path previously fell through silently. This is budget stack PR 1: it makes the decision visible before later PRs persist, report, or classify budget state.
Linked Issue/PR
Real behavior proof (required for external PRs)
Behavior addressed: pre-prompt compaction precheck decisions now produce a debug-formatted diagnostic line even when the route is fits, with the token fields requested in #68609.
Real environment tested: local Windows source worktree on Node v24.14.0, rebased on openclaw/main at c0312748c4.
Exact steps or command run after this patch:
node --version
git diff --check openclaw/main...HEAD
node node_modules\vitest\vitest.mjs run src/agents/pi-embedded-runner/run/preemptive-compaction.test.ts --reporter=verbose
node_modules\.bin\oxfmt.CMD --check src/agents/pi-embedded-runner/run/attempt.ts src/agents/pi-embedded-runner/run/preemptive-compaction.ts src/agents/pi-embedded-runner/run/preemptive-compaction.test.ts
node scripts/run-tsgo.mjs -p tsconfig.core.json --incremental --tsBuildInfoFile .artifacts/tsgo-cache/pr84676.tsbuildinfo
node --import tsx -e "const { shouldPreemptivelyCompactBeforePrompt, formatPrePromptPrecheckLog } = await import('./src/agents/pi-embedded-runner/run/preemptive-compaction.ts'); const result = shouldPreemptivelyCompactBeforePrompt({ messages: [{ role: 'assistant', content: [{ type: 'text', text: 'short history' }], timestamp: 1 }], systemPrompt: 'sys', prompt: 'hello', contextTokenBudget: 10000, reserveTokens: 1000 }); console.log(formatPrePromptPrecheckLog({ result, sessionKey: 'agent:main:main', sessionId: 'redacted-session', provider: 'anthropic', modelId: 'claude-opus-4-6', messageCount: 1, unwindowedMessageCount: 1, contextTokenBudget: 10000, reserveTokens: 1000, sessionFile: 'redacted-session.jsonl' }));"
Evidence after fix:
Observed result after fix: route=fits now produces the same diagnostic namespace as recovery routes, with the fields needed to distinguish "precheck ran and fits" from "precheck did not run". The call site invokes the formatter for every non-skipped precheck result before the existing recovery branches.
What was not tested: a long-running live Anthropic Opus/cache-ttl session was not reproduced; this PR intentionally adds diagnostics before changing heuristic or pruning policy. Broad CI is rerunning on the rebased SHA; the previous red shard was checks-node-core-runtime-infra-state in unrelated src/infra/secret-file.test.ts.
Root Cause
The pre-prompt call site only logged truncation and compaction branches. A fits result continued directly to prompt submission without a diagnostic, unlike the mid-turn precheck path where all routes were already logged.
Regression Test Plan
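For orientation, the behavior this PR locks in (log once for every non-skipped precheck result, then fall into the existing recovery branches) could be sketched roughly as below. This is a minimal illustration, not the production code: `SketchRoute`, `SketchPrecheckResult`, `formatSketchLine`, `handlePrecheckSketch`, the `estimatedPromptTokens` field, and the exact line layout are all hypothetical names chosen for the example. Only the `[context-overflow-precheck] pre-prompt check` prefix and the `route=fits` token come from the PR text.

```typescript
// Hypothetical sketch: names, route values, and fields are illustrative only.
type SketchRoute = "fits" | "truncate" | "compact";

interface SketchPrecheckResult {
  route: SketchRoute;
  estimatedPromptTokens: number; // assumed field, not confirmed by the PR
}

// Builds one diagnostic line; the real formatter's field set may differ.
function formatSketchLine(
  result: SketchPrecheckResult,
  contextTokenBudget: number,
  reserveTokens: number,
): string {
  return (
    `[context-overflow-precheck] pre-prompt check route=${result.route} ` +
    `estimatedPromptTokens=${result.estimatedPromptTokens} ` +
    `contextTokenBudget=${contextTokenBudget} reserveTokens=${reserveTokens}`
  );
}

// Returns the debug lines that would be emitted for one precheck outcome.
// `undefined` models a skipped precheck.
function handlePrecheckSketch(
  result: SketchPrecheckResult | undefined,
  contextTokenBudget: number,
  reserveTokens: number,
): string[] {
  const debugLines: string[] = [];
  if (result === undefined) {
    return debugLines; // precheck did not run, so nothing is logged
  }
  // Log first, for every route, so a "fits" decision is no longer silent.
  debugLines.push(formatSketchLine(result, contextTokenBudget, reserveTokens));
  switch (result.route) {
    case "truncate":
      debugLines.push("recovery: truncate branch taken");
      break;
    case "compact":
      debugLines.push("recovery: compact branch taken");
      break;
    case "fits":
      break; // continue straight to prompt submission
  }
  return debugLines;
}
```

With this shape, a `fits` result yields exactly one diagnostic line, a recovery route yields the diagnostic line plus its branch log, and a skipped precheck yields none, which is the distinction the regression test is meant to protect.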
src/agents/pi-embedded-runner/run/preemptive-compaction.test.ts
fits pre-prompt precheck diagnostic includes route and budget fields.
User-visible / Behavior Changes
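Because the diagnostic carries a `route=` token, someone reading debug logs could tell "precheck ran and fits" apart from "precheck did not run" with a small parser. The sketch below assumes the line is made of whitespace-separated key=value tokens. That layout, the helper names `parseDiagnosticFields` and `classifyPrecheck`, and every field in the sample line other than `route` are assumptions made for illustration.

```typescript
// Hypothetical helper: collects key=value tokens from one diagnostic line.
function parseDiagnosticFields(line: string): Map<string, string> {
  const fields = new Map<string, string>();
  for (const token of line.split(/\s+/)) {
    const eq = token.indexOf("=");
    if (eq > 0) {
      fields.set(token.slice(0, eq), token.slice(eq + 1));
    }
  }
  return fields;
}

type PrecheckStatus = "fits" | "recovery" | "not-run";

// Classifies a session's debug lines. The prefix is taken from the PR text;
// treating any non-"fits" route as a recovery route is an assumption.
function classifyPrecheck(logLines: string[]): PrecheckStatus {
  const line = logLines.find((l) =>
    l.includes("[context-overflow-precheck] pre-prompt check"),
  );
  if (line === undefined) {
    return "not-run"; // no diagnostic line at all
  }
  return parseDiagnosticFields(line).get("route") === "fits" ? "fits" : "recovery";
}

// Illustrative sample line; fields other than route are assumed.
const sampleDiagnosticLine =
  "[context-overflow-precheck] pre-prompt check route=fits " +
  "contextTokenBudget=10000 reserveTokens=1000";
```

Calling `classifyPrecheck([sampleDiagnosticLine])` on the sample yields the `fits` status, while an empty log yields `not-run`.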
Debug logs include an additional [context-overflow-precheck] pre-prompt check ... route=fits line when debug logging is enabled. Runtime behavior is unchanged.
Security Impact
Risks and Mitigations