feat: progressive compaction escalation and mechanical flush fallback #18663
Adamya05 wants to merge 2 commits into openclaw:main from
Conversation
Improves compaction and memory flush robustness for sessions near context limits:

- Progressive escalation: overflow compaction retries get increasingly aggressive (maxHistoryShare: 0.4 → 0.25 → 0.1) instead of repeating the same params
- Headroom pre-check: estimates token usage before flush; skips the LLM call if there is insufficient headroom
- Mechanical fallback: when the LLM flush fails (context overflow), falls back to a non-LLM flush that preserves tracking metadata
- Force compaction: flush now triggers immediate compaction to prevent messages landing between flush and compaction
- Structural fallback summary: failed compaction produces a useful structural overview (role counts, tools used) instead of a generic message
- Oversized chunk handling: compaction pre-filters chunks that would exceed the model context, preventing summarization failures
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 90b76b118c
```ts
  sessionId: nextSessionId,
  totalTokens: 0,
  totalTokensFresh: true,
  compactionCount: nextCompactionCount,
}),
```
Sync active run with forced session reset
This reset path only patches the persisted session entry, but it does not rotate sessionFile or update followupRun.run.sessionId/sessionFile; runAgentTurnWithFallback immediately below still executes against the old transcript for the current turn. In the forceCompaction scenarios (no flush headroom or flush overflow), the code therefore continues with the pre-reset context and can hit the same overflow again instead of actually starting from a fresh session.
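The mismatch the reviewer describes can be sketched as follows. This is a hypothetical stand-in, assuming the shapes described in the review: `RunRef` is an illustrative type, and `syncRunWithReset` mirrors what the existing `resetSession` helper reportedly does for `followupRun.run`.

```typescript
// Illustrative only: RunRef stands in for the real followupRun.run shape.
type RunRef = { sessionId: string; sessionFile: string };

// After persisting the forced reset to the session store, the same values
// must be mirrored onto the in-memory run, or the next agent turn keeps
// executing against the old transcript.
function syncRunWithReset(run: RunRef, nextSessionId: string, nextSessionFile: string): RunRef {
  run.sessionId = nextSessionId;
  run.sessionFile = nextSessionFile;
  return run;
}

const run: RunRef = { sessionId: "old-id", sessionFile: "/tmp/old.jsonl" };
syncRunWithReset(run, "new-id", "/tmp/new.jsonl");
console.log(run.sessionId); // now points at the fresh session
```

Without this mirroring step, the store records the new session while the turn still runs against the old one, which is exactly the overflow-recurrence scenario described above.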
```ts
setCompactionSafeguardRuntime(sessionManager, {
  maxHistoryShare: params.maxHistoryShareOverride,
});
```
Preserve context window when overriding safeguard runtime
This call overwrites the compaction safeguard runtime with only maxHistoryShare. Since the runtime registry set is replace-not-merge, it drops contextWindowTokens that was set earlier in buildEmbeddedExtensionPaths, so overflow retries fall back to the model default window. When users configure a smaller effective context window, retry compactions can under-prune history and keep failing near the limit.
```diff
  overflowCompactionAttempts++;
  // Progressive escalation: each retry gets more aggressive
  // Attempt 1: 0.4, Attempt 2: 0.25, Attempt 3: 0.1
  const escalationShare = Math.max(0.1, 0.4 - (overflowCompactionAttempts - 1) * 0.15);
  log.warn(
-   `context overflow persisted after in-attempt compaction (attempt ${overflowCompactionAttempts}/${MAX_OVERFLOW_COMPACTION_ATTEMPTS}); retrying prompt without additional compaction for ${provider}/${modelId}`,
+   `context overflow persisted after in-attempt compaction (attempt ${overflowCompactionAttempts}/${MAX_OVERFLOW_COMPACTION_ATTEMPTS}, maxHistoryShare=${escalationShare}); retrying prompt without additional compaction for ${provider}/${modelId}`,
  );
  continue;
```
Escalation share computed but never applied
In this branch (where hadAttemptLevelCompaction is true), escalationShare is computed and logged but never passed to any compaction call. The continue retries the prompt, and the next iteration's SDK auto-compaction won't use the escalated maxHistoryShare because there's no mechanism to forward it. In contrast, the second branch (line 604) correctly passes maxHistoryShareOverride: escalationShare to compactEmbeddedPiSessionDirect.
This means the progressive escalation (0.4 → 0.25 → 0.1) only takes effect when the overflow triggers an explicit compaction (second branch), but not when it falls through to a retry with SDK auto-compaction (this branch). The log messages will misleadingly show escalating values that aren't being applied.
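The escalation schedule itself (formula copied from the snippet above) can be sanity-checked in isolation; the point of the review comment is that these values are computed but never fed back into compaction in this branch:

```typescript
// Escalation schedule from the PR: attempt 1 → 0.4, attempt 2 → 0.25,
// attempt 3 and beyond clamped at 0.1.
function escalationShare(attempt: number): number {
  return Math.max(0.1, 0.4 - (attempt - 1) * 0.15);
}

for (const attempt of [1, 2, 3, 4]) {
  console.log(attempt, escalationShare(attempt)); // ≈ 0.4, 0.25, 0.1, 0.1
}
```

A minimal fix, as the reviewer implies, is to call the explicit compaction path with `maxHistoryShareOverride: escalationShare` in this branch too, rather than logging a value that the subsequent SDK auto-compaction never sees.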
Path: src/agents/pi-embedded-runner/run.ts
Line: 549:556
```ts
// Return a flag or special session entry that triggers compaction
return {
  ...activeSessionEntry,
  forceCompaction: true, // Custom flag we'll handle in agent-runner.ts
} as any;
```
forceCompaction flag bypasses type safety
Using as any to smuggle a forceCompaction boolean onto SessionEntry means this flag has no type checking at either the producer (here) or the consumer (agent-runner.ts:215). If the SessionEntry type changes or someone renames this flag, there's no compile-time safety net. Consider adding forceCompaction?: boolean to the SessionEntry type, or returning a discriminated union / wrapper type from runMemoryFlushIfNeeded instead of casting.
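The discriminated-union alternative the reviewer suggests can be sketched like this. This is illustrative only: `SessionEntry` here is a stand-in for the real type, and `MemoryFlushResult`/`describe` are hypothetical names, not the repo's API.

```typescript
// Stand-in for the real SessionEntry type.
type SessionEntry = { sessionId: string; compactionCount?: number };

// A discriminated union makes the compaction request part of the return
// type instead of a property smuggled past the compiler with `as any`.
type MemoryFlushResult =
  | { kind: "no-op"; entry: SessionEntry }
  | { kind: "flushed"; entry: SessionEntry; forceCompaction: boolean };

function describe(result: MemoryFlushResult): string {
  // The compiler forces every consumer to handle both variants explicitly.
  switch (result.kind) {
    case "no-op":
      return "nothing to do";
    case "flushed":
      return result.forceCompaction ? "flush + compact" : "flush only";
  }
}

console.log(describe({ kind: "flushed", entry: { sessionId: "s1" }, forceCompaction: true }));
```

Renaming `forceCompaction` or changing `SessionEntry` then becomes a compile error at both producer and consumer, which is the safety net the `as any` cast currently forfeits.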
Path: src/auto-reply/reply/agent-runner-memory.ts
Line: 119:123
src/auto-reply/reply/agent-runner.ts
Outdated
```ts
import { DEFAULT_MODEL, DEFAULT_PROVIDER } from "../../agents/defaults.js";
import { resolveOpenClawAgentDir } from "../../agents/agent-paths.js";
```
Unused imports
DEFAULT_MODEL, DEFAULT_PROVIDER, and resolveOpenClawAgentDir are imported but never used in this file. These will likely cause lint errors.
Suggested change (delete both lines):

```suggestion
```
Path: src/auto-reply/reply/agent-runner.ts
Line: 13:14
```ts
          fallbacksOverride: resolveAgentModelFallbacksOverride(
            params.followupRun.run.config,
            resolveAgentIdFromSessionKey(params.followupRun.run.sessionKey),
          ),
```
Indentation regression
The arguments to resolveAgentModelFallbacksOverride are now over-indented (12 spaces instead of the original 8). The pre-PR code had correct alignment:
```suggestion
        params.followupRun.run.config,
        resolveAgentIdFromSessionKey(params.followupRun.run.sessionKey),
      ),
```
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
Path: src/auto-reply/reply/agent-runner-memory.ts
Line: 146:149
```ts
if (typeof params.maxHistoryShareOverride === "number") {
  setCompactionSafeguardRuntime(sessionManager, {
    maxHistoryShare: params.maxHistoryShareOverride,
  });
}
```
Override clobbers contextWindowTokens runtime value
setCompactionSafeguardRuntime replaces the entire registry entry (it calls WeakMap.set, not a merge). buildEmbeddedExtensionPaths (line 533) already set { maxHistoryShare, contextWindowTokens } on this sessionManager. This call replaces it with { maxHistoryShare: params.maxHistoryShareOverride }, dropping contextWindowTokens.
In compaction-safeguard.ts:200, the consumer falls back to modelContextWindow when runtime?.contextWindowTokens is missing, so this won't crash — but any config-level context window override (compaction.contextWindowTokens) is silently lost during progressive escalation.
Fix: merge with the existing runtime value instead of replacing it:
```suggestion
if (typeof params.maxHistoryShareOverride === "number") {
  const existing = getCompactionSafeguardRuntime(sessionManager);
  setCompactionSafeguardRuntime(sessionManager, {
    ...existing,
    maxHistoryShare: params.maxHistoryShareOverride,
  });
}
```
This requires importing getCompactionSafeguardRuntime alongside setCompactionSafeguardRuntime.
Path: src/agents/pi-embedded-runner/compact.ts
Line: 542:546
```ts
try {
  logVerbose(
    `mechanical flush: trimming old turns for session ${params.sessionKey ?? "(unknown)"}`,
  );
  // The mechanical flush is intentionally a no-op placeholder for now.
  // The real value is that it marks memoryFlushAt so the system moves on
  // to compaction (which does the actual size reduction). A future
  // iteration can implement actual turn trimming here.
  return { success: true };
} catch (err) {
  const msg = err instanceof Error ? err.message : String(err);
  return { success: false, error: msg };
}
```
Misleading JSDoc for a no-op function
The JSDoc says "Perform a mechanical (non-LLM) flush by trimming old messages" and "we simply drop the oldest assistant/user turns", but the function body is a no-op that always returns { success: true }. The internal comment explains this, but callers like runMechanicalFlushWithTracking log "mechanical flush fallback completed" which implies real work was done. The catch block (lines 43-45) is also unreachable since logVerbose won't throw.
Consider updating the doc comment to match the actual behavior (placeholder/stub), or annotating the return type to indicate no messages were actually trimmed.
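One way to make the stub honest, as the comment suggests, is to encode "no work done" in the return type. A minimal sketch, with illustrative names (`MechanicalFlushResult`, `trimmedTurns` are not from the repo):

```typescript
// Illustrative result type: callers can see that zero turns were trimmed
// instead of assuming the flush reduced the transcript.
type MechanicalFlushResult =
  | { success: true; trimmedTurns: number }
  | { success: false; error: string };

/** Placeholder: marks the flush as done without trimming any turns yet. */
function performMechanicalFlush(): MechanicalFlushResult {
  return { success: true, trimmedTurns: 0 };
}

const res = performMechanicalFlush();
if (res.success && res.trimmedTurns === 0) {
  console.log("mechanical flush: no-op placeholder (no turns trimmed)");
}
```

With this shape, `runMechanicalFlushWithTracking` can log "flush recorded, 0 turns trimmed" rather than a completion message that implies real work happened.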
Path: src/auto-reply/reply/mechanical-flush.ts
Line: 34:46
src/auto-reply/reply/agent-runner.ts
Outdated
```ts
if ((activeSessionEntry as any)?.forceCompaction && sessionKey && storePath) {
  logVerbose(`memory flush: force compaction flagged, resetting session context`);
  const nextSessionId = crypto.randomUUID();
  const nextCompactionCount = (activeSessionEntry?.compactionCount ?? 0) + 1;
  delete (activeSessionEntry as any).forceCompaction;
  try {
    const updatedEntry = await updateSessionStoreEntry({
      storePath,
      sessionKey,
      update: async () => ({
        sessionId: nextSessionId,
        totalTokens: 0,
        totalTokensFresh: true,
        compactionCount: nextCompactionCount,
      }),
    });
    if (updatedEntry) {
      activeSessionEntry = updatedEntry;
    }
  } catch (err) {
    logVerbose(`force compaction session reset failed: ${String(err)}`);
  }
}
```
Force compaction doesn't update followupRun session references
This block creates a new sessionId and persists it to the store, but never updates followupRun.run.sessionId or followupRun.run.sessionFile. The subsequent runAgentTurnWithFallback call at line 332 passes followupRun, so the agent turn will use the old session ID and file — causing a mismatch between the store (new session) and the actual run (old session).
Compare with the existing resetSession helper (lines 295-296) which correctly updates both:

```ts
followupRun.run.sessionId = nextSessionId;
followupRun.run.sessionFile = nextSessionFile;
```

This block also doesn't resolve a proper sessionFile path (via resolveSessionTranscriptPath) or update activeSessionStore[sessionKey], which resetSession also does. Consider reusing resetSession or at minimum mirroring its updates to followupRun.run.
Path: src/auto-reply/reply/agent-runner.ts
Line: 215:237
```ts
const chunkTokens = estimateMessagesTokens(chunk) * SAFETY_MARGIN;
if (chunkTokens > effectiveContextWindow * 0.9) {
```
Double safety margin applied to chunk token estimate
estimateMessagesTokens already applies SAFETY_MARGIN internally (lines 18-19 strip details, then sum estimateTokens per message). Then this line multiplies by SAFETY_MARGIN again:

```ts
const chunkTokens = estimateMessagesTokens(chunk) * SAFETY_MARGIN;
```

This inflates the estimate by ~44% (1.2 × 1.2) rather than the intended 20%. Check whether the outer * SAFETY_MARGIN is intentional or whether estimateMessagesTokens already accounts for the buffer. If double-application is intentional (as a conservative safeguard), a comment would help clarify.
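The arithmetic behind the ~44% figure, assuming SAFETY_MARGIN = 1.2 as the review states (the repo's constant may differ):

```typescript
const SAFETY_MARGIN = 1.2; // assumed value, per the review text
const rawTokens = 1000;

const inner = rawTokens * SAFETY_MARGIN; // margin applied inside estimateMessagesTokens
const doubled = inner * SAFETY_MARGIN;   // margin applied again at the call site

console.log(doubled / rawTokens); // ≈ 1.44: ~44% inflation instead of the intended 20%
```

In practice this makes the 90%-of-context-window guard trip well before a chunk actually threatens the window, so chunks may be rejected as oversized when they would have summarized fine.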
Path: src/agents/compaction.ts
Line: 168:169

Follow-up commit addressing the review:

- P1: sync followupRun.sessionId/sessionFile after forceCompaction reset so the main turn uses the fresh session instead of the old transcript
- P2: merge safeguard runtime (preserve contextWindowTokens) instead of replacing when setting the maxHistoryShare override
- Fix escalation in the hadAttemptLevelCompaction branch: run explicit compaction with the escalated share instead of a continue with no effect
- Add forceCompaction to the SessionEntry type, remove 'as any' casts
- Remove unused imports (DEFAULT_MODEL, DEFAULT_PROVIDER, resolveOpenClawAgentDir)
- Fix indentation regression in agent-runner-memory.ts
This pull request has been automatically marked as stale due to inactivity.
All changes are additive — no existing behavior is modified, only new fallback/escalation paths added.
Greptile Summary
This PR adds progressive compaction escalation and mechanical flush fallback paths for sessions near context limits. The changes span 7 files and introduce several resilience mechanisms: headroom pre-checks before LLM flush, mechanical (non-LLM) flush fallback, force compaction after flush, structural fallback summaries for failed compaction, and oversized chunk handling.
- agent-runner.ts doesn't update followupRun.run references — the new session ID is persisted to the store but followupRun.run.sessionId and followupRun.run.sessionFile remain stale, causing the subsequent agent turn to target the wrong session. Compare with the existing resetSession helper which correctly updates these fields.
- Several issues flagged by earlier review (as any casts for forceCompaction, setCompactionSafeguardRuntime replacing rather than merging runtime state, unused imports, misleading JSDoc) remain unaddressed in the current revision.
- The mechanical flush in mechanical-flush.ts is an intentional no-op stub that relies on the forceCompaction flag to trigger actual compaction downstream.
- The structural fallback summary in compaction-safeguard.ts and oversized chunk handling in compaction.ts are clean additions that improve resilience.

Confidence Score: 2/5
Main risk areas: src/auto-reply/reply/agent-runner.ts (stale session references after force compaction) and src/agents/pi-embedded-runner/run.ts (escalation share computed but not applied in first branch).

Last reviewed commit: 90b76b1