
feat: progressive compaction escalation and mechanical flush fallback#18663

Open
Adamya05 wants to merge 2 commits into openclaw:main from Adamya05:feat/progressive-compaction-escalation

Conversation


@Adamya05 Adamya05 commented Feb 16, 2026

Improves compaction and memory flush robustness for sessions near context limits:

  • Progressive escalation: Overflow compaction retries get increasingly aggressive (maxHistoryShare: 0.4 → 0.25 → 0.1) instead of repeating the same params
  • Headroom pre-check: Estimates token usage before flush; skips LLM call if insufficient headroom
  • Mechanical fallback: When LLM flush fails (context overflow), falls back to non-LLM flush that preserves tracking metadata
  • Force compaction: Flush now triggers immediate compaction to prevent messages sneaking in between flush and compaction
  • Structural fallback summary: Failed compaction summarization produces a useful structural overview (role counts, tools used) instead of a generic message
  • Oversized chunk handling: Compaction pre-filters chunks that would exceed model context, preventing summarization failures

All changes are additive — no existing behavior is modified, only new fallback/escalation paths added.
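The escalation schedule described above (0.4 → 0.25 → 0.1) can be sketched as a clamped linear step; this mirrors the formula used in the PR's run.ts changes:

```typescript
// Progressive escalation of maxHistoryShare per overflow retry:
// attempt 1 → 0.4, attempt 2 → 0.25, attempt 3 (and beyond) → 0.1 floor.
function escalationShare(attempt: number): number {
  return Math.max(0.1, 0.4 - (attempt - 1) * 0.15);
}

// Rounded for display to sidestep floating-point noise:
console.log([1, 2, 3].map((a) => Number(escalationShare(a).toFixed(2)))); // [0.4, 0.25, 0.1]
```

Each retry therefore prunes history more aggressively instead of repeating the same parameters.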

Greptile Summary

This PR adds progressive compaction escalation and mechanical flush fallback paths for sessions near context limits. The changes span 7 files and introduce several resilience mechanisms: headroom pre-checks before LLM flush, mechanical (non-LLM) flush fallback, force compaction after flush, structural fallback summaries for failed compaction, and oversized chunk handling.

  • Critical: Force compaction in agent-runner.ts doesn't update followupRun.run references — the new session ID is persisted to the store but followupRun.run.sessionId and followupRun.run.sessionFile remain stale, causing the subsequent agent turn to target the wrong session. Compare with the existing resetSession helper which correctly updates these fields.
  • The prior review flagged several issues (escalation share computed but not applied in the first branch, as any casts for forceCompaction, setCompactionSafeguardRuntime replacing rather than merging runtime state, unused imports, misleading JSDoc) — these remain unaddressed in the current revision.
  • New mechanical-flush.ts is an intentional no-op stub that relies on the forceCompaction flag to trigger actual compaction downstream.
  • Structural fallback in compaction-safeguard.ts and oversized chunk handling in compaction.ts are clean additions that improve resilience.

Confidence Score: 2/5

  • The force compaction session reset bug in agent-runner.ts could cause agents to write to stale sessions, leading to data loss or orphaned sessions in production.
  • Score of 2 reflects a critical logic bug where forceCompaction creates a new sessionId in the store but doesn't update the followupRun references used by the subsequent agent turn, plus multiple previously-flagged issues that remain unaddressed (escalation share not applied, runtime clobbering, type safety bypasses).
  • Pay close attention to src/auto-reply/reply/agent-runner.ts (stale session references after force compaction) and src/agents/pi-embedded-runner/run.ts (escalation share computed but not applied in first branch).

Last reviewed commit: 90b76b1

@openclaw-barnacle openclaw-barnacle bot added agents Agent runtime and tooling size: M labels Feb 16, 2026

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 90b76b118c

Comment on lines +225 to +229 of src/auto-reply/reply/agent-runner.ts:

```ts
        sessionId: nextSessionId,
        totalTokens: 0,
        totalTokensFresh: true,
        compactionCount: nextCompactionCount,
      }),
```

P1: Sync active run with forced session reset

This reset path only patches the persisted session entry, but it does not rotate sessionFile or update followupRun.run.sessionId/sessionFile; runAgentTurnWithFallback immediately below still executes against the old transcript for the current turn. In the forceCompaction scenarios (no flush headroom or flush overflow), the code therefore continues with the pre-reset context and can hit the same overflow again instead of actually starting from a fresh session.

Comment on lines +543 to +545 of src/agents/pi-embedded-runner/compact.ts:

```ts
setCompactionSafeguardRuntime(sessionManager, {
  maxHistoryShare: params.maxHistoryShareOverride,
});
```

P2: Preserve context window when overriding safeguard runtime

This call overwrites the compaction safeguard runtime with only maxHistoryShare. Since the runtime registry set is replace-not-merge, it drops contextWindowTokens that was set earlier in buildEmbeddedExtensionPaths, so overflow retries fall back to the model default window. When users configure a smaller effective context window, retry compactions can under-prune history and keep failing near the limit.


@greptile-apps greptile-apps bot left a comment

7 files reviewed, 4 comments

Edit Code Review Agent Settings | Greptile

Comment on lines 549 to 556 of src/agents/pi-embedded-runner/run.ts:

```ts
overflowCompactionAttempts++;
// Progressive escalation: each retry gets more aggressive
// Attempt 1: 0.4, Attempt 2: 0.25, Attempt 3: 0.1
const escalationShare = Math.max(0.1, 0.4 - (overflowCompactionAttempts - 1) * 0.15);
log.warn(
  `context overflow persisted after in-attempt compaction (attempt ${overflowCompactionAttempts}/${MAX_OVERFLOW_COMPACTION_ATTEMPTS}, maxHistoryShare=${escalationShare}); retrying prompt without additional compaction for ${provider}/${modelId}`,
);
continue;
```

Escalation share computed but never applied

In this branch (where hadAttemptLevelCompaction is true), escalationShare is computed and logged but never passed to any compaction call. The continue retries the prompt, and the next iteration's SDK auto-compaction won't use the escalated maxHistoryShare because there's no mechanism to forward it. In contrast, the second branch (line 604) correctly passes maxHistoryShareOverride: escalationShare to compactEmbeddedPiSessionDirect.

This means the progressive escalation (0.4 → 0.25 → 0.1) only takes effect when the overflow triggers an explicit compaction (second branch), but not when it falls through to a retry with SDK auto-compaction (this branch). The log messages will misleadingly show escalating values that aren't being applied.

Comment on lines +119 to +123 of src/auto-reply/reply/agent-runner-memory.ts:

```ts
// Return a flag or special session entry that triggers compaction
return {
  ...activeSessionEntry,
  forceCompaction: true, // Custom flag we'll handle in agent-runner.ts
} as any;
```

forceCompaction flag bypasses type safety

Using as any to smuggle a forceCompaction boolean onto SessionEntry means this flag has no type checking at either the producer (here) or the consumer (agent-runner.ts:215). If the SessionEntry type changes or someone renames this flag, there's no compile-time safety net. Consider adding forceCompaction?: boolean to the SessionEntry type, or returning a discriminated union / wrapper type from runMemoryFlushIfNeeded instead of casting.
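A minimal sketch of the typed alternatives suggested here (the SessionEntry shape below is assumed for illustration, not the project's real type):

```typescript
// Option 1: declare the flag on the entry type, so no cast is needed
// at either the producer or the consumer.
type SessionEntry = {
  sessionId: string;
  compactionCount?: number;
  forceCompaction?: boolean;
};

// Option 2: a discriminated union returned by runMemoryFlushIfNeeded,
// so "needs compaction" is a checked variant rather than a smuggled field.
type FlushOutcome =
  | { kind: "ok"; entry: SessionEntry }
  | { kind: "force-compaction"; entry: SessionEntry };

function needsCompaction(outcome: FlushOutcome): boolean {
  return outcome.kind === "force-compaction";
}

console.log(needsCompaction({ kind: "force-compaction", entry: { sessionId: "s1" } })); // true
```

Either form gives the compiler a way to catch a renamed or mistyped flag.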

Comment on lines +13 to +14 of src/auto-reply/reply/agent-runner.ts:

```ts
import { DEFAULT_MODEL, DEFAULT_PROVIDER } from "../../agents/defaults.js";
import { resolveOpenClawAgentDir } from "../../agents/agent-paths.js";
```

Unused imports

DEFAULT_MODEL, DEFAULT_PROVIDER, and resolveOpenClawAgentDir are imported but never used in this file. These will likely cause lint errors.

Suggested change: delete both import lines.

Comment on lines +146 to +149 of src/auto-reply/reply/agent-runner-memory.ts:

```ts
fallbacksOverride: resolveAgentModelFallbacksOverride(
            params.followupRun.run.config,
            resolveAgentIdFromSessionKey(params.followupRun.run.sessionKey),
          ),
```

Indentation regression

The arguments to resolveAgentModelFallbacksOverride are now over-indented (12 spaces instead of the original 8). The pre-PR code had correct alignment:

Suggested change

```suggestion
        params.followupRun.run.config,
        resolveAgentIdFromSessionKey(params.followupRun.run.sessionKey),
      ),
```

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!



@greptile-apps greptile-apps bot left a comment

7 files reviewed, 2 comments

Edit Code Review Agent Settings | Greptile

Comment on lines +542 to +546 of src/agents/pi-embedded-runner/compact.ts:

```ts
if (typeof params.maxHistoryShareOverride === "number") {
  setCompactionSafeguardRuntime(sessionManager, {
    maxHistoryShare: params.maxHistoryShareOverride,
  });
}
```

Override clobbers contextWindowTokens runtime value

setCompactionSafeguardRuntime replaces the entire registry entry (it calls WeakMap.set, not a merge). buildEmbeddedExtensionPaths (line 533) already set { maxHistoryShare, contextWindowTokens } on this sessionManager. This call replaces it with { maxHistoryShare: params.maxHistoryShareOverride }, dropping contextWindowTokens.

In compaction-safeguard.ts:200, the consumer falls back to modelContextWindow when runtime?.contextWindowTokens is missing, so this won't crash — but any config-level context window override (compaction.contextWindowTokens) is silently lost during progressive escalation.

Fix: merge with the existing runtime value instead of replacing it:

Suggested change

```suggestion
      if (typeof params.maxHistoryShareOverride === "number") {
        const existing = getCompactionSafeguardRuntime(sessionManager);
        setCompactionSafeguardRuntime(sessionManager, {
          ...existing,
          maxHistoryShare: params.maxHistoryShareOverride,
        });
      }
```

This requires importing getCompactionSafeguardRuntime alongside setCompactionSafeguardRuntime.
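The replace-vs-merge distinction can be demonstrated with a plain WeakMap registry (the type and names below are illustrative, not the project's actual module):

```typescript
type SafeguardRuntime = { maxHistoryShare?: number; contextWindowTokens?: number };

const registry = new WeakMap<object, SafeguardRuntime>();
const sessionManager = {};

// Earlier setup (as in buildEmbeddedExtensionPaths) stores both fields.
registry.set(sessionManager, { maxHistoryShare: 0.5, contextWindowTokens: 100_000 });

// WeakMap.set replaces the whole value, so a merge must be done by hand;
// spreading the existing entry preserves contextWindowTokens.
registry.set(sessionManager, {
  ...(registry.get(sessionManager) ?? {}),
  maxHistoryShare: 0.25,
});

console.log(registry.get(sessionManager)); // { maxHistoryShare: 0.25, contextWindowTokens: 100000 }
```

Calling `set` with only `{ maxHistoryShare: 0.25 }` at the second step would silently drop the window override, which is exactly the bug described above.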


Comment on lines +34 to +46 of src/auto-reply/reply/mechanical-flush.ts:

```ts
try {
  logVerbose(
    `mechanical flush: trimming old turns for session ${params.sessionKey ?? "(unknown)"}`,
  );
  // The mechanical flush is intentionally a no-op placeholder for now.
  // The real value is that it marks memoryFlushAt so the system moves on
  // to compaction (which does the actual size reduction). A future
  // iteration can implement actual turn trimming here.
  return { success: true };
} catch (err) {
  const msg = err instanceof Error ? err.message : String(err);
  return { success: false, error: msg };
}
```

Misleading JSDoc for a no-op function

The JSDoc says "Perform a mechanical (non-LLM) flush by trimming old messages" and "we simply drop the oldest assistant/user turns", but the function body is a no-op that always returns { success: true }. The internal comment explains this, but callers like runMechanicalFlushWithTracking log "mechanical flush fallback completed" which implies real work was done. The catch block (lines 43-45) is also unreachable since logVerbose won't throw.

Consider updating the doc comment to match the actual behavior (placeholder/stub), or annotating the return type to indicate no messages were actually trimmed.
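One way to make the placeholder's contract explicit, per the suggestion above (the field names are illustrative):

```typescript
// Result type that admits the no-op: trimmedMessages stays 0 until
// real turn trimming is implemented.
type MechanicalFlushResult =
  | { success: true; trimmedMessages: number }
  | { success: false; error: string };

function mechanicalFlushStub(): MechanicalFlushResult {
  // Intentionally a no-op; the downstream forceCompaction path does the real work.
  return { success: true, trimmedMessages: 0 };
}

const res = mechanicalFlushStub();
console.log(res.success && res.trimmedMessages === 0); // true
```

Callers can then log "flushed 0 messages" honestly instead of implying real work was done.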


@greptile-apps greptile-apps bot left a comment

7 files reviewed, 2 comments

Edit Code Review Agent Settings | Greptile

Comment on lines +215 to +237 of src/auto-reply/reply/agent-runner.ts:

```ts
if ((activeSessionEntry as any)?.forceCompaction && sessionKey && storePath) {
  logVerbose(`memory flush: force compaction flagged, resetting session context`);
  const nextSessionId = crypto.randomUUID();
  const nextCompactionCount = (activeSessionEntry?.compactionCount ?? 0) + 1;
  delete (activeSessionEntry as any).forceCompaction;
  try {
    const updatedEntry = await updateSessionStoreEntry({
      storePath,
      sessionKey,
      update: async () => ({
        sessionId: nextSessionId,
        totalTokens: 0,
        totalTokensFresh: true,
        compactionCount: nextCompactionCount,
      }),
    });
    if (updatedEntry) {
      activeSessionEntry = updatedEntry;
    }
  } catch (err) {
    logVerbose(`force compaction session reset failed: ${String(err)}`);
  }
}
```

Force compaction doesn't update followupRun session references

This block creates a new sessionId and persists it to the store, but never updates followupRun.run.sessionId or followupRun.run.sessionFile. The subsequent runAgentTurnWithFallback call at line 332 passes followupRun, so the agent turn will use the old session ID and file — causing a mismatch between the store (new session) and the actual run (old session).

Compare with the existing resetSession helper (lines 295-296) which correctly updates both:

```ts
followupRun.run.sessionId = nextSessionId;
followupRun.run.sessionFile = nextSessionFile;
```

This block also doesn't resolve a proper sessionFile path (via resolveSessionTranscriptPath) or update activeSessionStore[sessionKey], which resetSession also does. Consider reusing resetSession or at minimum mirroring its updates to followupRun.run.
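A minimal sketch of mirroring resetSession's updates (the Run type and the transcript-path helper below are assumed shapes for illustration, not the project's real signatures):

```typescript
type Run = { sessionId: string; sessionFile: string };

// Stand-in for resolveSessionTranscriptPath; the real helper's signature is assumed.
function transcriptPathFor(sessionId: string): string {
  return `/sessions/${sessionId}.jsonl`;
}

function syncRunWithReset(run: Run, nextSessionId: string): void {
  // Without these assignments, the next agent turn targets the stale session.
  run.sessionId = nextSessionId;
  run.sessionFile = transcriptPathFor(nextSessionId);
}

const run: Run = { sessionId: "old", sessionFile: "/sessions/old.jsonl" };
syncRunWithReset(run, "new");
console.log(run.sessionFile); // /sessions/new.jsonl
```

Reusing the existing resetSession helper would get all of this (plus the activeSessionStore update) for free.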

Comment on lines +168 to +169 of src/agents/compaction.ts:

```ts
const chunkTokens = estimateMessagesTokens(chunk) * SAFETY_MARGIN;
if (chunkTokens > effectiveContextWindow * 0.9) {
```

Double safety margin applied to chunk token estimate

estimateMessagesTokens already applies SAFETY_MARGIN internally (line 18-19 strips details, then line 19 sums estimateTokens per message). Then this line multiplies by SAFETY_MARGIN again:

```ts
const chunkTokens = estimateMessagesTokens(chunk) * SAFETY_MARGIN;
```

This inflates the estimate by ~44% (1.2 × 1.2) rather than the intended 20%. Check whether the outer * SAFETY_MARGIN is intentional or whether estimateMessagesTokens already accounts for the buffer. If double-application is intentional (as a conservative safeguard), a comment would help clarify.
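A quick check of the compounded margin, assuming SAFETY_MARGIN is 1.2 as the ~44% figure in this comment implies:

```typescript
// Assumed value, inferred from the "1.2 × 1.2" figure in the comment above.
const SAFETY_MARGIN = 1.2;

// Applying the margin once gives the intended +20%; applying it in both
// estimateMessagesTokens and the caller compounds it.
const compounded = SAFETY_MARGIN ** 2;

console.log(compounded.toFixed(2)); // "1.44"
```

So a chunk genuinely near the 0.9 threshold would be rejected well before it needs to be, which may be acceptable as a conservative safeguard but deserves a comment.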


- P1: sync followupRun.sessionId/sessionFile after forceCompaction reset
  so the main turn uses the fresh session instead of old transcript
- P2: merge safeguard runtime (preserve contextWindowTokens) instead of
  replacing when setting maxHistoryShare override
- Fix escalation in hadAttemptLevelCompaction branch: run explicit
  compaction with escalated share instead of continue with no effect
- Add forceCompaction to SessionEntry type, remove 'as any' casts
- Remove unused imports (DEFAULT_MODEL, DEFAULT_PROVIDER, resolveOpenClawAgentDir)
- Fix indentation regression in agent-runner-memory.ts
@openclaw-barnacle

This pull request has been automatically marked as stale due to inactivity.
Please add updates or it will be closed.

@openclaw-barnacle openclaw-barnacle bot added stale Marked as stale due to inactivity and removed stale Marked as stale due to inactivity labels Feb 22, 2026