Skip to content

feat(compaction): disable thinking by default; add compaction.thinking override#33296

Closed
iamcobolt wants to merge 11 commits intoopenclaw:mainfrom
iamcobolt:main
Closed

feat(compaction): disable thinking by default; add compaction.thinking override#33296
iamcobolt wants to merge 11 commits intoopenclaw:mainfrom
iamcobolt:main

Conversation

@iamcobolt
Copy link
Copy Markdown

@iamcobolt iamcobolt commented Mar 3, 2026

AI-assisted — drafted and iterated with Claude Code. Lightly tested: unit tests pass, pnpm tsgo clean, pnpm check passes; retry path validated by code review but not exercised against a live timeout. I understand what this code does.

Summary

Part of the compaction override work proposed in #33005.

  • Problem: compaction runs on the session's active thinking level, so sessions using slow/high-think models (e.g. Opus + thinking:high) can take 4+ minutes to compact — exceeding channel timeout windows (Discord 30s, Telegram 240s) and causing a stale-response retry loop.
  • Why it matters: affects any user running a thinking-enabled model on a chat surface with a strict reply timeout; the loop is deterministic and requires manual intervention to break (Compaction timeout races against channel timeout, causing stale-response loop #25272).
  • What changed: compaction now runs with thinking=off by default regardless of session model. New agents.defaults.compaction.thinking config key — "off" (default) or "on" (inherit the session model's current thinking level). When a compaction run times out with thinking enabled, the runner retries once without thinking automatically. The before_compaction hook is guarded to fire exactly once per compaction event.
  • What did NOT change: primary session model and thinking level are unchanged; compaction prompt content, threshold logic, and retry limits in run.ts are unchanged; no new network calls or auth paths.

Schema

agents.defaults.compaction.thinking: "off" (default) | "on"

  • "off": compaction always runs without extended thinking regardless of the session model — preventing timeout races on channels with strict reply windows.
  • "on": compaction inherits the session model's current thinking level. On timeout, automatically retries once without thinking.

Renders as a two-button segmented control in the config UI.

UI — Before / After

Before — 7-option dropdown (off · minimal · low · medium · high · xhigh · adaptive):

before

After — clean 2-button segmented control:

after

Companion PR

#33396 — adds compaction.fallbackModel to retry compaction with a cheaper/different model on quota or rate-limit errors. Can be merged in either order.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

session model's thinking level. On channels with strict reply windows
(Discord 30s, Telegram 240s), extended thinking during compaction can
race against the channel timeout and trigger a stale-response loop
(see openclaw#25272). Disabling thinking for the summarization turn eliminates
this race without affecting the primary conversation model.

New config key `agents.defaults.compaction.thinking` allows opting in:

  {
    "agents": {
      "defaults": {
        "compaction": {
          "thinking": "low"
        }
      }
    }
  }

When thinking is enabled and a compaction run times out, the runner
retries once without thinking automatically. The before_compaction hook
is guarded to fire exactly once per compaction event (not once per
attempt), keeping plugin integrations consistent.
@openclaw-barnacle openclaw-barnacle Bot added docs Improvements or additions to documentation gateway Gateway runtime agents Agent runtime and tooling size: M labels Mar 3, 2026
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 97643aa029

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

allowedToolNames,
config: params.config,
// Thinking retry loop: on timeout with thinking enabled, retry once without thinking.
for (let thinkAttempt = 0; thinkAttempt <= 1; thinkAttempt++) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Increase lock hold budget for two-attempt compaction

The new thinking retry loop keeps a single session lock across up to two compaction attempts, but the lock budget is still derived from one timeout window (resolveSessionLockMaxHoldFromTimeout in src/agents/session-write-lock.ts:111-123), which is about 420s for a 300s compaction timeout. In the timeout path this can run ~300s on attempt 1 plus a long fallback attempt, and once total hold exceeds the watchdog limit (src/agents/session-write-lock.ts:193-207) the lock is force-released while compaction is still running, allowing concurrent writers on the same session file.

Useful? React with 👍 / 👎.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Mar 3, 2026

Greptile Summary

This PR introduces agents.defaults.compaction.thinking ("off" | "on") to decouple compaction's thinking level from the session model. Compaction now runs with thinking=off by default, preventing timeout-induced retry loops on channels with strict reply windows (Discord 30s, Telegram 240s). When thinking: "on" is configured and a compaction run times out, the runner automatically retries once with thinking=off.

Key changes:

  • New resolveCompactionThinkLevel helper in compaction-overrides.ts — clean, well-tested (5 branches covered), and correctly falls back to "off" when no session thinking level is available.
  • compactEmbeddedPiSessionDirect wraps the session creation/compact/dispose block in a for (thinkAttempt = 0; thinkAttempt <= 1; thinkAttempt++) retry loop; the timeout-retry gate at the catch block correctly checks both thinkAttempt === 0 and effectiveThinkLevel !== "off" before retrying.
  • before_compaction hook is guarded with thinkAttempt === 0 — fires exactly once per compaction event regardless of retries. after_compaction correctly fires only on success (it is unreachable from the catch/throw path), so both hooks fire at most once. ✓
  • isTimeoutError (from failover-error.ts) correctly matches the "Compaction timed out" message emitted by withTimeout, so the retry gate will fire as intended.
  • Sessions are properly disposed in finally on every loop iteration, preventing leaks.
  • Zod schema, TypeScript type, config labels, and docs reference are all consistently updated.
  • Diagnostic logs inside the loop now include thinking= and thinkAttempt= fields, addressing prior feedback.

Minor pre-existing observations (already flagged in previous threads — not repeated here):

  • Unrelated changelog entries bundled in CHANGELOG.md.
  • Stale forceSegmented mention in the config-form.node.ts comment.

Confidence Score: 4/5

  • This PR is safe to merge; logic is sound and all session/hook lifecycle paths are correctly handled.
  • The thinking-retry loop is correctly structured, isTimeoutError matches the right error, hooks fire exactly once, and sessions are disposed on every path. The only gap is that the live retry path (timeout → retry without thinking) has not been exercised against a real model timeout, as the author acknowledges. Unit tests cover resolveCompactionThinkLevel exhaustively. No regressions introduced to the default (thinking=off) compaction path.
  • No files require special attention; src/agents/pi-embedded-runner/compact.ts is the most complex change but has been reviewed thoroughly.

Last reviewed commit: 8fa5176

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: baef98faec

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +675 to 677
const result = await compactWithSafetyTimeout(() =>
session.compact(params.customInstructions),
);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Cancel timed-out compaction before starting retry

When a thinking-enabled compaction exceeds the timeout, compactWithSafetyTimeout rejects but does not actually stop session.compact (it uses a timeout race in src/node-host/with-timeout.ts and this callsite does not pass/use the abort signal), and the new retry loop immediately starts a second compaction attempt. In that timeout path, two compactions can overlap on the same session state/file, which can cause duplicate provider work and interleaved session mutations; the retry should only begin after the first attempt is truly cancelled or fully settled.

Useful? React with 👍 / 👎.

@iamcobolt
Copy link
Copy Markdown
Author

Tracked in discussion #33430 — split out from #33005, which originally bundled this with a compaction.model key. That approach was set aside; this PR delivers the thinking-level piece as a focused, independent change.

@iamcobolt
Copy link
Copy Markdown
Author

The secrets check failure is pre-existing and unrelated to this PR — it also fails on PR #33396 and appears to affect all open PRs.

Root cause: GHSA-8mvx-p2r9-r375extensions/googlechat lists openclaw as a dependency pinned to <=2026.3.1; the patched version is >=2026.3.2. This PR does not touch extensions/googlechat, package.json, or any dependency. The detect-secrets and detect-private-key steps both passed.

iamcobolt added a commit to iamcobolt/openclaw that referenced this pull request Mar 3, 2026
@openclaw-barnacle openclaw-barnacle Bot added the app: web-ui App: web-ui label Mar 4, 2026
@iamcobolt iamcobolt marked this pull request as draft March 4, 2026 00:15
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 657f3226bf

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/shared/config-ui-hints-types.ts Outdated
placeholder?: string;
itemTemplate?: unknown;
/** Force segmented button rendering even when the option count exceeds the default threshold. */
forceSegmented?: boolean;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Add new uiHint key to protocol schema

forceSegmented is now emitted in config UI hints, but the gateway protocol contract still defines ConfigUiHintSchema with additionalProperties: false and no forceSegmented field (src/gateway/protocol/schema/config.ts:40-53). Any client that validates config.schema responses against that schema will reject responses containing this new hint key, so the protocol schema needs to be updated alongside this type addition.

Useful? React with 👍 / 👎.

@iamcobolt iamcobolt marked this pull request as ready for review March 4, 2026 01:39
}

// Enum - use segmented for small, dropdown for large
// Enum - use segmented for small (or forceSegmented), dropdown for large
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR description states that forceSegmented hint machinery was removed (no longer needed with only 2 options). This comment should be updated to reflect the current state and avoid confusing future contributors.

Suggested change
// Enum - use segmented for small (or forceSegmented), dropdown for large
// Enum - use segmented for small, dropdown for large
Prompt To Fix With AI
This is a comment left during a code review.
Path: ui/src/ui/views/config-form.node.ts
Line: 445

Comment:
The PR description states that `forceSegmented` hint machinery was removed (no longer needed with only 2 options). This comment should be updated to reflect the current state and avoid confusing future contributors.

```suggestion
  // Enum - use segmented for small, dropdown for large
```

How can I resolve this? If you propose a fix, please make it concise.

@iamcobolt
Copy link
Copy Markdown
Author

Latest push — docs wording update only (no logic changes):

  • CHANGELOG.md: updated thinking bullet to say Set agents.defaults.compaction.thinking: "on" rather than the old opt-in example with "low", matching the simplified "off" | "on" schema.
  • docs/gateway/configuration-reference.md: updated the inline code comment (off | on instead of the old ThinkLevel enum) and the prose description to use "on" wording consistently.

@iamcobolt
Copy link
Copy Markdown
Author

@greptile-apps review

Comment thread CHANGELOG.md
Comment on lines +9 to +12
- Agents/Compaction: compaction now runs with `thinking: off` by default, regardless of the session model's thinking level — preventing timeout races on channels with strict reply windows (Discord 30s, Telegram 240s). Set `agents.defaults.compaction.thinking: "on"` to inherit the session model's current thinking level. When a compaction run times out with thinking enabled, it retries once without thinking automatically.
- CLI/Config validation: add `openclaw config validate` (with `--json`) to validate config files before gateway startup, and include detailed invalid-key paths in startup invalid-config errors. (#31220) thanks @Sid-Qin.
- Sessions/Attachments: add inline file attachment support for `sessions_spawn` (subagent runtime only) with base64/utf8 encoding, transcript content redaction, lifecycle cleanup, and configurable limits via `tools.sessions_spawn.attachments`. (#16761) Thanks @napetrov.
- Agents/Thinking defaults: set `adaptive` as the default thinking level for Anthropic Claude 4.6 models (including Bedrock Claude 4.6 refs) while keeping other reasoning-capable models at `low` unless explicitly configured.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated changelog entries included

This PR adds four new entries to CHANGELOG.md, but only the first one (Agents/Compaction: compaction now runs with thinking: off by default…) is attributable to this PR's changes. The other three entries reference separate issues and contributors entirely unrelated to compaction thinking:

  • CLI/Config validation: add openclaw config validate … (#31220) thanks @Sid-Qin — unrelated
  • Sessions/Attachments: add inline file attachment support … (#16761) Thanks @napetrov — unrelated
  • Agents/Thinking defaults: set adaptive as the default thinking level… — unrelated

These entries appear to have been accidentally picked up from another branch or staging area. If left in, they'll be misattributed to this PR in the release history and could conflict with the changelog entries in their respective PRs when those merge.

Prompt To Fix With AI
This is a comment left during a code review.
Path: CHANGELOG.md
Line: 9-12

Comment:
**Unrelated changelog entries included**

This PR adds four new entries to `CHANGELOG.md`, but only the first one (`Agents/Compaction: compaction now runs with thinking: off by default…`) is attributable to this PR's changes. The other three entries reference separate issues and contributors entirely unrelated to compaction thinking:

- `CLI/Config validation: add openclaw config validate … (#31220) thanks @Sid-Qin` — unrelated
- `Sessions/Attachments: add inline file attachment support … (#16761) Thanks @napetrov` — unrelated
- `Agents/Thinking defaults: set adaptive as the default thinking level…` — unrelated

These entries appear to have been accidentally picked up from another branch or staging area. If left in, they'll be misattributed to this PR in the release history and could conflict with the changelog entries in their respective PRs when those merge.

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +664 to +671
if (diagEnabled && preMetrics) {
log.debug(
`[compaction-diag] start runId=${runId} sessionKey=${params.sessionKey ?? params.sessionId} ` +
`diagId=${diagId} trigger=${trigger} provider=${provider}/${modelId} ` +
`attempt=${attempt} maxAttempts=${maxAttempts} ` +
`pre.messages=${preMetrics.messages} pre.historyTextChars=${preMetrics.historyTextChars} ` +
`pre.toolResultChars=${preMetrics.toolResultChars} pre.estTokens=${preMetrics.estTokens ?? "unknown"}`,
);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing thinkAttempt/effectiveThinkLevel in diagnostic log

The [compaction-diag] start log line logs attempt, maxAttempts, provider, model, etc., but does not include the current effectiveThinkLevel or thinkAttempt. When debugging a timeout-and-retry sequence in production logs, it will be difficult to correlate which thinking level was active for each log line.

Consider adding thinkLevel=${effectiveThinkLevel} thinkAttempt=${thinkAttempt} to this log statement (and the end log below) to make post-mortem analysis easier.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/agents/pi-embedded-runner/compact.ts
Line: 664-671

Comment:
**Missing `thinkAttempt`/`effectiveThinkLevel` in diagnostic log**

The `[compaction-diag] start` log line logs `attempt`, `maxAttempts`, provider, model, etc., but does not include the current `effectiveThinkLevel` or `thinkAttempt`. When debugging a timeout-and-retry sequence in production logs, it will be difficult to correlate which thinking level was active for each log line.

Consider adding `thinkLevel=${effectiveThinkLevel} thinkAttempt=${thinkAttempt}` to this log statement (and the `end` log below) to make post-mortem analysis easier.

How can I resolve this? If you propose a fix, please make it concise.

@iamcobolt
Copy link
Copy Markdown
Author

Addressed both Greptile points — pushed in commit 8fa5176:

[compaction-diag] log fieldsthinking=${effectiveThinkLevel} thinkAttempt=${thinkAttempt} added to both the start and end log lines. These will now appear in production debug logs, making it straightforward to trace which thinking level was active and which attempt a timeout-retry occurred on.

CHANGELOG entries — the three extra entries (CLI config validate, Sessions/Attachments, Agents/Thinking defaults) are a fork-accumulation artifact: other work landed on iamcobolt/openclaw:main alongside this PR's commits. They belong to separate PRs and won't be misattributed once the upstream merge resolves the diff — the CHANGELOG section for this PR's changes is accurate.

@greptile-apps review

@iamcobolt iamcobolt marked this pull request as draft March 4, 2026 09:42
iamcobolt added a commit to iamcobolt/openclaw that referenced this pull request Mar 4, 2026
…-limit errors

Adds agents.defaults.compaction.fallbackModel — an opt-in key that retries
compaction with a different model when the primary fails due to quota or
rate-limit errors (HTTP 429 / 402 billing). Auth errors and timeouts use
their own retry mechanisms and are unaffected.

"off" (default): no change in behaviour
"fallback": tries each model in agents.defaults.model.fallbacks in order,
            skipping the current model; re-throws after exhausting candidates

Also ships the companion thinking-override change (compaction-overrides.ts)
needed for compact.ts's inner thinking-retry loop. See companion PR openclaw#33296.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@iamcobolt
Copy link
Copy Markdown
Author

Superseded by #34554 — reissued as a clean single commit on a proper feature branch (feat/compaction-thinking). Old PR had accumulated noisy CHANGELOG entries from the fork and an awkward head branch name (iamcobolt:main). New PR has a clean description, correct before/after screenshots (upstream main baseline), and the AI-disclosure statement.

@iamcobolt iamcobolt closed this Mar 4, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Agent runtime and tooling app: web-ui App: web-ui docs Improvements or additions to documentation gateway Gateway runtime size: M

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Compaction timeout races against channel timeout, causing stale-response loop

1 participant