Skip to content

feat(compaction): add fallbackModel to retry compaction on quota/rate-limit errors#33396

Closed
iamcobolt wants to merge 16 commits intoopenclaw:mainfrom
iamcobolt:feat/compaction-fallback-model
Closed

feat(compaction): add fallbackModel to retry compaction on quota/rate-limit errors#33396
iamcobolt wants to merge 16 commits intoopenclaw:mainfrom
iamcobolt:feat/compaction-fallback-model

Conversation

@iamcobolt
Copy link
Copy Markdown

@iamcobolt iamcobolt commented Mar 3, 2026

AI-assisted — drafted and iterated with Claude Code. Lightly tested: unit tests pass (pnpm tsgo clean, pnpm check passes, 8 unit tests green); fallback retry path validated by code review and Greptile; not exercised against live quota exhaustion. I understand what this code does.

Summary

Companion to #33296, implementing the second half of the compaction override work proposed in #33005.

Adds agents.defaults.compaction.fallbackModel — an opt-in knob that retries compaction with a different model when the primary fails due to quota or rate-limit errors (HTTP 429 / 402 billing). Auth errors and timeouts use their own retry mechanisms and are unaffected.

Two states:

  • "off" (default) — no change in behaviour
  • "fallback" — tries each model in the agents.defaults.model.fallbacks chain in order, skipping the current model

New file compaction-fallback.ts + 8 unit tests. Can be merged before or after #33296.

Schema

agents.defaults.compaction.fallbackModel: "off" (default) | "fallback"

Renders as a two-button segmented control in the config UI, with a help tooltip explaining the retry semantics.

UI — Before / After

Before — 3-button composite with conditional free-text input (off · fallback · custom + text field when custom is selected):

before

After — clean 2-button segmented control with help tooltip:

after

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

Key Changes

  • compaction-fallback.ts (new) — resolveCompactionFallbackCandidates() resolves the ordered candidate list from config; filters out the current model to avoid a no-op retry
  • compaction-fallback.test.ts (new) — 8 unit tests covering all resolveCompactionFallbackCandidates states
  • compact.ts — outer model-fallback loop wrapping the thinking-retry inner loop; transcriptPolicy and sanitizeSessionHistory re-resolved per candidate (provider-specific flags correct for cross-provider fallbacks); before_compaction hook guarded by hookFired (fires exactly once per compaction event); rebuilds extension factories + resource loader per candidate
  • compaction-overrides.tsresolveCompactionThinkLevel updated to accept sessionThinkLevel param; "on" inherits the session model's level at call time
  • types.agent-defaults.tsfallbackModel?: "off" | "fallback"; thinking?: "off" | "on"
  • zod-schema.agent-defaults.ts — both fields narrowed to their 2-option union with .describe()
  • schema.help.ts — help tooltip for fallbackModel
  • schema.labels.ts — UI labels for thinking and fallbackModel
  • docs/gateway/configuration-reference.md + CHANGELOG.md — updated

Notes for Reviewers

  • The outer loop breaks on rate_limit/billing errors with remaining candidates and falls through to the next; after exhausting all candidates the last quota error is re-thrown.
  • before_compaction hook: guarded by hookFired so it fires exactly once per compaction event regardless of model or thinking retry depth.
  • transcriptPolicy and sanitizeSessionHistory re-resolved per candidate inside the outer loop — provider-specific flags (validateGeminiTurns, validateAnthropicTurns, etc.) are correct for cross-provider fallbacks. (Addresses Greptile's transcriptPolicy review comment; resolved in follow-up commit fcc5f17.)

Security Impact

  • New permissions/capabilities? No
  • Secrets/tokens handling changed? No — uses existing auth resolution per candidate
  • New/changed network calls? Only when a fallback is triggered (same API surface as primary)
  • Command/tool execution surface changed? No
  • Data access scope changed? No

Repro + Verification

Steps

  1. Set agents.defaults.compaction.fallbackModel: "fallback" with a fallbacks chain in config
  2. Exhaust primary model quota or simulate a 429 response
  3. Trigger compaction

Expected

Compaction retries on the first available fallback model and completes successfully.

Evidence

  • 8 unit tests covering all resolveCompactionFallbackCandidates cases
  • Live quota-exhaustion trace

Human Verification

  • Verified: unit tests pass, pnpm tsgo clean, pnpm check passes
  • Edge cases: absent config, "off", "fallback" with empty fallbacks list, "fallback" filtering current model, hookFired guard across model and thinking retry loops

Compatibility / Migration

  • Backward compatible: fallbackModel is optional; default is "off" (no change in behaviour)
  • To disable: remove or set agents.defaults.compaction.fallbackModel: "off"

session model's thinking level. On channels with strict reply windows
(Discord 30s, Telegram 240s), extended thinking during compaction can
race against the channel timeout and trigger a stale-response loop
(see openclaw#25272). Disabling thinking for the summarization turn eliminates
this race without affecting the primary conversation model.

New config key `agents.defaults.compaction.thinking` allows opting in:

  {
    "agents": {
      "defaults": {
        "compaction": {
          "thinking": "low"
        }
      }
    }
  }

When thinking is enabled and a compaction run times out, the runner
retries once without thinking automatically. The before_compaction hook
is guarded to fire exactly once per compaction event (not once per
attempt), keeping plugin integrations consistent.
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Mar 3, 2026

Greptile Summary

This PR adds agents.defaults.compaction.fallbackModel — an opt-in knob that retries compaction with a different model when the primary fails due to quota or rate-limit errors (HTTP 429/402). It also introduces compaction.thinking to control whether extended thinking is used during compaction (defaulting to "off" to avoid timeout races on strict-window channels like Discord/Telegram).

Key changes:

  • New compaction-fallback.ts with resolveCompactionFallbackCandidates() that resolves the ordered fallback candidate list from agents.defaults.model.fallbacks, filtering out the current model to avoid no-op retries
  • New compaction-overrides.ts with resolveCompactionThinkLevel() that returns "off" by default (or the session think level when thinking: "on" is configured)
  • compact.ts wraps the existing thinking-retry inner loop with an outer model-fallback loop; per-candidate re-resolution of transcriptPolicy, sanitizeSessionHistory, extension factories, and resource loaders is correct for cross-provider fallbacks
  • before_compaction hook is guarded by a hookFired flag so it fires exactly once per compaction event regardless of model or thinking retry depth
  • effectiveThinkLevel is correctly re-initialized at the start of each outer loop iteration (per candidate), not shared across fallbacks
  • Session disposal in the finally block is correct: session is declared before the inner try block so it is always in scope for cleanup
  • Type narrowed to "off" | "fallback" only at schema level; dead explicit-string branch and its tests have been removed
  • 11 unit tests across two new test files (compaction-fallback.test.ts — 6 tests, compaction-overrides.test.ts — 5 tests)
  • All previous review comments (stale model identity in sanitizeSessionHistory, stale model in diagnostic logs, dead explicit-string code path, missing .describe() on fallbackModel) appear addressed in follow-up commits

Confidence Score: 4/5

  • Safe to merge with no critical issues; implementation is well-structured with correct resource cleanup and targeted retry semantics.
  • The outer/inner retry loop structure is logically correct, session disposal is handled in the right place, hookFired prevents double-firing, and cross-provider sanitization is re-resolved per candidate. All previous review issues appear addressed. The one point preventing a perfect 5 is that this code path (live quota exhaustion triggering a fallback) has not been exercised in production, as noted by the author, and carries inherent complexity in the nested loop control flow.
  • No files require special attention; compact.ts has the most complex control flow but it is well-commented and the key correctness invariants hold.

Last reviewed commit: 8f68d36

Copy link
Copy Markdown
Contributor

@greptile-apps greptile-apps Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

9 files reviewed, 3 comments

Edit Code Review Agent Settings | Greptile

Comment on lines +663 to 673
const prior = await sanitizeSessionHistory({
messages: session.messages,
modelApi: model.api,
modelId,
provider,
allowedToolNames,
config: params.config,
sessionManager,
sessionId: params.sessionId,
policy: transcriptPolicy,
});
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sanitizeSessionHistory uses stale original model identity on fallback

model.api, modelId, and provider are closed over from the outer scope (lines 265-266, 282) and always refer to the original (primary) model. When a fallback candidate from a different provider is being used (e.g., falling back from anthropic/claude-sonnet-4-6 to openai/gpt-4o-mini), sanitizeSessionHistory still receives the original provider's API type and identifiers. Because this function applies provider-specific message sanitization logic (e.g., filtering tool-result blocks that are illegal for a given model API), using the wrong modelApi/provider/modelId can produce incorrectly sanitized messages that the fallback model then rejects or silently mishandles.

The current candidate's values should be passed instead:

Suggested change
const prior = await sanitizeSessionHistory({
messages: session.messages,
modelApi: model.api,
modelId,
provider,
allowedToolNames,
config: params.config,
sessionManager,
sessionId: params.sessionId,
policy: transcriptPolicy,
});
const prior = await sanitizeSessionHistory({
messages: session.messages,
modelApi: currentModel.api,
modelId: cand.model,
provider: cand.provider,
allowedToolNames,
config: params.config,
sessionManager,
sessionId: params.sessionId,
policy: transcriptPolicy,
});
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/agents/pi-embedded-runner/compact.ts
Line: 663-673

Comment:
`sanitizeSessionHistory` uses stale original model identity on fallback

`model.api`, `modelId`, and `provider` are closed over from the outer scope (lines 265-266, 282) and always refer to the *original* (primary) model. When a fallback candidate from a different provider is being used (e.g., falling back from `anthropic/claude-sonnet-4-6` to `openai/gpt-4o-mini`), `sanitizeSessionHistory` still receives the original provider's API type and identifiers. Because this function applies provider-specific message sanitization logic (e.g., filtering tool-result blocks that are illegal for a given model API), using the wrong `modelApi`/`provider`/`modelId` can produce incorrectly sanitized messages that the fallback model then rejects or silently mishandles.

The current candidate's values should be passed instead:

```suggestion
            const prior = await sanitizeSessionHistory({
              messages: session.messages,
              modelApi: currentModel.api,
              modelId: cand.model,
              provider: cand.provider,
              allowedToolNames,
              config: params.config,
              sessionManager,
              sessionId: params.sessionId,
              policy: transcriptPolicy,
            });
```

How can I resolve this? If you propose a fix, please make it concise.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in commit fcc5f17sanitizeSessionHistory now uses currentModel.api, cand.model, and cand.provider so sanitization uses the active fallback's provider rules.

Comment on lines +729 to +735
log.debug(
`[compaction-diag] start runId=${runId} sessionKey=${params.sessionKey ?? params.sessionId} ` +
`diagId=${diagId} trigger=${trigger} provider=${provider}/${modelId} ` +
`attempt=${attempt} maxAttempts=${maxAttempts} ` +
`pre.messages=${preMetrics.messages} pre.historyTextChars=${preMetrics.historyTextChars} ` +
`pre.toolResultChars=${preMetrics.toolResultChars} pre.estTokens=${preMetrics.estTokens ?? "unknown"}`,
);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Diagnostic log reports original model instead of active fallback candidate

The provider and modelId logged on line 731 are the original session values (lines 265–266) and remain unchanged when a fallback model is active. When a fallback fires (line 576–840), the log line will incorrectly attribute the compaction run to the primary model, making it difficult to correlate actual execution with logged diagnostics.

Replace provider with cand.provider and modelId with cand.model in the diagnostic log template on line 731 to reflect the model actually being used.

Note: The same issue appears on line 785 in the end diagnostic log—update that as well.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/agents/pi-embedded-runner/compact.ts
Line: 729-735

Comment:
Diagnostic log reports original model instead of active fallback candidate

The `provider` and `modelId` logged on line 731 are the original session values (lines 265–266) and remain unchanged when a fallback model is active. When a fallback fires (line 576–840), the log line will incorrectly attribute the compaction run to the primary model, making it difficult to correlate actual execution with logged diagnostics.

Replace `provider` with `cand.provider` and `modelId` with `cand.model` in the diagnostic log template on line 731 to reflect the model actually being used.

Note: The same issue appears on line 785 in the end diagnostic log—update that as well.

How can I resolve this? If you propose a fix, please make it concise.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in commit fcc5f17 — both [compaction-diag] start and end log lines inside the outer loop now use cand.provider/cand.model. The fail() log before the loop intentionally keeps provider/modelId since no candidate is active at that point.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Mar 3, 2026

Additional Comments (1)

src/agents/pi-embedded-runner/compact.ts
transcriptPolicy not re-resolved for fallback model candidates

transcriptPolicy is computed once from the original model.api, provider, and modelId (lines 547–551), then reused across all iterations of the outer model-fallback loop (line 576–840). The policy controls validateGeminiTurns, validateAnthropicTurns, repairToolUseResultPairing, and other provider-specific behaviour. When a fallback candidate belongs to a different provider (e.g., the primary is anthropic and the fallback is openai), the provider-specific turn validation will never run with the correct rules — because validateGeminiTurns or validateAnthropicTurns were computed for the wrong provider. Validation mismatches can silently pass malformed message sequences to the fallback model.

transcriptPolicy should be re-derived inside the outer loop using currentModel.api, cand.provider, and cand.model — at minimum for candidateIdx > 0:

      for (let candidateIdx = 0; candidateIdx < allCandidates.length; candidateIdx++) {
        // ... [model switching code] ...
        
        // Re-resolve transcript policy per candidate
        const transcriptPolicy = resolveTranscriptPolicy({
          modelApi: currentModel.api,
          provider: cand.provider,
          modelId: cand.model,
        });
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/agents/pi-embedded-runner/compact.ts
Line: 547-551

Comment:
`transcriptPolicy` not re-resolved for fallback model candidates

`transcriptPolicy` is computed once from the original `model.api`, `provider`, and `modelId` (lines 547–551), then reused across all iterations of the outer model-fallback loop (line 576–840). The policy controls `validateGeminiTurns`, `validateAnthropicTurns`, `repairToolUseResultPairing`, and other provider-specific behaviour. When a fallback candidate belongs to a different provider (e.g., the primary is `anthropic` and the fallback is `openai`), the provider-specific turn validation will never run with the correct rules — because `validateGeminiTurns` or `validateAnthropicTurns` were computed for the wrong provider. Validation mismatches can silently pass malformed message sequences to the fallback model.

`transcriptPolicy` should be re-derived inside the outer loop using `currentModel.api`, `cand.provider`, and `cand.model` — at minimum for `candidateIdx > 0`:

```suggestion
      for (let candidateIdx = 0; candidateIdx < allCandidates.length; candidateIdx++) {
        // ... [model switching code] ...
        
        // Re-resolve transcript policy per candidate
        const transcriptPolicy = resolveTranscriptPolicy({
          modelApi: currentModel.api,
          provider: cand.provider,
          modelId: cand.model,
        });
```

How can I resolve this? If you propose a fix, please make it concise.

@iamcobolt
Copy link
Copy Markdown
Author

Thanks for the thorough review — all three issues were valid.

Fixed in the follow-up commit (fcc5f17):

  1. transcriptPolicy re-resolution — now derived inside the outer loop from currentModel.api, cand.provider, and cand.model, shadowing the outer instance that remains in place for sessionManager setup (which happens before the loop). Provider-specific flags (validateGeminiTurns, validateAnthropicTurns, etc.) are now correct for each candidate.

  2. sanitizeSessionHistorymodelApi, modelId, and provider now use currentModel.api, cand.model, and cand.provider respectively, so message sanitization uses the active fallback's provider rules.

  3. Diagnostic logs — both [compaction-diag] start and [compaction-diag] end lines inside the loop now report cand.provider/cand.model. The fail() function log at line 270 (before the loop, for pre-loop errors) intentionally keeps provider/modelId since cand is not yet defined at that point.

@openclaw-barnacle openclaw-barnacle Bot added docs Improvements or additions to documentation gateway Gateway runtime agents Agent runtime and tooling size: L labels Mar 3, 2026
@iamcobolt
Copy link
Copy Markdown
Author

Tracked in discussion #33431 — split out from #33005, which originally proposed a general compaction.model override. That approach was set aside; this PR delivers targeted quota/rate-limit resilience as a focused, independent change.

@iamcobolt
Copy link
Copy Markdown
Author

@greptile-apps review

Comment on lines +56 to +62
// Explicit "provider/model" string
const resolved = resolveModelRefFromString({
raw: fallbackModel,
defaultProvider,
aliasIndex,
});
return resolved && !isCurrent(resolved.ref) ? [resolved.ref] : [];
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Silent failure when explicit fallbackModel string can't be resolved

If a user sets fallbackModel to a string that resolveModelRefFromString can't parse (e.g. a bare model name like "claude-haiku-4-5" that isn't in the alias index, or a typo like "anthropic/claude-haiku4-5"), resolved is null and the function silently returns []. The user gets no fallback and no diagnostic that their configuration is being ignored — they'll only notice when they hit a real quota event and nothing retries.

Consider emitting a log.warn here, or alternatively inside compact.ts after building allCandidates, when fallbackModel is configured but the candidates list ends up empty:

// Explicit "provider/model" string
const resolved = resolveModelRefFromString({
  raw: fallbackModel,
  defaultProvider,
  aliasIndex,
});
if (!resolved) {
  // Optionally: consumers should warn, but a note here aids debugging
  return [];
}
return !isCurrent(resolved.ref) ? [resolved.ref] : [];

At minimum, the test suite would benefit from an additional case — e.g. fallbackModel: "not-a-valid-model-string" — to document and pin this silent-no-op behaviour.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/agents/pi-embedded-runner/compaction-fallback.ts
Line: 56-62

Comment:
**Silent failure when explicit `fallbackModel` string can't be resolved**

If a user sets `fallbackModel` to a string that `resolveModelRefFromString` can't parse (e.g. a bare model name like `"claude-haiku-4-5"` that isn't in the alias index, or a typo like `"anthropic/claude-haiku4-5"`), `resolved` is `null` and the function silently returns `[]`. The user gets no fallback and no diagnostic that their configuration is being ignored — they'll only notice when they hit a real quota event and nothing retries.

Consider emitting a `log.warn` here, or alternatively inside `compact.ts` after building `allCandidates`, when `fallbackModel` is configured but the candidates list ends up empty:

```ts
// Explicit "provider/model" string
const resolved = resolveModelRefFromString({
  raw: fallbackModel,
  defaultProvider,
  aliasIndex,
});
if (!resolved) {
  // Optionally: consumers should warn, but a note here aids debugging
  return [];
}
return !isCurrent(resolved.ref) ? [resolved.ref] : [];
```

At minimum, the test suite would benefit from an additional case — e.g. `fallbackModel: "not-a-valid-model-string"` — to document and pin this silent-no-op behaviour.

How can I resolve this? If you propose a fix, please make it concise.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in commit fce6f25 — added an optional warn callback to resolveCompactionFallbackCandidates. When an explicit string can't be resolved, it calls warn() with a message explaining the expected format. The call site in compact.ts passes log.warn so the misconfiguration is visible in the gateway log at startup.

Comment thread src/config/zod-schema.agent-defaults.ts Outdated
.union([z.literal("strict"), z.literal("off"), z.literal("custom")])
.optional(),
identifierInstructions: z.string().optional(),
fallbackModel: z.string().optional(),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing .describe() on fallbackModel schema field

The sibling thinking field (lines 87–100) includes a .describe(...) string that documents its semantics and allowed values. fallbackModel is validated as a plain z.string().optional() with no description, which means generated JSON Schema / config-validation output won't explain the three accepted values ("off", "fallback", or "provider/model"). For consistency and discoverability, add a .describe():

Suggested change
fallbackModel: z.string().optional(),
fallbackModel: z.string().optional().describe(
'"off" (default) disables fallback. "fallback" uses the agents.defaults.model.fallbacks chain. An explicit "provider/model" string targets a specific model.',
),
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/config/zod-schema.agent-defaults.ts
Line: 109

Comment:
**Missing `.describe()` on `fallbackModel` schema field**

The sibling `thinking` field (lines 87–100) includes a `.describe(...)` string that documents its semantics and allowed values. `fallbackModel` is validated as a plain `z.string().optional()` with no description, which means generated JSON Schema / config-validation output won't explain the three accepted values (`"off"`, `"fallback"`, or `"provider/model"`). For consistency and discoverability, add a `.describe()`:

```suggestion
        fallbackModel: z.string().optional().describe(
          '"off" (default) disables fallback. "fallback" uses the agents.defaults.model.fallbacks chain. An explicit "provider/model" string targets a specific model.',
        ),
```

How can I resolve this? If you propose a fix, please make it concise.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in commit fce6f25 — added .describe() to fallbackModel documenting the three accepted values ("off", "fallback", "provider/model") and their semantics, consistent with the sibling thinking field.

@openclaw-barnacle openclaw-barnacle Bot removed the app: web-ui App: web-ui label Mar 4, 2026
@iamcobolt iamcobolt marked this pull request as ready for review March 4, 2026 01:39
@iamcobolt
Copy link
Copy Markdown
Author

Latest push — schema simplification + UI cleanup:

  • fallbackModel narrowed to 2 options: removed the explicit "provider/model" string option. Schema is now "off" | "fallback" only. The "fallback" chain covers the common case; a permanent always-on model override is a separate concern (see #33005 discussion). Addresses the Greptile concern about unresolvable-string silent failures — moot with a strict enum.
  • thinking simplified in this branch too: compaction-overrides.ts updated so resolveCompactionThinkLevel accepts sessionThinkLevel and "on" inherits the session level at call time (mirrors feat(compaction): disable thinking by default; add compaction.thinking override #33296). zod-schema and types narrowed to "off" | "on" here as well for consistency.
  • schema.help.ts: added help tooltip for fallbackModel (addresses the missing .describe() gap noted by Greptile — now covered at both schema and UI hint levels).
  • Dead UI code removed: renderLiteralsOrTextComposite was added for the now-gone explicit string option; removed. "off" | "fallback" renders natively as a segmented control via the existing allLiterals path.
  • schema.hints.ts: removed stale "provider/model" placeholder.

@iamcobolt
Copy link
Copy Markdown
Author

@greptile-apps review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0c3fa46b39

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +597 to +601
const nextKeyInfo = await getApiKeyForModel({
model: nextModel,
cfg: params.config,
agentDir,
});
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Preserve authProfileId when resolving fallback model credentials

The primary compaction model resolves credentials with profileId: params.authProfileId, but the fallback path drops profileId when calling getApiKeyForModel. In sessions that rely on an auth-profile override (for example, provider-scoped profile keys), fallback candidates can be skipped or fail auth even though valid profile credentials exist, so the new quota fallback behavior silently does not work in those runs.

Useful? React with 👍 / 👎.

Comment on lines +104 to +106
fallbackModel: z
.union([z.literal("off"), z.literal("fallback")])
.optional()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Accept explicit model refs in fallbackModel config schema

This schema only allows "off" | "fallback", but resolveCompactionFallbackCandidates includes an explicit "provider/model" branch (and tests for it), so users configuring an explicit fallback model will hit config validation errors instead of getting fallback behavior. That makes one supported fallback mode unreachable through normal validated config loading.

Useful? React with 👍 / 👎.

Comment on lines +58 to +70
// Explicit "provider/model" string
const resolved = resolveModelRefFromString({
raw: fallbackModel,
defaultProvider,
aliasIndex,
});
if (!resolved) {
params.warn?.(
`[compaction] fallbackModel "${fallbackModel}" could not be resolved — expected "off", "fallback", or a "provider/model" string; no fallback will be used`,
);
return [];
}
return !isCurrent(resolved.ref) ? [resolved.ref] : [];
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The explicit "provider/model" string handling at lines 58–70 is unreachable dead code. The TypeScript type (line 298 of types.agent-defaults.ts) and Zod schema (lines 104–105 of zod-schema.agent-defaults.ts) have been narrowed to only allow "off" | "fallback". Any explicit string like "openai/gpt-4o-mini" will be rejected at config-validation time and never reach this function.

The commit f077c9b5f ("narrow fallbackModel to off/fallback") narrowed the schema but did not remove this dead code path. The function documentation (lines 12–17) still mentions returning candidates "for 'fallback' or an explicit string", which is now misleading.

Either remove this dead branch (and the corresponding tests at compaction-fallback.test.ts:76–92), or widen the type/schema to actually support explicit strings if the feature is intended for future use.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/agents/pi-embedded-runner/compaction-fallback.ts
Line: 58-70

Comment:
The explicit `"provider/model"` string handling at lines 58–70 is unreachable dead code. The TypeScript type (line 298 of `types.agent-defaults.ts`) and Zod schema (lines 104–105 of `zod-schema.agent-defaults.ts`) have been narrowed to only allow `"off" | "fallback"`. Any explicit string like `"openai/gpt-4o-mini"` will be rejected at config-validation time and never reach this function.

The commit `f077c9b5f` ("narrow fallbackModel to off/fallback") narrowed the schema but did not remove this dead code path. The function documentation (lines 12–17) still mentions returning candidates "for 'fallback' or an explicit string", which is now misleading. 

Either remove this dead branch (and the corresponding tests at compaction-fallback.test.ts:76–92), or widen the type/schema to actually support explicit strings if the feature is intended for future use.

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +76 to +92
it("explicit provider/model string returns that candidate", () => {
const cfg = {
agents: { defaults: { compaction: { fallbackModel: "openai/gpt-4o-mini" } } },
} as unknown as OpenClawConfig;
expect(resolveCompactionFallbackCandidates({ cfg, ...current })).toEqual([
{ provider: "openai", model: "gpt-4o-mini" },
]);
});

it("explicit string matching the current model returns []", () => {
const cfg = {
agents: {
defaults: { compaction: { fallbackModel: "anthropic/claude-sonnet-4-6" } },
},
} as unknown as OpenClawConfig;
expect(resolveCompactionFallbackCandidates({ cfg, ...current })).toEqual([]);
});
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These test cases exercise an explicit "provider/model" string path that cannot be reached in production due to schema validation constraints. The Zod schema only allows "off" or "fallback", rejecting any explicit string before it reaches resolveCompactionFallbackCandidates().

These tests use as unknown as OpenClawConfig to bypass the type system and inject values the schema forbids. They pass at the unit level but give false confidence — there is no production path where a user's explicit string config survives validation and executes this code.

Remove these tests if the explicit-string option is not intended to be exposed, or widen the type/schema to make the feature actually available.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/agents/pi-embedded-runner/compaction-fallback.test.ts
Line: 76-92

Comment:
These test cases exercise an explicit `"provider/model"` string path that cannot be reached in production due to schema validation constraints. The Zod schema only allows `"off"` or `"fallback"`, rejecting any explicit string before it reaches `resolveCompactionFallbackCandidates()`.

These tests use `as unknown as OpenClawConfig` to bypass the type system and inject values the schema forbids. They pass at the unit level but give false confidence — there is no production path where a user's explicit string config survives validation and executes this code.

Remove these tests if the explicit-string option is not intended to be exposed, or widen the type/schema to make the feature actually available.

How can I resolve this? If you propose a fix, please make it concise.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 253c80847b

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +608 to +610
if (nextKeyInfo.apiKey) {
nextAuth.setRuntimeApiKey(cand.provider, nextKeyInfo.apiKey);
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Exchange Copilot token before applying fallback credentials

In the fallback-candidate auth path, nextKeyInfo.apiKey is written directly to runtime auth storage, but GitHub Copilot requires exchanging the GitHub token for a short-lived Copilot API token first (the primary path in this same function does that). If a quota error triggers fallback to a github-copilot/* model, compaction will retry with the wrong credential type and fail authentication instead of recovering.

Useful? React with 👍 / 👎.

Comment on lines +670 to +671
tools: builtInTools,
customTools,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Rebuild provider-sanitized tools for each fallback candidate

The fallback loop reuses the same builtInTools/customTools for every candidate, but those tools are prepared once using the original provider. When fallback switches providers (notably to google-gemini-cli), Gemini-specific schema sanitization is not re-applied, so cross-provider fallback can fail with tool-schema validation errors instead of succeeding after a quota/rate-limit retry.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8f68d364ab

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

config: params.config,
// Outer loop: model fallback on quota/rate-limit errors.
// Inner loop: thinking retry on timeout (at most 2 attempts per candidate).
for (let candidateIdx = 0; candidateIdx < allCandidates.length; candidateIdx++) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Increase lock budget for nested compaction retry loops

This new outer candidateIdx loop plus inner thinkAttempt loop allows a single compaction request to run multiple full compactWithSafetyTimeout calls (300s each), but the lock budget is still computed from a single timeout window (resolveSessionLockMaxHoldFromTimeout), so long fallback/thinking-retry runs can exceed the hold limit and trigger watchdog force-release while this function is still mutating the same session via SessionManager. In sessions that hit repeated timeouts or multiple quota fallbacks, that can permit concurrent writers on the same transcript and corrupt ordering/state; the lock hold time should be scaled to worst-case retries (or refreshed between attempts).

Useful? React with 👍 / 👎.

@iamcobolt
Copy link
Copy Markdown
Author

Addressed the unreachable code concern in commit 8f68d36:

Dead explicit-string branch removedcompaction-fallback.ts lines 58–70 (the "provider/model" string handling branch) have been deleted. The warn parameter was also removed from the function signature since it was only used by that branch. JSDoc updated to reflect the two-value contract.

Dead tests removed — the two test cases covering the unreachable explicit-string path ("explicit provider/model string returns that candidate" and "explicit string matching the current model returns []") have been deleted. The test file now has 6 cases covering "off", "fallback", filtering, and edge cases.

transcriptPolicy re-resolution — confirmed already addressed in a previous push (noted in Greptile's summary: "has addressed previous correctness issues (transcriptPolicy re-resolution)").

@greptile-apps review

@iamcobolt iamcobolt marked this pull request as draft March 4, 2026 09:42
@iamcobolt
Copy link
Copy Markdown
Author

Superseded by #34569 — reissued as a clean single commit on a proper feature branch (feat/compaction-fallback). New PR has a clean description, correct before/after screenshots (upstream main baseline), and the AI-disclosure statement.

@iamcobolt iamcobolt closed this Mar 4, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Agent runtime and tooling docs Improvements or additions to documentation gateway Gateway runtime size: L

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant