Conversation
|
Codex review: needs changes before merge. What this changes: This PR wraps implicit embedded compaction in the session model fallback chain, threads per-run fallback override state through agent execution and compaction runtime context, preserves failure metadata, and updates docs, changelog, and focused tests. Required change before merge: The remaining blocker is a narrow mechanical PR-branch repair with clear files and tests; a maintainer can promote it despite the protected label if they want automation to update the branch. Security review: Security review cleared: The diff is limited to runtime fallback wiring, tests, docs, and changelog; it does not add dependencies, CI execution, package lifecycle hooks, secret handling, or downloaded code paths. Review findings:
Review detailsBest possible solution: Carry Do we have a high-confidence way to reproduce the issue? Yes. A source-level reproduction is clear from current main plus the PR diff: Is this the best way to solve the issue? No. The PR's fallback wrapper, failure metadata preservation, and docs direction are appropriate, but the proposed patch is incomplete until the overflow runner path carries the same override and has runner-level coverage. Full review comments:
Overall correctness: patch is incorrect Acceptance criteria:
What I checked:
Likely related people:
Remaining risk / open question:
Codex review notes: model gpt-5.5, reasoning high; reviewed against c6cb7b4801c5. |
Greptile SummaryThis PR wraps implicit embedded compaction in the session model fallback chain so Azure content-filter 400s can recover by retrying on a different model, while keeping explicit Confidence Score: 4/5Safe to merge; logic is correct for the targeted Azure content-filter scenario and the explicit compaction model path is unchanged. No P0/P1 issues found. The fallback classification path in classifyCompactionFallbackResult relies on the error message retaining the HTTP status code prefix (e.g. '400 …') after being serialised through formatErrorMessage → fail(), which holds for Azure's error format and is validated by the tests. The implementation correctly gates on hasExplicitCompactionModel and hasCompactionModelFallbackCandidates, releases the session write lock before each retry, and never writes the fallback choice back to config. src/agents/pi-embedded-runner/compact.ts — specifically classifyCompactionFallbackResult and the double call to resolveCompactionFallbacksOverride / resolveEmbeddedCompactionTarget. Prompt To Fix All With AIThis is a comment left during a code review.
Path: src/agents/pi-embedded-runner/compact.ts
Line: 362-363
Comment:
**Status code lost when classifying serialised failure reasons**
`coerceToFailoverError(new Error(reason), …)` receives only the message string produced by `formatErrorMessage`; any `.status` property on the original exception is discarded by the outer `catch` → `fail(reason)` path in `compactEmbeddedPiSessionDirectOnce`. Classification therefore falls back to `inferSignalStatus`, which parses a leading numeric prefix from the message (e.g. `"400 The response was filtered…"`). This works for the Azure content-filter error targeted by this PR, but an error whose message does not carry the status prefix (e.g. `new Error("Content policy violation")` with `status: 400`) would produce a `null` failover classification and silently short-circuit as a `classifyResult` success, returning `{ ok: false }` to the caller without attempting any fallback model.
Consider preserving the status code in the serialised reason (e.g. via `resolveCompactionFailureReason` or by also checking `err.status` in `classifyCompactionFallbackResult`) so the gate is not sensitive to provider-specific message formatting.
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: src/agents/pi-embedded-runner/compact.ts
Line: 382-395
Comment:
**`resolveCompactionFallbacksOverride` and `resolveEmbeddedCompactionTarget` called twice**
`resolveCompactionFallbacksOverride` is invoked once inside `hasCompactionModelFallbackCandidates` and again five lines later; `resolveEmbeddedCompactionTarget` is called here to extract `primaryProvider`/`primaryModel` and then called a second time inside every `compactEmbeddedPiSessionDirectOnce` invocation in the fallback loop. Both are pure/side-effect-free reads, so this is harmless, but capturing the results once would remove the redundancy and make the hot path slightly clearer.
How can I resolve this? If you propose a fix, please make it concise.Reviews (1): Last reviewed commit: "fix: add compaction model fallback" | Re-trigger Greptile |
| const failoverError = coerceToFailoverError(new Error(reason), { provider, model }); | ||
| return failoverError ? { error: failoverError } : null; |
There was a problem hiding this comment.
Status code lost when classifying serialised failure reasons
coerceToFailoverError(new Error(reason), …) receives only the message string produced by formatErrorMessage; any .status property on the original exception is discarded by the outer catch → fail(reason) path in compactEmbeddedPiSessionDirectOnce. Classification therefore falls back to inferSignalStatus, which parses a leading numeric prefix from the message (e.g. "400 The response was filtered…"). This works for the Azure content-filter error targeted by this PR, but an error whose message does not carry the status prefix (e.g. new Error("Content policy violation") with status: 400) would produce a null failover classification and silently short-circuit as a classifyResult success, returning { ok: false } to the caller without attempting any fallback model.
Consider preserving the status code in the serialised reason (e.g. via resolveCompactionFailureReason or by also checking err.status in classifyCompactionFallbackResult) so the gate is not sensitive to provider-specific message formatting.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/agents/pi-embedded-runner/compact.ts
Line: 362-363
Comment:
**Status code lost when classifying serialised failure reasons**
`coerceToFailoverError(new Error(reason), …)` receives only the message string produced by `formatErrorMessage`; any `.status` property on the original exception is discarded by the outer `catch` → `fail(reason)` path in `compactEmbeddedPiSessionDirectOnce`. Classification therefore falls back to `inferSignalStatus`, which parses a leading numeric prefix from the message (e.g. `"400 The response was filtered…"`). This works for the Azure content-filter error targeted by this PR, but an error whose message does not carry the status prefix (e.g. `new Error("Content policy violation")` with `status: 400`) would produce a `null` failover classification and silently short-circuit as a `classifyResult` success, returning `{ ok: false }` to the caller without attempting any fallback model.
Consider preserving the status code in the serialised reason (e.g. via `resolveCompactionFailureReason` or by also checking `err.status` in `classifyCompactionFallbackResult`) so the gate is not sensitive to provider-specific message formatting.
How can I resolve this? If you propose a fix, please make it concise.| if (hasExplicitCompactionModel(params) || !hasCompactionModelFallbackCandidates(params)) { | ||
| return await compactEmbeddedPiSessionDirectOnce(params); | ||
| } | ||
| const resolvedCompactionTarget = resolveEmbeddedCompactionTarget({ | ||
| config: params.config, | ||
| provider: params.provider, | ||
| modelId: params.model, | ||
| authProfileId: params.authProfileId, | ||
| defaultProvider: DEFAULT_PROVIDER, | ||
| defaultModel: DEFAULT_MODEL, | ||
| }); | ||
| const primaryProvider = resolvedCompactionTarget.provider ?? DEFAULT_PROVIDER; | ||
| const primaryModel = resolvedCompactionTarget.model ?? DEFAULT_MODEL; | ||
| const fallbacksOverride = resolveCompactionFallbacksOverride(params); |
There was a problem hiding this comment.
resolveCompactionFallbacksOverride and resolveEmbeddedCompactionTarget called twice
resolveCompactionFallbacksOverride is invoked once inside hasCompactionModelFallbackCandidates and again five lines later; resolveEmbeddedCompactionTarget is called here to extract primaryProvider/primaryModel and then called a second time inside every compactEmbeddedPiSessionDirectOnce invocation in the fallback loop. Both are pure/side-effect-free reads, so this is harmless, but capturing the results once would remove the redundancy and make the hot path slightly clearer.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/agents/pi-embedded-runner/compact.ts
Line: 382-395
Comment:
**`resolveCompactionFallbacksOverride` and `resolveEmbeddedCompactionTarget` called twice**
`resolveCompactionFallbacksOverride` is invoked once inside `hasCompactionModelFallbackCandidates` and again five lines later; `resolveEmbeddedCompactionTarget` is called here to extract `primaryProvider`/`primaryModel` and then called a second time inside every `compactEmbeddedPiSessionDirectOnce` invocation in the fallback loop. Both are pure/side-effect-free reads, so this is harmless, but capturing the results once would remove the redundancy and make the hot path slightly clearer.
How can I resolve this? If you propose a fix, please make it concise.5bc29d3 to
9a68140
Compare
b83e546 to
7f003d4
Compare
7f003d4 to
1c6b3a2
Compare
* fix: add compaction model fallback * docs: add compaction changelog pr reference * docs: add compaction changelog author * docs: satisfy compaction changelog attribution * fix: preserve compaction fallback metadata * fix: satisfy compaction fallback lint * docs: move compaction fallback changelog entry
* fix: add compaction model fallback * docs: add compaction changelog pr reference * docs: add compaction changelog author * docs: satisfy compaction changelog attribution * fix: preserve compaction fallback metadata * fix: satisfy compaction fallback lint * docs: move compaction fallback changelog entry
What
Implicit embedded compaction now retries fallback-eligible summarization failures through the active session model fallback chain. Explicit
agents.defaults.compaction.modelremains exact and does not inherit the session fallback chain.Why
Azure-hosted compaction can reject summarization prompts with content-filter 400s. Without model fallback, overflow recovery can keep retrying the same rejected model. Closes #64960.
Changes
Testing
pnpm test src/agents/pi-embedded-runner/compact.hooks.test.ts -- --reporter=verbosepnpm test src/auto-reply/reply/agent-runner-utils.test.ts src/agents/command/attempt-execution.cli.test.ts -- --reporter=verbosepnpm exec oxfmt --check --threads=1 <touched files>pnpm check:changedNote: Testbox
pnpm check:changedwas attempted per maintainer policy, but the local environment was not authenticated with Blacksmith after installing the CLI.