Skip to content

fix Codex preflight compaction auth routing#86292

Closed
amknight wants to merge 1 commit into
mainfrom
codex/fix-codex-preflight-compaction-auth
Closed

fix Codex preflight compaction auth routing#86292
amknight wants to merge 1 commit into
mainfrom
codex/fix-codex-preflight-compaction-auth

Conversation

@amknight

Copy link
Copy Markdown
Member

Summary

  • Preserve the persisted session runtime route for budget preflight compaction when resuming the same session, including Codex model provider, model, auth profile, and harness id.
  • Add a unit regression and a runReplyAgent e2e repro for a Codex-backed Slack session whose current run route is openai/gpt-5.5.
  • Add an unreleased changelog entry.

Verification

  • git diff --check origin/main...HEAD
  • node scripts/run-vitest.mjs src/auto-reply/reply/agent-runner-memory.test.ts
  • node scripts/run-vitest.mjs run --config test/vitest/vitest.e2e.config.ts src/auto-reply/reply/agent-runner.runreplyagent.e2e.test.ts

Real behavior proof

Behavior addressed: Budget preflight compaction for existing Codex-backed Slack sessions now uses the persisted Codex session provider/model/auth/harness route instead of falling back to the current run's openai/gpt-5.5 route and requiring separate plain OpenAI auth.

Real environment tested: Local OpenClaw Codex worktree rebased on origin/main at 4798264a29, using the existing dependency install via a symlinked node_modules; the repro avoids live Slack/API calls by mocking the embedded PI runtime boundary.

Exact steps or command run after this patch: git diff --check origin/main...HEAD; node scripts/run-vitest.mjs src/auto-reply/reply/agent-runner-memory.test.ts; node scripts/run-vitest.mjs run --config test/vitest/vitest.e2e.config.ts src/auto-reply/reply/agent-runner.runreplyagent.e2e.test.ts.

Evidence after fix: The unit regression confirms compactEmbeddedPiSession receives provider: "codex", model: "gpt-5.5", authProfileId: "codex:work", and agentHarnessId: "codex"; the e2e repro confirms runReplyAgent preflight compaction uses codex/gpt-5.5 while the actual reply run remains openai/gpt-5.5.

Observed result after fix: The focused unit test file passed 25/25 tests, and the e2e test file passed 56/56 tests.

What was not tested: A live Slack/gateway/API compaction against a real auth store, and a broad changed-gate/Testbox run.

@openclaw-barnacle openclaw-barnacle Bot added size: S maintainer Maintainer-authored PR labels May 25, 2026
@clawsweeper

clawsweeper Bot commented May 25, 2026

Copy link
Copy Markdown
Contributor

Codex review: needs maintainer review before merge. Reviewed May 24, 2026, 10:39 PM ET / 02:39 UTC.

Summary
The branch routes budget preflight compaction through the matching persisted session provider, model, auth profile, and harness id, with unit/e2e regressions and a changelog entry.

PR surface: Source +27, Tests +133, Docs +1. Total +161 across 4 files.

Reproducibility: yes. by source and focused test path: current main's preflight compaction call uses the follow-up run route, while the PR constructs an over-threshold persisted Codex session and asserts the compaction call receives the persisted route. I did not run the tests in this read-only review.

Merge readiness
Overall: 🐚 platinum hermit
Proof: 🐚 platinum hermit
Patch quality: 🦞 diamond lobster
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

  • Confirm whether this PR or the broader compaction-boundary PR owns the canonical fix before merge.

Risk before merge

  • This intentionally changes auth/provider selection for budget preflight compaction, so the landing order with the broader compaction-routing and Codex-boundary PRs could affect which runtime owns compaction for existing sessions.
  • The PR proof is focused unit/e2e coverage with a mocked embedded PI boundary and no live Slack/gateway/auth-store compaction, so maintainers should decide whether that is sufficient for this protected auth/session route fix.

Maintainer options:

  1. Land after route-safety review
    Maintainers can land this narrow fix after confirming it remains the intended near-term behavior alongside the open compaction boundary and runtime-routing PRs.
  2. Fold into the broader routing PR
    If the broader OpenAI compaction runtime-routing PR becomes canonical, transplant this persisted-session regression coverage there instead of landing two overlapping fixes.
  3. Supersede if Codex preflight goes away
    If the Codex-boundary PR lands first and removes automatic OpenClaw-owned Codex preflight compaction, close this PR as superseded while keeping any still-useful regression coverage.

Next step before merge
Protected maintainer PR plus overlapping auth/session compaction work requires human landing-order and boundary review, not automated cleanup or repair.

Security
Cleared: The diff does not add dependencies, scripts, workflows, or new secret storage; the auth-route change is covered as merge risk rather than a concrete security defect.

Review details

Best possible solution:

Choose one canonical Codex/OpenAI compaction-routing path that preserves existing session auth/runtime state, then land this narrow regression or fold its test coverage into the broader compaction-boundary fix.

Do we have a high-confidence way to reproduce the issue?

Yes, by source and focused test path: current main's preflight compaction call uses the follow-up run route, while the PR constructs an over-threshold persisted Codex session and asserts the compaction call receives the persisted route. I did not run the tests in this read-only review.

Is this the best way to solve the issue?

Unclear as a final architecture choice: the patch is a narrow fix for current main's route mismatch, but sibling PRs may instead make Codex preflight compaction native-owned or handle broader OpenAI/Codex auth routing.

Codex review notes: model gpt-5.5, reasoning high; reviewed against ca70015a7ceb.

Label changes

Label changes:

  • add P1: The PR targets a broken agent/channel compaction route that can block Codex-backed Slack replies through auth-provider mismatch.
  • add merge-risk: 🚨 session-state: The diff reuses persisted session runtime state during compaction, which must stay aligned with session ids and sibling compaction behavior.
  • add merge-risk: 🚨 auth-provider: The diff changes which provider/auth profile budget preflight compaction uses for existing sessions.

Label justifications:

  • P1: The PR targets a broken agent/channel compaction route that can block Codex-backed Slack replies through auth-provider mismatch.
  • merge-risk: 🚨 auth-provider: The diff changes which provider/auth profile budget preflight compaction uses for existing sessions.
  • merge-risk: 🚨 session-state: The diff reuses persisted session runtime state during compaction, which must stay aligned with session ids and sibling compaction behavior.
  • rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🐚 platinum hermit and patch quality is 🦞 diamond lobster.
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Not applicable: This is a maintainer/member PR, so the contributor real-behavior-proof gate is not applied; the body provides focused mocked unit/e2e proof but no live Slack/auth-store run.
Evidence reviewed

PR surface:

Source +27, Tests +133, Docs +1. Total +161 across 4 files.

View PR surface stats
Area Files Added Removed Net
Source 1 31 4 +27
Tests 2 133 0 +133
Docs 1 1 0 +1
Config 0 0 0 0
Generated 0 0 0 0
Other 0 0 0 0
Total 4 165 4 +161

What I checked:

  • Protected maintainer item: The GitHub context marks the PR author association as MEMBER and includes the protected maintainer label, so cleanup closure is not allowed by the review policy.
  • Current main still has the old preflight route: On current main, runPreflightCompactionIfNeeded passes compactEmbeddedPiSession the followup run provider/model and only the session agentHarnessId; it does not use the persisted session modelProvider/model/authProfileOverride route that this PR adds. (src/auto-reply/reply/agent-runner-memory.ts:756, ca70015a7ceb)
  • Session store has the fields the PR preserves: SessionEntry persists authProfileOverride, modelProvider, model, and agentHarnessId, which are the route fields the proposed preflight helper forwards when the session id matches. (src/config/sessions/types.ts:308, ca70015a7ceb)
  • Proposed diff is focused: The PR commit adds resolvePreflightCompactionRunTarget, forwards provider/model/authProfileId/agentHarnessId to compactEmbeddedPiSession, and adds focused unit plus runReplyAgent regression coverage for the Codex-backed Slack route. (src/auto-reply/reply/agent-runner-memory.ts, 66354a42582b)
  • Sibling compaction work overlaps the route decision: Provided related-item context shows open PRs covering broader OpenAI compaction runtime routing, Codex compaction boundary behavior, and CLI runtime preflight gates, so landing order and intended boundary need maintainer coordination.
  • History provenance: Blame ties the current preflight compaction call shape to recent main history, and file history shows repeated recent work on memory flush, preflight compaction, context windows, and embedded compaction routing. (src/auto-reply/reply/agent-runner-memory.ts:756, 5c7980fa1132)

Likely related people:

  • Omar Shahine: git blame attributes the current preflight compaction call block in agent-runner-memory.ts to commit 5c7980f, which carried the current main shape being repaired. (role: introduced current preflight route shape; confidence: medium; commits: 5c7980fa1132; files: src/auto-reply/reply/agent-runner-memory.ts)
  • Peter Steinberger: Recent history shows repeated work on memory flush policy, compaction runtime, embedded runner auth/runtime, and session routing surfaces adjacent to this PR. (role: recent area contributor; confidence: high; commits: a374c3a5bfd5, c90cb9c3c95c, 37625cff6fff; files: src/auto-reply/reply/agent-runner-memory.ts, src/agents/pi-embedded-runner/compact.ts)
  • Jared: History shows the transcript-estimate preflight compaction behavior was added in commit c6d8318, making this person relevant to the threshold and preflight route path. (role: preflight compaction contributor; confidence: medium; commits: c6d8318d07f5; files: src/auto-reply/reply/agent-runner-memory.ts)
  • Vincent Koc: History shows recent embedded compaction wrapper and provider/runtime seam work on compact.ts, which is the callee affected by the new preflight route parameters. (role: adjacent runtime contributor; confidence: medium; commits: f1c4e2f11daf, 74e7b8d47b18; files: src/agents/pi-embedded-runner/compact.ts)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@amknight amknight marked this pull request as ready for review May 25, 2026 02:29
@clawsweeper clawsweeper Bot added rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. labels May 25, 2026
@clawsweeper

clawsweeper Bot commented May 25, 2026

Copy link
Copy Markdown
Contributor

ClawSweeper PR egg

✨ Hatched: 🥚 common Neon Signal Puff

Hatch command

Comment @clawsweeper hatch when this PR is hatchable.

Hatchability rules:

  • Merged PRs are hatchable.
  • Open PRs are hatchable when they are status: 👀 ready for maintainer look, status: 🚀 automerge armed, or labeled clawsweeper:automerge.
  • Closed unmerged PRs are hatchable only when one of those hatchable labels is still present in the durable record.

Rarity: 🥚 common.
Trait: sleeps inside passing CI.
Image traits: location proof lagoon; accessory lint brush; palette amber, ink, and glacier blue; mood patient; pose guarding a tiny green check; shell matte ceramic shell; lighting moonlit rim light; background tiny shells and proof notes.
Share on X: post this hatch
Copy: My PR egg hatched a 🥚 common Neon Signal Puff in ClawSweeper.

What is this egg doing here?
  • Eggs appear after the PR passes real-behavior proof. It is here for vibes, not verdicts: it does not change labels, ratings, merge decisions, or automation.
  • The shell reacts to review momentum: open follow-up work warms it up, re-review makes it wobble, and a clean final review lets it hatch.
  • Hatchability usually comes from sufficient real-behavior proof, no blocking P0/P1/P2 findings, no security attention needed, and clean correctness. A merged PR is already final, so merge makes the egg hatchable independently.
  • The hatch is seeded from this repository and PR number, so the same PR keeps the same creature; the reviewed head SHA can only change safe visual details.
  • Rarity is just collectible sparkle: 🥚 common, 🌱 uncommon, 💎 rare, ✨ glimmer, and 🌈 legendary.

@clawsweeper clawsweeper Bot added P1 High-priority user-facing bug, regression, or broken workflow. merge-risk: 🚨 session-state 🚨 May lose, corrupt, stale, or mis-associate session, agent, or context state. merge-risk: 🚨 auth-provider 🚨 May break OAuth, tokens, provider routing, model choice, or credentials. labels May 25, 2026
@amknight

Copy link
Copy Markdown
Member Author

Closing as superseded by #85958.

#85958 landed the broader Codex compaction boundary fix in dd47e47: Codex runtime sessions now leave automatic compaction to native Codex, and OpenClaw no longer runs reply preflight compaction for them. This PR fixed the older preflight route mismatch by preserving the persisted Codex provider/model/auth/harness route, but that path is no longer the canonical behavior after #85958.

@amknight amknight closed this May 25, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

maintainer Maintainer-authored PR merge-risk: 🚨 auth-provider 🚨 May break OAuth, tokens, provider routing, model choice, or credentials. merge-risk: 🚨 session-state 🚨 May lose, corrupt, stale, or mis-associate session, agent, or context state. P1 High-priority user-facing bug, regression, or broken workflow. rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. size: S status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant