fix: stabilize code-mode follow-up tool display and replay #80663

Merged
jalehman merged 14 commits into openclaw:main from jalehman:buce/code-mode-followups-pr on May 11, 2026
Conversation

@jalehman jalehman commented May 11, 2026


Summary

  • Problem: code-mode follow-up turns could hide bridged inner tool identity and leave strict provider replay exposed to missing or delayed tool-result pairs.
  • Why it matters: follow-up runs could show generic wrapped tool labels, miss embedded Tool Search controls, or replay corrupted session histories after code-mode/tool-search activity.
  • What changed: renders bridged tool calls using native inner-tool labels, keeps Tool Search controls independent/exposed in embedded runs, preserves delayed tool results across display turns, repairs persisted missing code-mode results, and hardens provider replay/tool-call ID handling.
  • What did NOT change (scope boundary): this does not change plugin permission policy, provider auth, or public tool execution capability; the residual lossless-claw copied-transcript repair warning is handled separately in lossless PR #652.
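
To make the "native inner-tool labels" change concrete, here is a minimal sketch of how a display layer might unwrap a bridged call. Everything in it is an assumption for illustration: the wrapper names, the `tool`/`arguments` payload keys, and the `resolveDisplayTarget` helper are hypothetical and are not taken from the OpenClaw source.

```typescript
// Hypothetical event shape; not the OpenClaw type.
interface ToolCallEvent {
  toolName: string; // name the runtime saw, possibly a bridge wrapper
  args: Record<string, unknown>; // raw arguments passed to that tool
}

interface DisplayTarget {
  label: string; // what the UI should show
  args: Record<string, unknown>; // arguments to render next to the label
  bridged: boolean; // true when the label came from an inner tool
}

// Assumed wrapper names; a real system would use its own list.
const BRIDGE_WRAPPERS = new Set(["tool_search_call", "code_mode_bridge"]);

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

function resolveDisplayTarget(event: ToolCallEvent): DisplayTarget {
  if (BRIDGE_WRAPPERS.has(event.toolName)) {
    const inner = event.args["tool"];
    if (typeof inner === "string" && inner.trim() !== "") {
      const innerArgs = event.args["arguments"];
      return {
        label: inner.trim(),
        args: isRecord(innerArgs) ? innerArgs : {},
        bridged: true,
      };
    }
  }
  // Not a wrapper, or the wrapper payload is malformed: keep the outer name.
  return { label: event.toolName, args: event.args, bridged: false };
}
```

Falling back to the wrapper name when the payload is malformed keeps the display path from throwing on a bad event, which matters when the same helper is shared by several surfaces.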

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

  • Closes #
  • Related #
  • This PR fixes a bug or regression

Real behavior proof (required for external PRs)

  • Behavior or issue addressed: Code-mode/tool-search bridge events should render with the underlying tool identity, embedded Tool Search controls should be available when configured, and follow-up replay should not manufacture missing tool-result noise when real results exist or can be repaired safely. The broader Telegram verbose/progress-vs-final-answer separation is tracked separately in #80294 ("Telegram: keep verbose tool results separate from final answers").
  • Real environment tested: Local OpenClaw development checkout on macOS 15 / Node 22.17.0 / pnpm 11.0.8.
  • Exact steps or command run after this patch:
    • Rebased branch on latest upstream/main.
    • Ran targeted Vitest shard set for touched Codex, OpenAI provider, gateway, agent transcript/repair, and tool-display surfaces.
    • Ran repo build.
  • Evidence after fix:
    • CI=true OPENCLAW_VITEST_MAX_WORKERS=1 pnpm test extensions/codex/src/app-server/event-projector.test.ts extensions/openai/openai-provider.test.ts src/agents/pi-embedded-runner.guard.test.ts src/agents/pi-embedded-runner.guard.waitforidle-before-flush.test.ts src/agents/pi-embedded-runner/run/attempt.spawn-workspace.context-engine.test.ts src/agents/pi-embedded-runner/run/attempt.transcript-policy.test.ts src/agents/pi-tools.create-openclaw-coding-tools.test.ts src/agents/session-file-repair.test.ts src/agents/session-transcript-repair.test.ts src/agents/tool-display.test.ts src/gateway/server-chat.agent-events.test.ts src/plugins/provider-replay-helpers.test.ts
      • passed 5 Vitest shards in 24.19s.
    • git diff --check upstream/main...HEAD
      • passed with no output.
    • CI=true pnpm build
      • passed, including tsdown, CLI bootstrap import guard, runtime postbuild, plugin SDK dts, plugin SDK export check, and bundled plugin asset copy.
  • Observed result after fix: targeted tests and build pass after rebase; embedded Tool Search enablement test passes; transcript/file repair tests pass; Codex event projector tests pass.
  • What was not tested: live Telegram delivery screenshot on this final rebased branch; earlier live diagnosis drove the patch, but final verification here is unit/seam/build focused.
  • Before evidence: prior behavior was observed as mixed Telegram progress/final output plus live-history missing-tool-result repair noise around code-mode tool calls.

Root Cause (if applicable)

  • Root cause: code-mode bridge projection used wrapper-level display too often; embedded runs did not pass Tool Search enablement through one callsite; transcript repair/pairing paths were too narrow around provider tool-call shapes, delayed tool results, and persisted missing-result repair.
  • Missing detection / guardrail: coverage existed for individual replay paths, but not for bridge projection, embedded Tool Search enablement, or persisted code-mode missing-result repair together.
  • Contributing context (if known): lossless-claw also carried a stale copied transcript repair implementation; that separate copied-logic issue explained some residual warning noise and is not fixed by this OpenClaw PR.
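
As an illustration of what "too narrow around provider tool-call shapes" can mean, the sketch below collects call and result IDs from three publicly documented message shapes: OpenAI Chat Completions (`tool_calls[].id`, `tool_call_id`), OpenAI Responses items (`function_call` / `function_call_output` with `call_id`), and Anthropic-style content blocks (`tool_use` with `id`, `tool_result` with `tool_use_id`). Which shapes OpenClaw actually handles is an assumption here, and `collectPairingIds` is a hypothetical helper.

```typescript
type Json = Record<string, unknown>;

interface PairingIds {
  callIds: string[];
  resultIds: string[];
}

function isRecord(value: unknown): value is Json {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Only non-empty string IDs are usable for pairing.
function pushId(list: string[], id: unknown): void {
  if (typeof id === "string" && id !== "") list.push(id);
}

function collectPairingIds(message: unknown): PairingIds {
  const ids: PairingIds = { callIds: [], resultIds: [] };
  if (!isRecord(message)) return ids;

  // OpenAI Chat Completions: assistant.tool_calls[].id and tool.tool_call_id.
  const toolCalls = message["tool_calls"];
  if (Array.isArray(toolCalls)) {
    for (const call of toolCalls) {
      if (isRecord(call)) pushId(ids.callIds, call["id"]);
    }
  }
  if (message["role"] === "tool") pushId(ids.resultIds, message["tool_call_id"]);

  // OpenAI Responses items: function_call / function_call_output share call_id.
  if (message["type"] === "function_call") pushId(ids.callIds, message["call_id"]);
  if (message["type"] === "function_call_output") pushId(ids.resultIds, message["call_id"]);

  // Anthropic-style content blocks: tool_use.id and tool_result.tool_use_id.
  const content = message["content"];
  if (Array.isArray(content)) {
    for (const block of content) {
      if (!isRecord(block)) continue;
      if (block["type"] === "tool_use") pushId(ids.callIds, block["id"]);
      if (block["type"] === "tool_result") pushId(ids.resultIds, block["tool_use_id"]);
    }
  }
  return ids;
}
```

A pairing pass that only understands one of these shapes will report results as missing even though they are present, which is one plausible source of the repair noise described above.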

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file:
    • extensions/codex/src/app-server/event-projector.test.ts
    • src/gateway/server-chat.agent-events.test.ts
    • src/agents/pi-embedded-runner.guard.test.ts
    • src/agents/pi-embedded-runner.guard.waitforidle-before-flush.test.ts
    • src/agents/pi-embedded-runner/run/attempt.spawn-workspace.context-engine.test.ts
    • src/agents/pi-tools.create-openclaw-coding-tools.test.ts
    • src/agents/session-file-repair.test.ts
    • src/agents/session-transcript-repair.test.ts
    • src/agents/tool-display.test.ts
    • extensions/openai/openai-provider.test.ts
    • src/plugins/provider-replay-helpers.test.ts
  • Scenario the test should lock in: bridged tool events project native labels, Tool Search controls are exposed in embedded runs when configured, and tool-result repair preserves/matches real results before inserting deterministic synthetic repairs.
  • Why this is the smallest reliable guardrail: these tests exercise the exact projection, gateway event, embedded-runner, and transcript repair seams without broad E2E setup.
  • Existing test that already covers this (if any): partial coverage existed for replay repair; this PR extends targeted coverage around the missing seams.
  • If no new test is added, why not: N/A.
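
The scenario above (match real results first, then insert deterministic synthetic repairs) can be sketched as follows. This is a simplified model, not the code in `session-transcript-repair`: the `Entry` type, the placeholder text, and `repairMissingResults` are hypothetical.

```typescript
// Simplified transcript model; hypothetical, not the OpenClaw session format.
type Entry =
  | { kind: "call"; id: string; tool: string }
  | { kind: "result"; id: string; output: string; synthetic?: boolean }
  | { kind: "text"; text: string };

function repairMissingResults(entries: readonly Entry[]): Entry[] {
  // Pass 1: index every result anywhere in the transcript, so a result that
  // arrives late still counts as a match for its call.
  const resultIds = new Set<string>();
  for (const entry of entries) {
    if (entry.kind === "result") resultIds.add(entry.id);
  }

  // Pass 2: copy entries through, adding a placeholder only for calls that
  // have no result at all.
  const repaired: Entry[] = [];
  for (const entry of entries) {
    repaired.push(entry);
    if (entry.kind === "call" && !resultIds.has(entry.id)) {
      // The placeholder is derived only from the call (no clock, no random
      // values), so repairing the same input always gives the same output.
      repaired.push({
        kind: "result",
        id: entry.id,
        output: `[missing result for ${entry.tool}]`,
        synthetic: true,
      });
      resultIds.add(entry.id);
    }
  }
  return repaired;
}
```

Because the synthetic entry reuses the call ID and has fixed content, running the repair on an already repaired transcript is a no-op, which is the property a persisted-session repair needs.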

User-visible / Behavior Changes

  • Bridged code-mode tool calls should display as the underlying tool instead of generic wrapper labels.
  • Follow-up turns after code-mode/tool-search activity should be more robust against malformed or incomplete persisted session history.

Security Impact (required)

  • New permissions/capabilities? (No)
  • Secrets/tokens handling changed? (No)
  • New/changed network calls? (No)
  • Command/tool execution surface changed? (No)
  • Data access scope changed? (No)
  • If any Yes, explain risk + mitigation: N/A

Repro + Verification

Environment

  • OS: macOS 15 / Darwin 24.6.0 arm64
  • Runtime/container: Node 22.17.0, pnpm 11.0.8
  • Model/provider: OpenAI-compatible/Codex-oriented code-mode surfaces; provider replay tests include OpenAI-compatible helpers.
  • Integration/channel (if any): Telegram/gateway event delivery and Codex app-server projection surfaces.
  • Relevant config (redacted): Tool Search enabled in embedded-runner test fixtures; no secrets printed.

Steps

  1. Run the targeted test command listed in Real behavior proof.
  2. Run git diff --check upstream/main...HEAD.
  3. Run CI=true pnpm build.

Expected

  • Targeted tests pass.
  • Diff check reports no whitespace/conflict-marker issues.
  • Build completes successfully.

Actual

  • Targeted tests passed: 5 Vitest shards in 24.19s.
  • Diff check passed with no output.
  • Build completed successfully.

Evidence

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Human Verification (required)

  • Verified scenarios: targeted tests for Codex event projection, gateway agent events, embedded Tool Search enablement, transcript/file repair, tool-display rendering, provider replay helpers, and build after rebase. Broader Telegram verbose/final separation remains covered by #80294 ("Telegram: keep verbose tool results separate from final answers").
  • Edge cases checked: delayed tool results across display turns; persisted missing code-mode results; embedded Tool Search controls; OpenAI-compatible replay payload handling; bridged tool display labels.
  • What you did not verify: live Telegram screenshot on the final rebased branch.

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

  • Backward compatible? (Yes)
  • Config/env changes? (No)
  • Migration needed? (No)
  • If yes, exact upgrade steps: N/A

Risks and Mitigations

  • Risk: transcript repair could insert synthetic tool results too eagerly.
    • Mitigation: repair checks for matching real results globally and tests cover missing, delayed, duplicate, and orphan result behavior.
  • Risk: display projection could expose wrapper/internal labels incorrectly.
    • Mitigation: shared display helper and Codex/gateway/tool-display tests lock native inner-tool projection behavior.
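
The four result behaviors named in the first mitigation (missing, delayed, duplicate, orphan) can be told apart with a single scan. The sketch below is one possible classification under stated assumptions: "delayed" is taken to mean that a non-tool entry sits between a call and its first result, a result that precedes its call is treated as an orphan, and `auditPairs` with its types is hypothetical.

```typescript
// Simplified transcript model; hypothetical, not the OpenClaw session format.
type Entry =
  | { kind: "call"; id: string }
  | { kind: "result"; id: string }
  | { kind: "other" };

interface PairingAudit {
  missing: string[]; // calls with no result anywhere
  delayed: string[]; // first result separated from its call by a non-tool entry
  duplicate: string[]; // second and later results for the same call
  orphan: string[]; // results with no earlier call
}

function auditPairs(entries: readonly Entry[]): PairingAudit {
  const audit: PairingAudit = { missing: [], delayed: [], duplicate: [], orphan: [] };
  const callIndex = new Map<string, number>();
  const resultCount = new Map<string, number>();

  entries.forEach((entry, index) => {
    if (entry.kind === "call") {
      if (!callIndex.has(entry.id)) callIndex.set(entry.id, index);
      return;
    }
    if (entry.kind !== "result") return;

    const callAt = callIndex.get(entry.id);
    if (callAt === undefined) {
      audit.orphan.push(entry.id);
      return;
    }
    const seen = resultCount.get(entry.id) ?? 0;
    resultCount.set(entry.id, seen + 1);
    if (seen > 0) {
      audit.duplicate.push(entry.id);
      return;
    }
    // Parallel calls and their results may interleave freely; only a
    // non-tool entry in between marks the first result as delayed.
    const between = entries.slice(callAt + 1, index);
    if (between.some((e) => e.kind === "other")) audit.delayed.push(entry.id);
  });

  callIndex.forEach((_, id) => {
    if (!resultCount.has(id)) audit.missing.push(id);
  });
  return audit;
}
```

Only the `missing` bucket would justify a synthetic result; the other three describe real results that a repair pass should keep or drop rather than replace.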

@openclaw-barnacle openclaw-barnacle Bot added the gateway, agents, extensions: openai, extensions: codex, size: L, and maintainer labels May 11, 2026
clawsweeper Bot commented May 11, 2026


Codex review: needs real behavior proof before merge.

Summary
Review failed before ClawSweeper could summarize the requested change.

Reproducibility: unclear. The review failed before ClawSweeper could establish a reproduction path.

Real behavior proof
Not applicable: Real behavior proof was not assessed because the Codex review failed.

Next step before merge
Review did not complete, so no work-lane recommendation was made.

Review details

Best possible solution:

Retry the Codex review after fixing the execution failure.

Do we have a high-confidence way to reproduce the issue?

Unclear. The review failed before ClawSweeper could establish a reproduction path.

Is this the best way to solve the issue?

Unclear. Retry the review first so ClawSweeper can evaluate the actual issue and fix direction.

What I checked:

  • failure reason: timeout.
  • codex failure detail: Codex review failed for this PR: spawnSync codex ETIMEDOUT.

Likely related people:

  • unknown: Codex failed before it could trace repository history. (role: review did not complete; confidence: low)

Remaining risk / open question:

  • No close action taken because the review did not complete.

Codex review notes: model gpt-5.5, reasoning high; reviewed against fe0bb3083e99.

@clawsweeper clawsweeper Bot added the mantis: telegram-visible-proof label May 11, 2026
@jalehman jalehman self-assigned this May 11, 2026
@jalehman jalehman force-pushed the buce/code-mode-followups-pr branch from e3bd15e to 731f13c Compare May 11, 2026 21:07
@clawsweeper clawsweeper Bot removed the mantis: telegram-visible-proof label May 11, 2026
@jalehman jalehman force-pushed the buce/code-mode-followups-pr branch from 731f13c to c14df1c Compare May 11, 2026 22:23
@jalehman jalehman merged commit 4bfd741 into openclaw:main May 11, 2026
85 of 87 checks passed
@jalehman jalehman deleted the buce/code-mode-followups-pr branch May 11, 2026 22:31
steipete added a commit that referenced this pull request May 12, 2026
* fix: project tool-search bridge event display

* fix: keep codex tool progress out of final replies

* fix: preserve tool result pairs on cleanup

* fix: restore tool search display target helper

* fix: keep tool search controls independent

* fix: render bridged tool calls like native tools

* fix: abort timed out tool search bridge calls

* fix: preserve code-mode tool results across display turns

* fix: repair missing code-mode tool results on disk

* fix: expose tool search controls in embedded runs

* docs: add code-mode followups changelog

* fix: update session repair agent-core import

* fix: harden code-mode follow-up repair

* fix: use stable session repair ids

---------

Co-authored-by: Peter Steinberger <steipete@gmail.com>
# Conflicts:
#	src/agents/pi-embedded-runner/run/attempt.subscription-cleanup.ts
#	src/agents/pi-embedded-runner/run/attempt.ts
#	src/agents/transcript-state-repair.ts
steipete added a commit that referenced this pull request May 12, 2026
steipete added a commit that referenced this pull request May 13, 2026
steipete added a commit that referenced this pull request May 13, 2026
steipete added a commit that referenced this pull request May 13, 2026
steipete added a commit that referenced this pull request May 13, 2026
steipete added a commit that referenced this pull request May 13, 2026
github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 24, 2026
…80663)

jameslcowan pushed a commit to jameslcowan/openclaw that referenced this pull request Jun 2, 2026
…80663)

sablehead pushed a commit to sablehead/openclaw that referenced this pull request Jun 10, 2026
…80663)

Labels

agents, extensions: codex, extensions: openai, gateway, maintainer, size: L

2 participants