fix(agents): preserve stop-finished OpenAI tool calls by MonkeyLeeT · Pull Request #88804 · openclaw/openclaw

MonkeyLeeT · 2026-05-31T22:20:29Z

Summary

Preserve silent structured OpenAI-compatible tool calls when providers stream tool_calls but finish with finish_reason: stop.
Keep stripping tool calls for visible-text stop responses, length/error completions, and transport streams that never emitted a final finish reason.
Cover both the transport stream assembler and the legacy openai-completions provider path.

Verification

pnpm format -- src/agents/openai-transport-stream.ts src/agents/openai-transport-stream.test.ts src/llm/providers/openai-completions.ts src/llm/providers/openai-completions.test.ts
node scripts/run-vitest.mjs src/llm/providers/openai-completions.test.ts src/agents/openai-transport-stream.test.ts
.agents/skills/autoreview/scripts/autoreview --mode branch --base origin/main

Real behavior proof

Behavior addressed: structured tool_calls paired with finish_reason: stop no longer get stripped into a silent terminal assistant turn when there is no visible assistant text.

Real environment tested: local OpenClaw worktree running the actual OpenAI-compatible transport stream function against a loopback OpenAI-compatible SSE endpoint, with no provider credentials. Proof run timestamp: 2026-05-31 15:27 PDT.

Exact steps or command run after this patch: node --import tsx /private/tmp/openclaw-real-proof-88791.mjs

Evidence after fix: terminal output copied from the after-fix runtime proof:

local OpenAI-compatible SSE endpoint: http://127.0.0.1:57954/v1
silent-stop-tool-call -> stopReason=toolUse; content=[toolCall:bash:{"cmd":"echo after-fix"}]
visible-stop-tool-call -> stopReason=stop; content=[text:"Visible answer."]
missing-finish-tool-call -> stopReason=stop; content=[]

Observed result after fix: the OpenClaw runtime preserved the silent structured tool call as toolUse, kept the visible-text stop response deliverable as text, and dropped the unfinished tool call when the stream ended without a final finish reason.

What was not tested: external vLLM/Qwen provider behavior and broad CI.

clawsweeper · 2026-05-31T22:23:21Z

Codex review: needs maintainer review before merge. Reviewed May 31, 2026, 6:47 PM ET / 22:47 UTC.

Summary
The branch changes OpenAI-compatible stream finalizers and tests to promote silent finish_reason: stop tool-call responses to toolUse while preserving visible-text, length, and missing-finish behavior.

PR surface: Source +16, Tests +143. Total +159 across 4 files.

Reproducibility: yes. source-level reproduction is high-confidence: on current main a stream with tool-call blocks plus finish_reason: stop keeps stopReason: stop and then strips the tool calls before terminal classification. I did not run tests because this review was constrained to read-only commands.

Review metrics: none identified.

Merge readiness
Overall: 🦞 diamond lobster
Proof: 🦞 diamond lobster
Patch quality: 🦞 diamond lobster
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Risk before merge

[P1] The proof does not exercise the reporter's external vLLM/Qwen deployment or broad CI; the maintainer decision is whether the loopback wire-shape proof plus focused tests are enough for this provider-runtime bugfix.

Maintainer options:

Decide the mitigation before merge
Land the focused finalizer fix after maintainer review and CI if the loopback proof is accepted, or request one reporter-provider smoke if maintainers want exact vLLM/Qwen confirmation before merge.
Pause or close
Do not merge this PR until maintainers decide whether the risk is worth taking.

Next step before merge

No automated repair is needed; the branch has no blocking findings and the remaining action is normal maintainer review, CI, and proof acceptance.

Security
Cleared: The diff changes TypeScript stream finalization and tests only; it does not touch dependencies, workflows, secrets, install scripts, or other supply-chain surfaces.

Review details

Best possible solution:

Land the focused finalizer fix after maintainer review and CI if the loopback proof is accepted, or request one reporter-provider smoke if maintainers want exact vLLM/Qwen confirmation before merge.

Do we have a high-confidence way to reproduce the issue?

Yes, source-level reproduction is high-confidence: on current main a stream with tool-call blocks plus finish_reason: stop keeps stopReason: stop and then strips the tool calls before terminal classification. I did not run tests because this review was constrained to read-only commands.

Is this the best way to solve the issue?

Yes, this is the right layer: finalization already owns the symmetric toolUse/tool-call consistency guard, and the branch updates both affected OpenAI-compatible paths while preserving visible-text, length, and missing-finish behavior.

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against bb673f47b207.

Label changes

Label changes:

add rating: 🦞 diamond lobster: Overall readiness is 🦞 diamond lobster; proof is 🦞 diamond lobster and patch quality is 🦞 diamond lobster.
remove rating: 🐚 platinum hermit: Current PR rating is rating: 🦞 diamond lobster, so this older rating label is no longer current.

Label justifications:

P2: This is a normal-priority provider/runtime bugfix for intermittent silent agent turns with a limited OpenAI-compatible streaming surface.
rating: 🦞 diamond lobster: Overall readiness is 🦞 diamond lobster; proof is 🦞 diamond lobster and patch quality is 🦞 diamond lobster.
status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (terminal): The PR body supplies after-fix terminal output from a local OpenClaw runtime path against a loopback OpenAI-compatible SSE endpoint, which directly exercises the changed stream assembler behavior.
proof: sufficient: Contributor real behavior proof is sufficient. The PR body supplies after-fix terminal output from a local OpenClaw runtime path against a loopback OpenAI-compatible SSE endpoint, which directly exercises the changed stream assembler behavior.

Evidence reviewed

PR surface:

Source +16, Tests +143. Total +159 across 4 files.

View PR surface stats

Area	Files	Added	Removed	Net
Source	2	16	0	+16
Tests	2	145	2	+143
Docs	0	0	0	0
Config	0	0	0	0
Generated	0	0	0	0
Other	0	0	0	0
Total	4	161	2	+159

What I checked:

Repository policy applied: Root AGENTS.md and the scoped agents guide were read; their review guidance applies because the PR touches agents/provider runtime stream assembly. (AGENTS.md:1, bb673f47b207)
Current transport finalizer still drops the reported shape: Current main maps finish_reason: stop to stop, then filters accumulated tool-call blocks unless the final stop reason is toolUse, which matches the linked bug's silent terminal-turn path. (src/agents/openai-transport-stream.ts:2693, bb673f47b207)
Current legacy provider finalizer has the same gap: The legacy openai-completions provider records a finish reason, but current main still removes tool calls whenever the final stop reason is not toolUse. (src/llm/providers/openai-completions.ts:314, bb673f47b207)
PR transport fix targets the finalization boundary: The branch records whether a real stop finish reason was seen, then promotes silent accumulated tool calls to toolUse only when there is no visible assistant text. (src/agents/openai-transport-stream.ts:2817, 9dcc70332ede)
PR legacy provider fix mirrors the invariant: The branch applies the same silent-tool-call promotion in streamOpenAICompletions, after finish and error checks and before the existing tool-call filter. (src/llm/providers/openai-completions.ts:439, 9dcc70332ede)
Focused regression tests cover both paths: The branch adds regression coverage for silent stop-finished tool calls, visible-text stop responses, and missing-finish transport behavior in the transport assembler and legacy provider tests. (src/agents/openai-transport-stream.test.ts:6987, 9dcc70332ede)

Likely related people:

shakkernerd: Available blame/history for the stream finalizers and downstream terminal classifier points to commit 23bf48e by Shakker, whose commit metadata maps to this GitHub login. (role: current implementation history owner; confidence: medium; commits: 23bf48e69eb7; files: src/agents/openai-transport-stream.ts, src/llm/providers/openai-completions.ts, src/agents/embedded-agent-runner/run/attempt-trajectory-status.ts)

What the crustacean ranks mean

🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works

ClawSweeper keeps one durable marker-backed review comment per issue or PR.
Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
Maintainers can also comment @clawsweeper review to request a fresh review only.
Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

MonkeyLeeT · 2026-05-31T22:30:12Z

@clawsweeper re-review

clawsweeper · 2026-05-31T22:30:14Z

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

State: Complete
Detail: The targeted re-review finished, the durable review comment was updated, and the synced verdict was routed.
Run: https://github.com/openclaw/clawsweeper/actions/runs/26726362652
Updated: 2026-05-31T22:36:23.939Z

steipete

One blocking issue from CI/typecheck. The implementation shape looks right, but this branch cannot merge until the new visible-text check narrows the block shape far enough for tsgo.

Finding:

src/agents/openai-transport-stream.ts:2812 uses block.text.trim() after only checking block.type === "text". In this exported transport surface, output.content is wide enough that block.text is still unknown, and CI fails in check-prod-types, check-test-types, plugin SDK boundary prep, and dependent lint/boundary lanes with TS18046: 'block.text' is of type 'unknown'. Please narrow with a string check or a small local predicate before calling trim(), for example by requiring typeof block.text === "string" in the hasVisibleText predicate.

No behavior concern with the stop/tool-call promotion itself after reading the finalizer and adjacent tests; this is a compile blocker only.

steipete

Approved after the type-narrow fix on src/agents/openai-transport-stream.ts.

Verification I ran on the fixed tree:

pnpm tsgo:prod
node scripts/run-vitest.mjs src/llm/providers/openai-completions.test.ts src/agents/openai-transport-stream.test.ts
loopback OpenAI-compatible SSE proof against the real createOpenAICompletionsTransportStreamFn: silent finish_reason: stop tool call returned toolUse; visible text plus spurious tool call stayed stop; missing finish reason dropped the unfinished tool call
.agents/skills/autoreview/scripts/autoreview --mode branch --base origin/main returned clean with no accepted/actionable findings

The remaining red CI jobs are outside this PR's changed files and match current-main/stale-base fallout, not this OpenAI-compatible stream change.

fix(openai): keep stop-finished tool calls

9dcc703

openclaw-barnacle Bot added agents Agent runtime and tooling size: S triage: mock-only-proof Candidate: PR proof only shows tests, mocks, snapshots, lint, typecheck, or CI. labels May 31, 2026

MonkeyLeeT changed the title ~~fix(openai): keep stop-finished tool calls~~ fix(agents): preserve stop-finished OpenAI tool calls May 31, 2026

openclaw-barnacle Bot added proof: supplied External PR includes structured after-fix real behavior proof. and removed triage: mock-only-proof Candidate: PR proof only shows tests, mocks, snapshots, lint, typecheck, or CI. labels May 31, 2026

MonkeyLeeT marked this pull request as ready for review May 31, 2026 22:39

clawsweeper Bot added rating: 🦞 diamond lobster Very strong PR readiness with only minor maintainer review expected. and removed rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. labels May 31, 2026

steipete self-assigned this May 31, 2026

steipete requested changes May 31, 2026

View reviewed changes

clawsweeper Bot mentioned this pull request May 31, 2026

fix(openai): honor streamed tool calls with stop finish #88799

Closed

fix(agents): narrow visible text check

12e6715

openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 31, 2026

steipete approved these changes May 31, 2026

View reviewed changes

steipete merged commit 6316648 into openclaw:main May 31, 2026
159 of 162 checks passed

steipete mentioned this pull request May 31, 2026

Bug: structured tool_calls with finish_reason stop are dropped as non_deliverable_terminal_turn #88791

Closed

MonkeyLeeT deleted the codex/fix-openai-stop-tool-calls-88791 branch June 1, 2026 02:17

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(agents): preserve stop-finished OpenAI tool calls#88804

fix(agents): preserve stop-finished OpenAI tool calls#88804
steipete merged 2 commits into
openclaw:mainfrom
MonkeyLeeT:codex/fix-openai-stop-tool-calls-88791

MonkeyLeeT commented May 31, 2026 •

edited

Loading

Uh oh!

clawsweeper Bot commented May 31, 2026 •

edited

Loading

Uh oh!

MonkeyLeeT commented May 31, 2026

Uh oh!

clawsweeper Bot commented May 31, 2026 •

edited

Loading

Uh oh!

steipete left a comment

Uh oh!

steipete left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

MonkeyLeeT commented May 31, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Verification

Real behavior proof

Uh oh!

clawsweeper Bot commented May 31, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

MonkeyLeeT commented May 31, 2026

Uh oh!

clawsweeper Bot commented May 31, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

steipete left a comment

Choose a reason for hiding this comment

Uh oh!

steipete left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

MonkeyLeeT commented May 31, 2026 •

edited

Loading

clawsweeper Bot commented May 31, 2026 •

edited

Loading

clawsweeper Bot commented May 31, 2026 •

edited

Loading