Skip to content

fix(cli): accept empty Claude end turns#89029

Closed
charles-openclaw wants to merge 1 commit into
openclaw:mainfrom
charles-openclaw:fix-claude-cli-empty-end-turn
Closed

fix(cli): accept empty Claude end turns#89029
charles-openclaw wants to merge 1 commit into
openclaw:mainfrom
charles-openclaw:fix-claude-cli-empty-end-turn

Conversation

@charles-openclaw

Copy link
Copy Markdown
Contributor

Summary

  • Fixes Claude CLI JSONL turns that complete successfully with an empty result from being converted into empty_response failover errors.
  • Carries the parser's structured empty-result signal through runPreparedCliAgent, so clean no-reply turns finish without invoking a fallback model.
  • Keeps ordinary unstructured empty CLI stdout as an error.

Linked context

Closes #89008.

Real behavior proof (required for external PRs)

  • Behavior or issue addressed: successful Claude CLI result: "" / thinking-only end_turn turns triggered empty_response and model fallback.
  • Real environment tested: local contributor checkout with mocked CLI supervisor output for the Claude CLI JSONL result shape from the issue.
  • Exact steps or command run after this patch: PNPM_CONFIG_OFFLINE=true corepack pnpm test src/agents/cli-runner.reliability.test.ts -- --reporter=dot
  • Evidence after fix: the focused shard passed 1 Vitest file / 37 tests, including the new empty Claude result regression.
  • Observed result after fix: runPreparedCliAgent returns a successful no-payload result with fallbackUsed: false and does not emit an llm_output hook for the empty assistant text.
  • What was not tested: a live Claude CLI thinking-only turn against a real local Claude session.
  • Proof limitations or environment constraints: this contributor runner cannot honestly force the production Claude CLI subscription/runtime into the intermittent thinking-only completion described in the issue without private local session state; the regression uses the structured JSONL result shape already parsed by OpenClaw.

Tests and validation

  • corepack pnpm exec oxfmt --write --threads=1 src/agents/cli-output.ts src/agents/cli-runner.ts src/agents/cli-runner.reliability.test.ts
  • PNPM_CONFIG_OFFLINE=true corepack pnpm test src/agents/cli-runner.reliability.test.ts -- --reporter=dot
  • git diff --check -- src/agents/cli-output.ts src/agents/cli-runner.ts src/agents/cli-runner.reliability.test.ts

Risk checklist

  • Did user-visible behavior change? Yes, clean empty Claude CLI completions no longer trigger fallback banners or wrong-model replies.
  • Did config, environment, or migration behavior change? No.
  • Did security, auth, secrets, network, or tool execution behavior change? No.
  • Highest-risk area: accidentally accepting genuinely broken empty CLI output.
  • Mitigation: only Claude CLI structured empty JSONL result outputs set allowEmptySuccess; the existing unstructured empty stdout regression still expects empty_response.

Current review state

Ready for review. Waiting on CI and maintainer review; live Claude CLI intermittent reproduction was not run here.

@openclaw-barnacle openclaw-barnacle Bot added agents Agent runtime and tooling size: S triage: mock-only-proof Candidate: PR proof only shows tests, mocks, snapshots, lint, typecheck, or CI. labels Jun 1, 2026
@clawsweeper

clawsweeper Bot commented Jun 1, 2026

Copy link
Copy Markdown
Contributor

Thanks for the context here. I swept through the related work, and this is now duplicate or superseded.

Close as superseded: the linked wrong-model fallback bug is already fixed on current main and shipped, while the remaining parser-level Claude empty-result work is covered by a newer proof-positive PR with a safer boundary.

Canonical path: Keep the shipped silent-empty policy fix and use #90450 as the canonical parser-level review path for Claude empty-result streaming behavior.

So I’m closing this here and keeping the remaining discussion on #90450 and #88946.

Review details

Best possible solution:

Keep the shipped silent-empty policy fix and use #90450 as the canonical parser-level review path for Claude empty-result streaming behavior.

Do we have a high-confidence way to reproduce the issue?

No current-main reproduction remains for the linked fallback bug: current main and v2026.6.6 route empty CLI output through allowEmptyAssistantReplyAsSilent. The historical source path was valid, and the remaining parser edge is covered by the newer open PR with production logs.

Is this the best way to solve the issue?

No. The current main policy fix plus the newer parser-level PR is a safer solution than this branch's obsolete parser-owned allowEmptySuccess bypass.

Security review:

Security review cleared: The diff is limited to CLI parser/runner/test behavior and does not add dependencies, workflows, package-resolution changes, secrets handling, or new code-execution surfaces.

AGENTS.md: found and applied where relevant.

What I checked:

Likely related people:

  • steipete: Merged the broad PR that shipped the current silent-empty CLI runner policy and has substantial history on the CLI runner surface. (role: feature-history owner; confidence: high; commits: 9ead0ae9219e; files: src/agents/cli-runner.ts, src/agents/cli-runner/types.ts, src/agents/cli-runner.reliability.test.ts)
  • vincentkoc: Committed the current-main follow-up that forwards the silent-empty policy through CLI fallback runs and appears in recent blame/history around the current runner path. (role: recent area committer; confidence: medium; commits: 57633c42b647; files: src/auto-reply/reply/agent-runner-execution.ts, src/auto-reply/reply/agent-runner-execution.test.ts, src/agents/cli-runner.ts)
  • TurboTheTurtle: Authored the follow-up silent fallback policy commit that was cherry-picked into current main. (role: recent adjacent fix author; confidence: medium; commits: 57633c42b647; files: src/auto-reply/reply/agent-runner-execution.ts, src/auto-reply/reply/agent-runner-execution.test.ts)

Codex review notes: model internal, reasoning high; reviewed against 231b5a14d5d7; fix evidence: release v2026.6.6, commit 9ead0ae9219e.

@clawsweeper clawsweeper Bot added rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. P2 Normal backlog priority with limited blast radius. merge-risk: 🚨 message-delivery 🚨 May drop, duplicate, misroute, suppress, or wrongly target messages. merge-risk: 🚨 auth-provider 🚨 May break OAuth, tokens, provider routing, model choice, or credentials. labels Jun 1, 2026
@charles-openclaw

Copy link
Copy Markdown
Contributor Author

Thanks for the review. I cannot honestly provide live Claude CLI thinking-only/end_turn proof from this contributor environment without relying on private local Claude session state or forcing an intermittent production runtime shape I do not control.

What I can provide here is the focused regression proof from the PR body: the parser now preserves the structured empty-result signal from the Claude JSONL result shape, runPreparedCliAgent treats that signal as a successful no-payload turn with fallbackUsed: false, and the focused reliability shard passed with 37 tests.

So the remaining live proof item should be handled in a maintainer environment that can safely reproduce the real Claude CLI session behavior. I am leaving the patch scoped to the shared parser/runner path covered by the regression test rather than claiming live proof I do not have.

@clawsweeper clawsweeper Bot added status: 🛠️ actively grinding The PR author has acted after the latest ClawSweeper review and work remains. rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. and removed status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. status: 🛠️ actively grinding The PR author has acted after the latest ClawSweeper review and work remains. labels Jun 14, 2026
@clawsweeper

clawsweeper Bot commented Jun 14, 2026

Copy link
Copy Markdown
Contributor

ClawSweeper applied the proposed close for this PR.

  • Action: closed this PR.
  • Close reason: duplicate or superseded.
  • Evidence: durable ClawSweeper review.
  • Coverage proof: PR B clearly carries PR A's useful bugfix intent in the safer canonical form that is already merged and shipped. PR A's parser-specific implementation details are not material independent work because the report identifies them as an obsolete bypass, and remaining parser-edge discussion belongs on a newer canonical PR rather than keeping PR A open. Covering PR: Fix live model inference edge cases #88946.

@clawsweeper clawsweeper Bot closed this Jun 14, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Agent runtime and tooling merge-risk: 🚨 auth-provider 🚨 May break OAuth, tokens, provider routing, model choice, or credentials. merge-risk: 🚨 message-delivery 🚨 May drop, duplicate, misroute, suppress, or wrongly target messages. P2 Normal backlog priority with limited blast radius. rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. size: S status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. triage: mock-only-proof Candidate: PR proof only shows tests, mocks, snapshots, lint, typecheck, or CI.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

claude-cli thinking-only (end_turn, empty text) turns trigger empty_response model-fallback re-run on a different model

1 participant