Codex app-server: evaluate post-tool raw assistant completion semantics #84137

@rozmiarD

Context

While preparing #84135, we verified that Codex app-server currently treats two raw assistant completion cases differently:

  • Pre-tool raw assistant completion without turn/completed can represent final assistant output. The current short assistant idle release is useful here because it delivers captured assistant text instead of waiting for a terminal idle timeout.
  • Post-tool raw assistant completion or progress after a dynamic/native tool handoff can represent synthesis or a terminal-event wait rather than a final answer. Today that path arms the completion-idle guard with turnAssistantCompletionIdleTimeoutMs, which defaults to 10 seconds.

Conservative fix in #84135

#84135 takes the low-risk path:

  • Adds appServer.postToolRawAssistantCompletionIdleTimeoutMs.
  • Uses it only for the post-tool raw assistant completion guard.
  • Falls back to turnAssistantCompletionIdleTimeoutMs when unset.
  • Preserves the current default behavior and the pre-tool raw assistant release semantics.

That makes the timeout configurable for heavy/trusted workloads without changing existing fail-fast behavior.
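To make the fallback order concrete, here is a minimal sketch of how the resolution could look. It is not the code from #84135: the flat `TimeoutConfig` shape, the helper name, and the constant name are hypothetical, and only the two option names and the 10-second default come from this issue.

```typescript
// Hypothetical config shape. Only the two option names are taken from the issue;
// where they actually live in the real config tree is not assumed here.
interface TimeoutConfig {
  turnAssistantCompletionIdleTimeoutMs?: number;
  postToolRawAssistantCompletionIdleTimeoutMs?: number;
}

// The issue states the existing guard defaults to 10 seconds.
const DEFAULT_TURN_ASSISTANT_COMPLETION_IDLE_TIMEOUT_MS = 10_000;

// Resolve the budget for the post-tool raw assistant completion guard:
// 1. the new post-tool option, if set;
// 2. otherwise the existing assistant completion idle timeout, if set;
// 3. otherwise the existing 10s default.
function resolvePostToolIdleTimeoutMs(config: TimeoutConfig): number {
  const inherited =
    config.turnAssistantCompletionIdleTimeoutMs ??
    DEFAULT_TURN_ASSISTANT_COMPLETION_IDLE_TIMEOUT_MS;
  return config.postToolRawAssistantCompletionIdleTimeoutMs ?? inherited;
}
```

With no options set this resolves to 10000, so existing deployments would see no change. An operator who sets only the new option changes the post-tool guard and nothing else.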

Alternative considered

A more complete long-term fix may be to change the semantics rather than only expose a config knob.

One possible direction:

  • Treat post-tool raw assistant completion/progress as a distinct post-tool synthesis or terminal-wait state.
  • Give that state its own default budget, for example 60s, instead of inheriting the 10s assistant release budget.
  • Keep pre-tool raw assistant release on the short assistant idle budget.
  • Update the existing tests that currently encode the 10s post-tool fail-fast behavior.

This may better match real heavy local/tool workloads, but it is a behavior change. It could delay stuck-turn detection and needs maintainer agreement on the intended default before changing shipped semantics.
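As a rough illustration of the alternative, the sketch below models pre-tool and post-tool raw assistant completion as separate phases with separate budgets. All type, constant, and function names are hypothetical. The 60s value is only the example figure mentioned above, not an agreed default, and the sketch assumes the pre-tool release budget is the same 10s budget referenced in this issue.

```typescript
// Hypothetical phase tag for a raw assistant completion event.
type RawAssistantPhase = "pre-tool" | "post-tool";

interface IdleBudgetsMs {
  assistantReleaseMs: number; // short budget for pre-tool release
  postToolSynthesisMs: number; // budget for post-tool synthesis / terminal wait
}

// Today (per the issue): post-tool inherits the 10s assistant budget.
const CURRENT_BUDGETS: IdleBudgetsMs = {
  assistantReleaseMs: 10_000,
  postToolSynthesisMs: 10_000,
};

// Proposed direction: post-tool gets its own default. 60s is an example only.
const PROPOSED_BUDGETS: IdleBudgetsMs = {
  assistantReleaseMs: 10_000,
  postToolSynthesisMs: 60_000,
};

// Pick the idle budget for a phase under a given set of budgets.
function idleBudgetMs(phase: RawAssistantPhase, budgets: IdleBudgetsMs): number {
  return phase === "post-tool"
    ? budgets.postToolSynthesisMs
    : budgets.assistantReleaseMs;
}

// Extra time before a stuck turn is detected if the proposal is adopted.
function stuckTurnDetectionDeltaMs(phase: RawAssistantPhase): number {
  return idleBudgetMs(phase, PROPOSED_BUDGETS) - idleBudgetMs(phase, CURRENT_BUDGETS);
}
```

Under these example numbers, pre-tool behavior is unchanged, while a genuinely stuck post-tool turn would be detected 50 seconds later than today. That is the trade-off maintainers would need to accept.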

Decision needed

Should post-tool raw assistant completion keep the current fail-fast default unless operators opt in via config, or should OpenClaw change the default semantics so post-tool synthesis waits longer by default?

Suggested follow-up proof

A future semantic-change PR should include:

  • A regression test showing pre-tool raw assistant completion still releases captured final text quickly.
  • A regression test showing post-tool raw assistant completion uses the new default budget.
  • Evidence from a real or close-to-real Codex app-server workload where post-tool synthesis legitimately stays quiet longer than the assistant release budget.
  • A clear before/after note for stuck-turn detection latency.
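The two regression tests could take roughly the following shape. This is a sketch against a hypothetical `IdleGuard` driven by explicit timestamps, not the real app-server guard or its test harness. The budget constants reuse the 10s figure from this issue and the 60s example value.

```typescript
// Hypothetical guard that takes timestamps as arguments, so the checks
// below are deterministic and need no real timers.
class IdleGuard {
  private deadlineMs: number | null = null;

  // Arm (or re-arm) the guard: it expires budgetMs after nowMs.
  arm(nowMs: number, budgetMs: number): void {
    this.deadlineMs = nowMs + budgetMs;
  }

  // True once an armed deadline has been reached.
  isExpired(nowMs: number): boolean {
    return this.deadlineMs !== null && nowMs >= this.deadlineMs;
  }
}

const ASSISTANT_RELEASE_MS = 10_000; // current budget per the issue
const POST_TOOL_SYNTHESIS_MS = 60_000; // example value only, not a shipped default

// Pre-tool: captured final text should still be released at the short budget.
function checkPreToolReleasesQuickly(): boolean {
  const guard = new IdleGuard();
  guard.arm(0, ASSISTANT_RELEASE_MS);
  return !guard.isExpired(9_999) && guard.isExpired(10_000);
}

// Post-tool: the guard must not fire at 10s and must fire at the new budget.
function checkPostToolUsesLongerBudget(): boolean {
  const guard = new IdleGuard();
  guard.arm(0, POST_TOOL_SYNTHESIS_MS);
  return !guard.isExpired(10_000) && guard.isExpired(60_000);
}
```

Real tests would more likely use the repository's fake-timer setup. The point here is only that each case asserts both sides of its boundary.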
