[Bug]: OpenClaw-native provider failure leaves web chat session stuck in progress #91730

@nikhilmaddirala

Description

Bug type

Crash (process/app exits or hangs)

Beta release blocker

No

Summary

In OpenClaw 2026.6.1, a web chat session can remain stuck in the running / “In progress” state after an OpenClaw-native openai/gpt-5.4-mini turn hits a provider transport failure, instead of ending the turn and surfacing an error.

Steps to reproduce

  1. Run OpenClaw 2026.6.1 gateway on Linux with a webchat session using openai/gpt-5.4-mini.
  2. Configure the model for the OpenClaw-native runtime with agentRuntime.id = "openclaw".
  3. In the web UI, send a message that causes tool use and a follow-up model call.
  4. Observe the provider request fail with LLM request failed: network connection error.
  5. Observe that the web UI remains “In progress” and the session registry still reports status: "running".

Expected behavior

When the provider request fails, the turn should end cleanly: the UI should stop showing “In progress”, the session should not remain running, and the user should receive a visible error or failed-turn state.
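
To make the expected behavior concrete, here is a minimal, language-neutral sketch written in Python. It is not OpenClaw code: the `Session` record, the `run_turn` function, the `ProviderTransportError` class, and the status names "done" and "failed" are all hypothetical, chosen only to illustrate the invariant that no failure path may leave a session marked as running.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


class ProviderTransportError(Exception):
    """Hypothetical stand-in for the FailoverError seen in the gateway log."""


@dataclass
class Session:
    # Hypothetical session record; field names are illustrative only.
    key: str
    status: str = "idle"
    last_error: Optional[str] = None
    events: list = field(default_factory=list)


def run_turn(session: Session, call_provider: Callable[[], str]) -> Optional[str]:
    """Run one turn and always leave the session in a terminal state."""
    session.status = "running"
    session.events.append("turn_started")
    try:
        reply = call_provider()
        session.status = "done"
        session.events.append("turn_completed")
        return reply
    except ProviderTransportError as exc:
        # Record the failure so a UI could show an error instead of a spinner.
        session.status = "failed"
        session.last_error = str(exc)
        session.events.append("turn_failed")
        return None
    finally:
        # Safety net: an unexpected exception must not leave status "running".
        if session.status == "running":
            session.status = "failed"
            session.events.append("turn_failed")


def failing_provider() -> str:
    # Simulates the transport failure from the reproduction steps.
    raise ProviderTransportError("LLM request failed: network connection error.")


session = Session(key="agent:main:main")
result = run_turn(session, failing_provider)
print(session.status, session.last_error)
```

In this sketch the session ends in "failed" with the error text preserved, which is the state the report expects the web UI and session registry to reflect.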

Actual behavior

The web UI remains stuck “In progress”. The session registry remains status: "running" even though gateway logs show the embedded run failed before reply.

Observed direct web session:

  • Session key: agent:main:main
  • Session ID: 333bf549-d458-48d1-b3cc-32d9e4cd98b1
  • Status after failure: running
  • Runtime: OpenClaw Default
  • Model: openai/gpt-5.4-mini
  • Transcript tail includes an empty assistant message [] after prior tool activity.

OpenClaw version

2026.6.1 (unknown)

Operating system

Linux 6.18.26

Install method

Nix package, running gateway as user service:
openclaw gateway run --port 18789 --tailscale off

Model

openai/gpt-5.4-mini

Provider / routing chain

OpenClaw webchat -> OpenClaw native runtime -> openai-chatgpt-responses -> openai/gpt-5.4-mini

Additional provider/model setup details

Relevant effective setup:

  • agents.defaults.models."openai/gpt-5.4-mini".agentRuntime.id = "openclaw"
  • authProfileOverride = "openai:nikhil.maddirala@gmail.com"
  • Codex plugin was enabled globally, but the affected session reported Runtime: OpenClaw Default, not Codex.

Logs, screenshots, and evidence

Inline evidence only. I am not attaching the trajectory bundles initially because they may contain workspace/session context.

Gateway log excerpt:

2026-06-09T17:29:54.524+02:00 [responses] error provider=openai api=openai-chatgpt-responses model=gpt-5.4-mini name=Error status=undefined code=undefined type=undefined causeName=ProviderHttpError causeCode=invalid_provider_content_type message=Connection error.
2026-06-09T17:29:54.590+02:00 embedded run agent end
2026-06-09T17:29:54.843+02:00 auth profile failure state updated
2026-06-09T17:29:54.846+02:00 embedded run failover decision
2026-06-09T17:29:54.848+02:00 lane task error: lane=main durationMs=7072 error="FailoverError: LLM request failed: network connection error."
2026-06-09T17:29:54.850+02:00 lane task error: lane=session:agent:main:main durationMs=7075 error="FailoverError: LLM request failed: network connection error."
2026-06-09T17:29:54.867+02:00 Embedded agent failed before reply: LLM request failed: network connection error.
2026-06-09T17:29:58.178+02:00 provider auth state re-warmed (auth-profile-failure) in 2332ms eventLoopMax=17.9ms

Session state after failure:

session key: agent:main:main
session id: 333bf549-d458-48d1-b3cc-32d9e4cd98b1
status: running
runtime: OpenClaw Default
model: openai/gpt-5.4-mini
UI state: still shows "In progress"
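
The mismatch between the log and the session state can be checked mechanically. The following is a rough triage sketch, not an OpenClaw tool: it assumes that a lane named `session:<key>` in a `lane task error` line corresponds to the session key shown above (inferred from the log excerpt, not confirmed), and the function names and the `statuses` mapping are hypothetical.

```python
import re

# Matches lines such as:
#   lane task error: lane=session:agent:main:main durationMs=7075 error="..."
LANE_ERROR = re.compile(
    r'lane task error: lane=session:(?P<key>\S+) .*error="(?P<error>[^"]*)"'
)


def failed_session_keys(log_text: str) -> dict:
    """Map session key -> last lane error message found in the log text."""
    failures = {}
    for line in log_text.splitlines():
        match = LANE_ERROR.search(line)
        if match:
            failures[match.group("key")] = match.group("error")
    return failures


def stuck_sessions(log_text: str, statuses: dict) -> list:
    """Return keys whose lane failed in the log but whose status is still running."""
    failures = failed_session_keys(log_text)
    return [
        key
        for key, status in statuses.items()
        if status == "running" and key in failures
    ]


log_excerpt = (
    '2026-06-09T17:29:54.848+02:00 lane task error: lane=main durationMs=7072 '
    'error="FailoverError: LLM request failed: network connection error."\n'
    '2026-06-09T17:29:54.850+02:00 lane task error: lane=session:agent:main:main '
    'durationMs=7075 error="FailoverError: LLM request failed: network connection error."\n'
)

# Hypothetical snapshot of the session registry, keyed by session key.
registry = {"agent:main:main": "running"}
print(stuck_sessions(log_excerpt, registry))
```

Run against the excerpt above, the check flags agent:main:main, matching the observation that the lane failed while the registry still reports running.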

A redacted trajectory export was generated locally for private debugging, but I am not attaching it because it may contain workspace/session context. I can extract narrower fields if maintainers request specific evidence.

Impact and severity

Affected: OpenClaw webchat users using OpenClaw-native runtime with openai/gpt-5.4-mini.

Severity: High. The UI remains stuck in an active/in-progress state and the session registry remains running, which blocks clear recovery and makes it unclear whether the agent is still working.

Frequency: Observed at least once in direct webchat after a provider transport failure; a related Discord session also became stuck during the same debugging session.

Consequence: Users see no final response or actionable error, and later messages may appear blocked by the stuck lane/session.

Additional information

A separate earlier Codex-harness issue produced missing tool-result errors, but this report is about the OpenClaw-native runtime path. The affected direct web session reported Runtime: OpenClaw Default.

Metadata

    Labels

    • P1 - High-priority user-facing bug, regression, or broken workflow.
    • clawsweeper:needs-live-repro - ClawSweeper needs live local, crabbox, or manual validation to confirm this issue.
    • impact:message-loss - Channel message delivery can be lost, duplicated, or misrouted.
    • impact:session-state - Session, memory, transcript, context, or agent state can drift or corrupt.
    • issue-rating: 🐚 platinum hermit - Good issue quality with a plausible reproduction path needing some confirmation.
