[Bug]: 2026.4.29 — all openai/* embedded runs hang with zero tokens until timeout; direct curl to OpenAI works in 2s

## Summary

On OpenClaw 2026.4.29, **every `openai/*` model embedded run hangs for the full timeout window with zero tokens emitted** and ends with `FailoverError: LLM request timed out.` Direct `curl` from the same host to `api.openai.com` using the same API key returns a real response in ~1–2 seconds, so the OpenAI Platform itself is healthy and reachable. The gateway-internal OpenAI call never produces output.

This affects `openai/gpt-5.5`, `openai/gpt-5.4`, and `openai/gpt-5.4-nano` (the last triggered by the `active-memory` plugin). Other providers (`anthropic/claude-opus-4-7`, `google/gemini-3-flash`) work fine in the same gateway. Only `openai/*` (direct API-key route, not codex OAuth) is broken.

## Environment

- OpenClaw `2026.4.29 (a448042)`
- macOS 26.4.1 arm64 (Mac Studio M3 Ultra)
- Node `v25.9.0`
- Auth: direct OpenAI Platform API key (`OPENAI_API_KEY` env), no Codex OAuth
- Provider config: `models.providers.openai.timeoutSeconds: 600`, `baseUrl: https://api.openai.com/v1`

## Reproduction

1. Configure `agents.defaults.model.primary` (or any sub-agent model) to `openai/gpt-5.5` (also tested `openai/gpt-5.4`).
2. Trigger any embedded run (sub-agent spawn, `/model gpt5.5` from chat, or active-memory pre-reply).
3. Run hangs through full timeout window. Zero input/output tokens. No HTTP response observed.

Tried with and without:
- `agents.defaults.models["openai/gpt-5.5"].streaming: false`
- `agents.defaults.timeoutSeconds: 900`
- `plugins.entries.active-memory.enabled: false`

None of these affect the outcome.

## Proof OpenAI itself is healthy from the same host

```
$ curl -sS -w "HTTP: %{http_code}, time: %{time_total}s\n" \
    https://api.openai.com/v1/chat/completions \
    -H "Authorization: Bearer $OPENAI_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{"model":"gpt-5.5","messages":[{"role":"user","content":"hi"}]}'

{ ... valid response with content ... }
HTTP: 200, time: 2.161660s
```

DNS/network confirmed clean: `api.openai.com` resolves to `172.66.0.243 / 162.159.140.245`, IPv4 + IPv6 reachable in < 100 ms, no proxy on user shell or gateway process, `openclaw doctor` clean (0 plugin errors, no warnings about OpenAI auth, runtime, or model config).

## Gateway log evidence

Embedded run trace shows prep stages all complete normally and `stream-ready` fires, then nothing comes back from OpenAI until the embedded-run timeout aborts:

```
[trace:embedded-run] prep stages: runId=<id> phase=stream-ready totalMs=12892
  stages=workspace-sandbox:1ms, skills:0ms, core-plugin-tools:5216ms,
  bootstrap-context:21ms, bundle-tools:1452ms, system-prompt:2476ms,
  session-resource-loader:681ms, agent-session:1ms, stream-setup:3044ms

[agent/embedded] embedded run timeout: runId=<id> timeoutMs=120000
[agent/embedded] embedded run failover decision: stage=assistant decision=fallback_model reason=timeout from=openai/gpt-5.4 profile=-
[model-fallback/decision] decision=candidate_failed requested=openai/gpt-5.4 candidate=openai/gpt-5.4 reason=timeout next=none detail=LLM request timed out.
```

This pattern repeats across `openai/gpt-5.5`, `openai/gpt-5.4`, and `openai/gpt-5.4-nano`.

The active-memory plugin also fails in the same shape (its own internal `gpt-5.4-nano` calls time out), but disabling active-memory does NOT fix the user-visible embedded runs — the assistant-stage OpenAI call hangs independently.

Stats reported by the runtime for hung sub-agent runs: `runtime 1m10s–6m46s • tokens 0 (in 0 / out 0)` — i.e. no bytes ever come back from the OpenAI request before the watchdog fires.

## What I'm asking for

- Any pointer to a known issue / fix on this OpenClaw build for the OpenAI direct-API-key path
- Or a diagnostics flag / DEBUG knob that would log whether the OpenAI HTTP request is actually being dispatched (and what response, if any, the SDK is receiving) — so we can localize whether this is in the OpenAI SDK, an undici/fetch pool issue, or further upstream

Happy to grab additional logs with whatever flag you suggest.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[Bug]: 2026.4.29 — all openai/* embedded runs hang with zero tokens until timeout; direct curl to OpenAI works in 2s #76174

Summary

Environment

Reproduction

Proof OpenAI itself is healthy from the same host

Gateway log evidence

What I'm asking for

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Uh oh!

[Bug]: 2026.4.29 — all openai/* embedded runs hang with zero tokens until timeout; direct curl to OpenAI works in 2s #76174

Description

Summary

Environment

Reproduction

Proof OpenAI itself is healthy from the same host

Gateway log evidence

What I'm asking for

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions