[Bug]: OpenAI Responses SSE sanitizer appears to starve embedded streams; bypass restores normal output

## Summary

This looks like a more specific root-cause report for the broader `openai/* embedded run hangs until timeout` symptom reported in #76174 (and possibly related to #75824).

On OpenClaw `2026.4.29 (a448042)`, embedded OpenAI Responses streaming can stall because the SSE sanitization layer appears to starve or corrupt the stream before the OpenAI SDK parser sees the downstream events.

A clean repro on macOS succeeds immediately when `sanitizeOpenAISdkSseResponse()` is bypassed for one run, which strongly isolates the timeout to that sanitizer path rather than general OpenAI connectivity, auth, or the embedded runner itself.

## Environment

- OpenClaw `2026.4.29 (a448042)`
- macOS arm64
- Node `24.13.1`
- Provider: OpenAI direct API key path
- Models observed failing in embedded runs before the bypass: `openai/gpt-5.4`, `openai/gpt-5.3-codex`, `openai/gpt-4o-mini`

## What we observed before the bypass

For embedded runs using the OpenAI Responses path:

- request is dispatched successfully
- OpenAI returns `200 OK`
- content type is `text/event-stream; charset=utf-8`
- response body is present
- the embedded runner reaches `stream-ready`
- a first event of type `start` is observed
- then no usable downstream reply events arrive and the run eventually hits the idle watchdog / embedded timeout

In other words, this is **not** a simple `fetch never returned` failure.

## Critical isolation result

I patched the installed bundle temporarily so `sanitizeOpenAISdkSseResponse()` can be bypassed with an env var for a single run:

```bash
OPENCLAW_BYPASS_OPENAI_SDK_SSE_SANITIZE=1
```

Then I ran a fresh unique-session repro:

```bash
OPENCLAW_DISABLE_BONJOUR=1 \
OPENCLAW_BYPASS_OPENAI_SDK_SSE_SANITIZE=1 \
openclaw agent --local --agent main \
  --session-id diag-bypass-sse-1777763671 \
  --model openai/gpt-4o-mini \
  --message "Reply with exactly: hello" \
  --json --timeout 90
```

That run completed successfully and returned:

```text
hello
```

Total duration was ~30s, no fallback used.

## Why this points at the sanitizer

With the bypass enabled, the downstream SSE/event flow looks normal again. The run emits/observes events like:

- `response.created`
- `response.in_progress`
- `response.output_item.added`
- `response.content_part.added`
- `response.output_text.delta`
- `response.output_text.done`
- `response.completed`

Before the bypass, the stream reached the initial startup boundary but stalled before those downstream messages materialized.

## Relevant diagnostic evidence

Before bypass:

```text
[diag:fetch-sse] guarded-fetch-response provider=openai model=gpt-4o-mini status=200 contentType=text/event-stream; charset=utf-8 bodyPresent=true url=https://api.openai.com/v1/responses
[diag:fetch-sse] sanitize-start status=200 contentType=text/event-stream; charset=utf-8 url=https://api.openai.com/v1/responses
[diag:fetch-sse] sanitize-pull index=1 done=false bytes=492
[diag:openai-sdk] parse-response-stream status=200 contentType=text/event-stream; charset=utf-8 bodyPresent=true
[diag:embedded-stream] first-event provider=openai model=gpt-4o-mini api=openai-responses elapsedMs=1642 type=start
[diag:fetch-sse] sanitize-pull index=2 done=false bytes=1369
...
embedded run timeout
```

After bypass:

```text
[diag:fetch-sse] sanitize-bypassed status=200 contentType=text/event-stream; charset=utf-8 url=https://api.openai.com/v1/responses
[diag:openai-sdk] sse-message index=5 event=response.output_text.delta dataPrefix={"type":"response.output_text.delta",..."delta":"hello"...}
[openai-transport] [diag:openai-responses] text-delta provider=openai model=gpt-4o-mini api=openai-responses contentIndex=0 deltaLen=5 totalLen=5
[openai-transport] [diag:openai-responses] response-completed provider=openai model=gpt-4o-mini api=openai-responses status=completed stopReason=stop contentBlocks=1
```

## Suspect area

The problematic layer appears to be the SSE sanitation/filtering path around the function:

- `sanitizeOpenAISdkSseResponse(response)`

That layer sits between the raw `text/event-stream` HTTP response and the OpenAI SDK SSE parser.

## Expected behavior

OpenClaw should preserve valid OpenAI Responses SSE frames so embedded OpenAI runs can stream normally without requiring a bypass.

## Actual behavior

The sanitizer path appears to interfere with or starve the SSE stream, causing embedded runs to hang until timeout.

## Workaround

A temporary bundle-local env-gated bypass of the sanitizer restored correct behavior immediately:

```bash
OPENCLAW_BYPASS_OPENAI_SDK_SSE_SANITIZE=1
```

## Request

Could maintainers take a look at the SSE sanitizer path for the OpenAI Responses stream on `2026.4.29`?

This seems like a concrete root-cause lead for the broader `openai/* embedded runs hang until timeout` reports such as #76174.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[Bug]: OpenAI Responses SSE sanitizer appears to starve embedded streams; bypass restores normal output #76305

Summary

Environment

What we observed before the bypass

Critical isolation result

Why this points at the sanitizer

Relevant diagnostic evidence

Suspect area

Expected behavior

Actual behavior

Workaround

Request

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Uh oh!

[Bug]: OpenAI Responses SSE sanitizer appears to starve embedded streams; bypass restores normal output #76305

Description

Summary

Environment

What we observed before the bypass

Critical isolation result

Why this points at the sanitizer

Relevant diagnostic evidence

Suspect area

Expected behavior

Actual behavior

Workaround

Request

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions