openai-codex / gpt-5.5 repeatedly produces no first byte after #31967/#32016

## What happened

After the recent Codex timeout fixes, `openai-codex` / `gpt-5.5` still frequently stalls before emitting the first stream event.

This is not the old `context=~0 tokens` / 300s stale-timeout behavior. My local checkout already includes the merged fixes from #31967 and #32016. Hermes now detects the stall earlier via the Codex TTFB watchdog, but the underlying request still often accepts the connection and emits no stream events.

Typical user-facing output:

```text
No first byte from provider in 45s (codex stream, model: gpt-5.5). Reconnecting.
⏳ Retrying in 2.9s (attempt 1/3)...
⏳ Still working... (2 min elapsed — iteration 8/9999, receiving stream response)
⚠️ No first byte from provider in 45s (codex stream, model: gpt-5.5). Reconnecting.
⏳ Retrying in 2.3s (attempt 1/3)...
⏳ Retrying in 2.1s (attempt 1/3)...
⚠️ No first byte from provider in 45s (codex stream, model: gpt-5.5). Reconnecting.
```

In one scheduled gateway run, the same failure family progressed like this:

```text
Codex stream produced no bytes within TTFB cutoff (45s > 45s, model=gpt-5.5)
Non-streaming API call stale for 90s (threshold 90s). model=gpt-5.5 context=~45,779 tokens
Non-streaming API call stale for 90s (threshold 90s). model=gpt-5.5 context=~45,779 tokens
Non-streaming API call stale for 90s (threshold 90s). model=gpt-5.5 context=~45,779 tokens
API call failed after 3 retries. [Errno 32] Broken pipe
Scheduled job failed: RuntimeError: [Errno 32] Broken pipe
```

## Expected behavior

`openai-codex` / `gpt-5.5` should either:

- emit a first stream event normally,
- return a structured error if the backend rejects the request, or
- fail in a way that allows Hermes to cleanly trigger fallback/retry without repeated first-byte stalls.

## Actual behavior

The backend frequently accepts the request but emits no stream events for 45s. Hermes kills the request and retries, but the same first-byte stall can repeat multiple times in a single turn.

When the retry eventually reaches the stale-call detector, the request can end with `ReadError: [Errno 32] Broken pipe` after Hermes closes the stalled connection.

## Environment

- Hermes checkout: local `main`
- Local checkout includes:
  - #31967 (`fix(codex): size and propagate timeouts for Responses-API requests; lower stale defaults`)
  - #32016 (`fix(codex): surface actionable hint when stale-call detector fires on known silent-reject pattern`)
- Platform: macOS, launchd gateway
- Provider: `openai-codex`
- Model: `gpt-5.5`
- Base URL: `https://chatgpt.com/backend-api/codex`
- Affected surfaces: gateway sessions and scheduled agent runs
- `fallback_providers`: none on the affected profile(s)

## Why this seems distinct from the fixed issue

#31967 fixed the timeout accounting bug and lowered the default stale timeout from 300s to 90s. That fix appears to be active: logs now show realistic context estimates such as `context=~45,779 tokens`, not `context=~0 tokens`.

#32016 also appears active: the checkout contains the gpt-5.5 Codex silent-hang hint path.

The remaining issue is earlier in the request lifecycle: the Codex Responses stream often produces no first byte at all, so the 45s TTFB watchdog fires repeatedly.

## Related issues / PRs

- Related to #21444
- Related to #32370
- #31967 is already present locally
- #32016 is already present locally
- Possibly related to #29740, which proposes sanitizing `reasoning`, `include`, and `store` for `gpt-5.5` on the Codex backend

## Notes

This may still be backend-side, but it is severe enough in gateway/scheduled usage that the current retry behavior produces repeated stalls and occasional job failure. A clean structured backend rejection, a payload sanitizer, or a more reliable fallback trigger would make this much easier to operate.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

openai-codex / gpt-5.5 repeatedly produces no first byte after #31967/#32016 #32373

What happened

Expected behavior

Actual behavior

Environment

Why this seems distinct from the fixed issue

Related issues / PRs

Notes

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

openai-codex / gpt-5.5 repeatedly produces no first byte after #31967/#32016 #32373

Description

What happened

Expected behavior

Actual behavior

Environment

Why this seems distinct from the fixed issue

Related issues / PRs

Notes

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions