[Bug]: HTTP 429 from GitHub Copilot is still misclassified as idle timeout on 2026.4.27 #75332

### Bug type

Regression (or incomplete fix after #71120)

### Beta release blocker

No

### Summary

GitHub Copilot `/responses` returns HTTP 429 for an exhausted weekly quota in under 1 second, but OpenClaw `2026.4.27` still does not surface the provider HTTP error promptly.

Instead, the model stream produces no assistant tokens and is eventually terminated by the LLM idle watchdog after the full default idle timeout:

```text
LLM idle timeout (120s): no response from model
```

The run is tagged `timedOut: true` and `idleTimedOut: true`, and the harness then automatically retries the same prompt once. The second attempt also waits the full 120 seconds. End to end, the user sees roughly 4 minutes of latency for an upstream HTTP 429 that the provider returns in under 1 second.

This appears related to the previously fixed #71120, but the current symptom is different:

- #71120: 429 caused a 10-minute silent hang on `2026.4.22`.
- This issue: 429 is misclassified as `idleTimedOut` twice on `2026.4.27`.

### Steps to reproduce

1. Use OpenClaw `2026.4.27` with GitHub Copilot as the provider and `gpt-5.4` through the `openai-responses` API.
2. Exhaust the Copilot weekly quota so that `/responses` returns HTTP 429.
3. Send a simple inbound channel message to an agent using `github-copilot/gpt-5.4`.
4. Observe the trajectory events and user-visible response timing.

### Expected behavior

- OpenClaw should detect the provider HTTP 429 within a few seconds.
- The trajectory should preserve the HTTP status and provider error body, for example as `model.failed` or an equivalent provider/protocol error.
- The run should not be classified as `idleTimedOut`.
- 4xx provider errors, especially 429 quota exhaustion, should not be auto-retried as if the model produced no response.
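
The last two points could be expressed as a small retry predicate. This is only a sketch of the expected behavior: `RunFailure` and `shouldAutoRetry` are hypothetical names, not OpenClaw's actual types, and treating 5xx responses as retryable is my own assumption rather than documented policy.

```typescript
// Hypothetical failure shapes for illustration; OpenClaw's real types may differ.
type RunFailure =
  | { kind: "provider_http"; status: number; body: string }
  | { kind: "idle_timeout"; timeoutMs: number };

// Decide whether the harness should immediately re-send the same prompt.
function shouldAutoRetry(failure: RunFailure): boolean {
  if (failure.kind === "idle_timeout") {
    // A genuine "model produced nothing" case: one retry is reasonable.
    return true;
  }
  // Assumption: 5xx is treated as transient. Any 4xx, including a 429 for
  // exhausted quota, is surfaced to the user instead of being retried.
  return failure.status >= 500;
}

// Example: the quota-exhausted response from this report.
const quotaExhausted: RunFailure = {
  kind: "provider_http",
  status: 429,
  body: "Sorry, you've exceeded your weekly rate limit.",
};
console.log(shouldAutoRetry(quotaExhausted));
```

With a classification like this, the 429 in this report would fail once, with its status and body intact, instead of being retried as an idle timeout.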

### Actual behavior

A single inbound Feishu message (`om_x100b500b746598acb10a6bfe6f48cad`, text `你好`, i.e. "hello") produced two 120-second idle timeouts:

| Time (UTC) | Event |
|---|---|
| `00:28:27` | Message arrives |
| `00:28:34.653` | `prompt.submitted` for session 1 |
| `00:30:34.663` | `model.completed`, exactly 120.010s later |
| `00:30:41.152` | `prompt.submitted` for session 2, same run retried |
| `00:32:41.159` | `model.completed`, exactly 120.007s later |

Both `model.completed` events include:

```json
{
  "aborted": true,
  "externalAbort": false,
  "timedOut": true,
  "idleTimedOut": true,
  "timedOutDuringCompaction": false,
  "promptError": "LLM idle timeout (120s): no response from model | LLM idle timeout (120s): no response from model",
  "promptErrorSource": "prompt",
  "assistantTexts": [],
  "provider": "github-copilot",
  "modelId": "gpt-5.4",
  "modelApi": "openai-responses"
}
```

The duplicated `promptError` string suggests the idle-timeout error may also be concatenated across an internal retry or bookkeeping layer.

### OpenClaw version

`2026.4.27`

### Operating system

Ubuntu / Linux

### Install method

npm/global install

### Model

`github-copilot/gpt-5.4`

### Provider / routing chain

`github-copilot` -> `gpt-5.4` (`openai-responses`)

### Additional provider/model setup details

The Copilot token itself was healthy:

- `GET /models` returned HTTP 200 with 37 models in about 1 second.
- The token had just been refreshed and had about 28 minutes until expiry.

### Logs, screenshots, and evidence

Direct reproduction using the same cached Copilot token against the same endpoint the adapter uses:

```bash
curl -sS -N -o /tmp/out -w "HTTP %{http_code} total=%{time_total}s\n" \
  --max-time 15 \
  -H "Authorization: Bearer $TOK" \
  -H "Accept: text/event-stream" \
  -H "Content-Type: application/json" \
  -H "Copilot-Integration-Id: vscode-chat" \
  -H "Editor-Version: vscode/1.99.0" \
  -H "OpenAI-Intent: conversation-panel" \
  https://api.githubcopilot.com/responses \
  --data '{"model":"gpt-5.4","input":"ping","stream":true}'
```

Observed:

```text
HTTP 429 total=0.968s
Sorry, you've exceeded your weekly rate limit. Please review our [Terms of Service](https://docs.github.com/en/site-policy/github-terms/github-terms-of-service).
```

For comparison, a non-streaming `/chat/completions` request also returns immediately:

```bash
curl -sS -o /tmp/out -w "HTTP %{http_code} total=%{time_total}s\n" \
  --max-time 130 \
  -H "Authorization: Bearer $TOK" \
  -H "Content-Type: application/json" \
  -H "Copilot-Integration-Id: vscode-chat" \
  -H "Editor-Version: vscode/1.99.0" \
  -H "OpenAI-Intent: conversation-panel" \
  https://api.githubcopilot.com/chat/completions \
  --data '{"model":"gpt-5.4","messages":[{"role":"user","content":"ping"}],"stream":false}'
```

Observed:

```text
HTTP 429 total=0.856s
Sorry, you've exceeded your weekly rate limit. Please review our [Terms of Service](https://docs.github.com/en/site-policy/github-terms/github-terms-of-service).
```

The provider response is plain text, not SSE.

### Impact and severity

- An immediate upstream provider error is reported as `timedOut` + `idleTimedOut`, which is misleading during debugging.
- User-visible latency becomes two full idle-timeout windows, roughly 4 minutes by default, instead of about 1 second.
- Retrying an exhausted-quota 429 is counterproductive and may worsen rate-limit pressure.
- The duplicated `promptError` makes the trajectory harder to interpret.

### Additional information

Installed bundle pointers from the affected environment:

- `dist/selection-8xKkwZC_.js`: `streamWithIdleTimeout`, `resolveLlmIdleTimeoutMs`, `DEFAULT_LLM_IDLE_TIMEOUT_MS = 120_000`.
- `dist/plugin-sdk/src/agents/pi-embedded-runner/run/llm-idle-timeout.d.ts`: idle-timeout declarations.
- `dist/extensions/github-copilot/stream.js` and `dist/stream-B6dB0lb1.js`: Copilot provider wrapper.

My reading of the failure mode is:

1. The provider returns HTTP 429 with a non-SSE plain-text body.
2. The streaming layer returns an async stream object, but its iterator neither yields nor throws the provider HTTP error.
3. `streamWithIdleTimeout` has no HTTP status visibility, so it waits 120 seconds and throws the generic idle-timeout error.
4. The retry layer treats this as a timeout and retries the same exhausted-quota call.

The minimum expected fix is for non-2xx provider responses to be converted into a typed provider/protocol error before the stream reaches the idle watchdog. HTTP status and body should be preserved in the trajectory, and 4xx responses should not be retried as idle timeouts.
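
To make the suggested fix concrete, here is a minimal sketch of a gate that inspects the response head before any SSE parsing or idle watchdog is involved. Everything in it is hypothetical: `ResponseHead`, `StreamGate`, and `gateProviderResponse` are names I made up for illustration, and I have not checked how the real Copilot wrapper exposes the status, headers, or body.

```typescript
// Hypothetical shapes for illustration; not OpenClaw's actual types.
interface ResponseHead {
  status: number;
  contentType: string; // Content-Type header value, "" if absent
  body: string; // body text for error responses; leave "" for a 2xx stream
}

type StreamGate =
  | { ok: true }
  | {
      ok: false;
      kind: "provider_http" | "provider_protocol";
      status: number;
      body: string;
    };

// Run this before handing the response to the SSE parser and idle watchdog.
function gateProviderResponse(head: ResponseHead): StreamGate {
  // Non-2xx: preserve status and body as a typed provider error.
  if (head.status < 200 || head.status >= 300) {
    return {
      ok: false,
      kind: "provider_http",
      status: head.status,
      body: head.body,
    };
  }
  // 2xx but not an event stream: a protocol error, not a silent stream.
  if (!head.contentType.toLowerCase().startsWith("text/event-stream")) {
    return {
      ok: false,
      kind: "provider_protocol",
      status: head.status,
      body: head.body,
    };
  }
  return { ok: true };
}

// Example: the plain-text 429 from the curl reproduction.
// The "text/plain" header value is an assumption; only the body format was observed.
const gated = gateProviderResponse({
  status: 429,
  contentType: "text/plain",
  body: "Sorry, you've exceeded your weekly rate limit.",
});
console.log(gated);
```

The point of gating on the head is that the watchdog only ever sees streams that can actually yield events, so a fast 4xx can never be reported as `idleTimedOut`.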
