[Bug]: HTTP 429 from GitHub Copilot is still misclassified as idle timeout on 2026.4.27 #75332

@zzl360

Bug type

Regression (or incomplete fix after #71120)

Beta release blocker

No

Summary

GitHub Copilot /responses returns HTTP 429 for an exhausted weekly quota in under 1 second, but OpenClaw 2026.4.27 still does not surface the provider HTTP error promptly.

Instead, the model stream produces no assistant tokens and is eventually terminated by the LLM idle watchdog after the full default idle timeout:

LLM idle timeout (120s): no response from model

The run is tagged as timedOut: true and idleTimedOut: true, then the harness automatically retries the same prompt once. The second attempt also waits 120 seconds. End-to-end, the user sees roughly 4 minutes of latency for an upstream HTTP 429 that is available immediately.

This appears related to the previously fixed #71120, but the current symptom is different.

Steps to reproduce

  1. Use OpenClaw 2026.4.27 with GitHub Copilot as the provider and gpt-5.4 through the openai-responses API.
  2. Exhaust the Copilot weekly quota so that /responses returns HTTP 429.
  3. Send a simple inbound channel message to an agent using github-copilot/gpt-5.4.
  4. Observe the trajectory events and user-visible response timing.

Expected behavior

  • OpenClaw should detect the provider HTTP 429 within a few seconds.
  • The trajectory should preserve the HTTP status and provider error body, for example as model.failed or an equivalent provider/protocol error.
  • The run should not be classified as idleTimedOut.
  • 4xx provider errors, especially 429 quota exhaustion, should not be auto-retried as if the model produced no response.
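The last expectation can be sketched as a small classification step. This is a minimal sketch under assumptions: `RunFailure`, `RetryDecision`, and `decideRetry` are hypothetical names, not OpenClaw APIs, and treating 5xx responses as retryable is an illustrative choice rather than something this report establishes.

```typescript
// Hypothetical failure shapes; the names are illustrative, not OpenClaw's.
type RunFailure =
  | { kind: "idle-timeout"; timeoutMs: number }
  | { kind: "provider-http"; status: number; body: string };

interface RetryDecision {
  retry: boolean;
  reason: string;
}

function decideRetry(failure: RunFailure): RetryDecision {
  if (failure.kind === "idle-timeout") {
    // A genuinely silent stream may be transient, so one retry is plausible.
    return { retry: true, reason: "model produced no response" };
  }
  if (failure.status >= 400 && failure.status < 500) {
    // 4xx errors, including 429 quota exhaustion, are not fixed by
    // immediately resending the same prompt.
    return { retry: false, reason: `provider returned HTTP ${failure.status}` };
  }
  // Assumption of this sketch: everything else (for example 5xx) stays retryable.
  return { retry: true, reason: `provider returned HTTP ${failure.status}` };
}

console.log(decideRetry({ kind: "provider-http", status: 429, body: "quota exceeded" }));
```

The point of the design is that the retry decision depends on a typed failure, so a quota error can never be mistaken for a silent model.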

Actual behavior

A single inbound Feishu message (om_x100b500b746598acb10a6bfe6f48cad, text 你好, "hello") produced two 120-second idle timeouts:

Time (UTC)      Event
------------    ------------------------------------------------
00:28:27        Message arrives
00:28:34.653    prompt.submitted for session 1
00:30:34.663    model.completed, exactly 120.010s later
00:30:41.152    prompt.submitted for session 2, same run retried
00:32:41.159    model.completed, exactly 120.007s later
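
As a sanity check on these timestamps, the arithmetic can be reproduced in a few lines of TypeScript. `toMs` is a hypothetical helper written only for this illustration.

```typescript
// Hypothetical helper: convert "HH:MM:SS(.mmm)" into milliseconds since midnight.
function toMs(clock: string): number {
  const parts = clock.split(":").map(Number);
  const h = parts[0] ?? 0;
  const m = parts[1] ?? 0;
  const s = parts[2] ?? 0;
  return Math.round(((h * 60 + m) * 60 + s) * 1000);
}

const attempt1Ms = toMs("00:30:34.663") - toMs("00:28:34.653"); // 120010
const retryGapMs = toMs("00:30:41.152") - toMs("00:30:34.663"); // 6489
const attempt2Ms = toMs("00:32:41.159") - toMs("00:30:41.152"); // 120007
const totalMs = toMs("00:32:41.159") - toMs("00:28:27"); // 254159

console.log({ attempt1Ms, retryGapMs, attempt2Ms, totalMs });
```

The total from message arrival to the second model.completed is 254.159 s, about 4 minutes 14 seconds, of which all but roughly 14 seconds is spent inside the two idle-timeout windows.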

Both model.completed events include:

{
  "aborted": true,
  "externalAbort": false,
  "timedOut": true,
  "idleTimedOut": true,
  "timedOutDuringCompaction": false,
  "promptError": "LLM idle timeout (120s): no response from model | LLM idle timeout (120s): no response from model",
  "promptErrorSource": "prompt",
  "assistantTexts": [],
  "provider": "github-copilot",
  "modelId": "gpt-5.4",
  "modelApi": "openai-responses"
}

The duplicated promptError string suggests the idle-timeout error may also be concatenated across an internal retry or bookkeeping layer.

OpenClaw version

2026.4.27

Operating system

Ubuntu / Linux

Install method

npm/global install

Model

github-copilot/gpt-5.4

Provider / routing chain

github-copilot -> gpt-5.4 (openai-responses)

Additional provider/model setup details

The Copilot token itself was healthy:

  • GET /models returned HTTP 200 with 37 models in about 1 second.
  • The token had just been refreshed and had about 28 minutes until expiry.

Logs, screenshots, and evidence

Direct reproduction using the same cached Copilot token against the same endpoint the adapter uses:

curl -sS -N -o /tmp/out -w "HTTP %{http_code} total=%{time_total}s\n" \
  --max-time 15 \
  -H "Authorization: Bearer $TOK" \
  -H "Accept: text/event-stream" \
  -H "Content-Type: application/json" \
  -H "Copilot-Integration-Id: vscode-chat" \
  -H "Editor-Version: vscode/1.99.0" \
  -H "OpenAI-Intent: conversation-panel" \
  https://api.githubcopilot.com/responses \
  --data '{"model":"gpt-5.4","input":"ping","stream":true}'

Observed:

HTTP 429 total=0.968s
Sorry, you've exceeded your weekly rate limit. Please review our [Terms of Service](https://docs.github.com/en/site-policy/github-terms/github-terms-of-service).

For comparison, the non-streaming /chat/completions endpoint also returns the 429 immediately:

curl -sS -o /tmp/out -w "HTTP %{http_code} total=%{time_total}s\n" \
  --max-time 130 \
  -H "Authorization: Bearer $TOK" \
  -H "Content-Type: application/json" \
  -H "Copilot-Integration-Id: vscode-chat" \
  -H "Editor-Version: vscode/1.99.0" \
  -H "OpenAI-Intent: conversation-panel" \
  https://api.githubcopilot.com/chat/completions \
  --data '{"model":"gpt-5.4","messages":[{"role":"user","content":"ping"}],"stream":false}'

Observed:

HTTP 429 total=0.856s
Sorry, you've exceeded your weekly rate limit. Please review our [Terms of Service](https://docs.github.com/en/site-policy/github-terms/github-terms-of-service).

The provider response is plain text, not SSE.

Impact and severity

  • An immediate upstream provider error is reported as timedOut + idleTimedOut, which is misleading during debugging.
  • User-visible latency becomes two full idle-timeout windows, roughly 4 minutes by default, instead of about 1 second.
  • Retrying an exhausted-quota 429 is counterproductive and may worsen rate-limit pressure.
  • The duplicated promptError makes the trajectory harder to interpret.

Additional information

Installed bundle pointers from the affected environment:

  • dist/selection-8xKkwZC_.js: streamWithIdleTimeout, resolveLlmIdleTimeoutMs, DEFAULT_LLM_IDLE_TIMEOUT_MS = 120_000.
  • dist/plugin-sdk/src/agents/pi-embedded-runner/run/llm-idle-timeout.d.ts: idle-timeout declarations.
  • dist/extensions/github-copilot/stream.js and dist/stream-B6dB0lb1.js: Copilot provider wrapper.

My reading of the failure mode is:

  1. The provider returns HTTP 429 with a non-SSE plain-text body.
  2. The streaming layer returns an async stream object, but its iterator neither yields nor throws the provider HTTP error.
  3. streamWithIdleTimeout has no HTTP status visibility, so it waits 120 seconds and throws the generic idle-timeout error.
  4. The retry layer treats this as a timeout and retries the same exhausted-quota call.
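
Steps 2 and 3 can be reproduced in isolation. The sketch below is a model built on assumptions, not the bundled implementation: `withIdleTimeout` is a hypothetical stand-in for `streamWithIdleTimeout`, and `silentStream` imitates a stream whose iterator neither yields nor throws. Only the error message format is taken from the log line in this report.

```typescript
// Hypothetical idle watchdog: fails if the source produces nothing within idleMs.
async function* withIdleTimeout<T>(
  source: AsyncIterable<T>,
  idleMs: number,
): AsyncGenerator<T> {
  const iterator = source[Symbol.asyncIterator]();
  while (true) {
    let timer: ReturnType<typeof setTimeout> | undefined;
    const timeout = new Promise<never>((_resolve, reject) => {
      timer = setTimeout(
        () => reject(new Error(`LLM idle timeout (${idleMs / 1000}s): no response from model`)),
        idleMs,
      );
    });
    // Whichever settles first wins; the timer is cleared either way.
    const result = await Promise.race([iterator.next(), timeout]).finally(() =>
      clearTimeout(timer),
    );
    if (result.done) return;
    yield result.value;
  }
}

// Models step 2: next() returns a promise that never settles.
const silentStream: AsyncIterable<string> = {
  [Symbol.asyncIterator]() {
    return { next: () => new Promise<IteratorResult<string>>(() => {}) };
  },
};

// Short window for demonstration; the report's default is 120_000 ms.
(async () => {
  try {
    for await (const chunk of withIdleTimeout(silentStream, 50)) console.log(chunk);
  } catch (err) {
    console.log(err instanceof Error ? err.message : String(err));
  }
})();
```

Nothing in this loop can observe an HTTP status, so with a 120-second window it would wait the full two minutes even if the 429 had arrived in under a second.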

The minimum expected fix is for non-2xx provider responses to be converted into a typed provider/protocol error before the stream reaches the idle watchdog. HTTP status and body should be preserved in the trajectory, and 4xx responses should not be retried as idle timeouts.
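
A minimal sketch of that conversion, assuming the adapter can inspect a Fetch-style response before it builds the stream. `ProviderHttpError`, `ResponseLike`, and `ensureOk` are hypothetical names introduced for illustration; the actual hook point inside the Copilot wrapper is not known from this report.

```typescript
// Hypothetical typed error carrying what the trajectory should preserve.
class ProviderHttpError extends Error {
  readonly status: number;
  readonly body: string;

  constructor(status: number, body: string) {
    super(`provider returned HTTP ${status}: ${body.slice(0, 200)}`);
    this.name = "ProviderHttpError";
    this.status = status;
    this.body = body;
  }
}

// Structural subset of a Fetch-style response; an assumption of this sketch.
interface ResponseLike {
  ok: boolean;
  status: number;
  text(): Promise<string>;
}

// Reject non-2xx responses before any SSE parsing or idle watchdog is involved.
async function ensureOk<R extends ResponseLike>(res: R): Promise<R> {
  if (!res.ok) {
    throw new ProviderHttpError(res.status, await res.text());
  }
  return res;
}

// Usage with a fake response shaped like the 429 observed above.
const quotaResponse: ResponseLike = {
  ok: false,
  status: 429,
  text: async () => "Sorry, you've exceeded your weekly rate limit.",
};
ensureOk(quotaResponse).catch((err: unknown) => {
  console.log(err instanceof Error ? err.message : String(err));
});
```

Because the error is thrown before the stream exists, a watchdog downstream never starts its timer for this request, and the status and body remain available to whatever records the trajectory.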
