What happened?
When a provider (like opencode-go or other OpenAI-compatible endpoints) returns a 429 Rate Limit error (e.g., "5-hour usage limit reached") and drops the connection abruptly, the agent hangs indefinitely in a "Working" state instead of displaying the error.
This regression stems from commit 0852303 (fix(coding-agent): align undici fetch globals). Calling installUndiciGlobals() configures the global fetch with bodyTimeout: 0 (which is necessary for streaming). However, this causes response.text() inside the openai-node client to hang forever when the remote server closes the connection without a proper HTTP chunk terminator during an error response.
Additionally, AgentSession currently classifies hard quota limits (like exhausted balances) as retryable errors, which traps the agent in an infinite loop of hanging requests if the provider tarpits the retries.
Steps to reproduce
Configure an OpenAI-compatible provider (e.g., opencode-go).
Deliberately exhaust the API quota or balance so the provider returns a 429 Too Many Requests (specifically one that drops the connection abruptly).
Send a prompt to the agent.
The agent gets permanently stuck on "Working..." and never surfaces the actual API error in the UI.
Expected behavior
The agent should enforce a hard timeout on error response body extraction to prevent the stream parser from hanging. It should immediately surface the 429 error message (e.g., "Error: 429 5-hour usage limit reached") in the TUI, and it should not auto-retry hard quota or balance exhaustion limits.
Version
0.75.1
What happened?
When a provider (like opencode-go or other OpenAI-compatible endpoints) returns a 429 Rate Limit error (e.g., "5-hour usage limit reached") and drops the connection abruptly, the agent hangs indefinitely in a "Working" state instead of displaying the error.
This regression stems from commit 0852303 (fix(coding-agent): align undici fetch globals). Calling installUndiciGlobals() configures the global fetch with bodyTimeout: 0 (which is necessary for streaming). However, this causes response.text() inside the openai-node client to hang forever when the remote server closes the connection without a proper HTTP chunk terminator during an error response.
Additionally, AgentSession currently classifies hard quota limits (like exhausted balances) as retryable errors, which traps the agent in an infinite loop of hanging requests if the provider tarpits the retries.
Steps to reproduce
Configure an OpenAI-compatible provider (e.g., opencode-go).
Deliberately exhaust the API quota or balance so the provider returns a 429 Too Many Requests (specifically one that drops the connection abruptly).
Send a prompt to the agent.
The agent gets permanently stuck on "Working..." and never surfaces the actual API error in the UI.
Expected behavior
The agent should enforce a hard timeout on error response body extraction to prevent the stream parser from hanging. It should immediately surface the 429 error message (e.g., "Error: 429 5-hour usage limit reached") in the TUI, and it should not auto-retry hard quota or balance exhaustion limits.
Version
0.75.1