[Bug] Stuck '思考中' ("Thinking") state when provider/model is misconfigured (no connect timeout) #554

### What happened?

When a session is submitted against an invalid provider/model combination, the request silently fails to produce any stream events. Concrete repro: `providerID: alibaba-coding-plan-cn` + `modelID: glm-5`. The Alibaba coding plan gateway serves Qwen-family models and GLM is a Zhipu product, so the pair has no valid upstream route. The assistant message is created with empty `parts`, no `time.completed`, and no `info.error`, and no `llm_trace` is recorded. In the UI the session stays at "思考中" ("Thinking") indefinitely while `session_status` remains `busy`. Users perceive this as the model erroring out, but nothing in the UI lets them recover: they can't tell whether the request is still loading or stuck, and they can't cancel it with a normal Stop press if the composer is hidden behind the busy spinner.

### Steps to reproduce

1. Configure a provider/model pair that resolves locally but is rejected by the upstream gateway (e.g. add a `glm-5` modelID under the `alibaba-coding-plan-cn` provider).
2. Submit any prompt with that pair selected.
3. Observe: the composer flips to busy, "思考中" ("Thinking") appears, and it stays that way forever. No error toast, no error message body, no provider error event.

### Expected

If the first stream event doesn't arrive within a reasonable connect timeout (e.g. 30 s), the session should mark the assistant message as `error` with a clear provider-rejected reason and flip `session_status` back to `idle`, so the user can switch model or retry.

### Diagnostics

In the attached session export (`pawwork-session-stellar-pixel-2026-05-11-10-53-32.json`) the second user submit at 10:50:57.589 used `provider: alibaba-coding-plan-cn` + `model: glm-5`. The corresponding assistant message (`msg_e16a958f70017yxOn0NXKAT7ut`) has:

- `time.created: 1778496657655` but no `time.completed`
- `parts: []` — no `step-start`, `reasoning`, `text`, or `tool` parts
- no `info.error` field
- no `diagnostics.llm_trace` recorded (compare against the GPT message in the same session, which has a full trace including `aborted: true`)
- export captured at 10:53:32 — 2 m 35 s after submit, still no resolution
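
The symptoms listed above can be turned into a simple predicate, which may help when scanning session exports for the same failure. This is a minimal sketch: the `AssistantMessage` shape and the `looksStuck` helper are hypothetical and model only the fields cited in this issue, so the real export schema may nest them differently.

```typescript
// Hypothetical shape: only the fields cited in this issue are modeled.
interface AssistantMessage {
  time: { created: number; completed?: number }
  parts: unknown[]
  info?: { error?: unknown }
  diagnostics?: { llm_trace?: unknown }
}

// True when the message has neither finished, errored, nor produced any
// output, and more than `budgetMs` has elapsed since it was created.
function looksStuck(msg: AssistantMessage, nowMs: number, budgetMs: number): boolean {
  const finished = msg.time.completed !== undefined
  const errored = msg.info?.error !== undefined
  const hasOutput = msg.parts.length > 0 || msg.diagnostics?.llm_trace !== undefined
  return !finished && !errored && !hasOutput && nowMs - msg.time.created > budgetMs
}

// Values taken from the export described above.
const stuck: AssistantMessage = { time: { created: 1778496657655 }, parts: [] }
const exportTimeMs = stuck.time.created + 155000 // roughly 2 m 35 s after creation

console.log(looksStuck(stuck, exportTimeMs, 30000)) // prints true
```

The 30 s budget here is only the example value from the Expected section, not a measured threshold.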

`runtime_context.model_refs` lists `alibaba-coding-plan-cn/glm-5` as `resolved: true`, but `resolved: true` only means the local config layer registered the entry, not that the upstream gateway accepts the model id. There is no upstream validation gate before the first real request.

The existing `SILENT_STREAM_TIMEOUT_MS` (`packages/opencode/src/session/llm.ts:30`, default 10 min) was designed for "stream stalled after producing events" — it doesn't fire when the stream never produces any event at all, which is the failure mode here.

### Suggested fixes

1. **Connect-timeout watchdog** in `packages/opencode/src/session/llm.ts`. Separate from `SILENT_STREAM_TIMEOUT_MS`, arm a shorter timer (e.g. 30 s) that fires if no provider event has arrived since dispatch. On fire, write an `APIError("provider did not respond")` onto the assistant message and flip status to idle.
2. **Boot-time validation** of provider/model pairs in `runtime_context.model_refs`. `resolved: true` should require at least a successful capabilities probe, or be downgraded to `registered: true` with `validated: false` so an invalid pair surfaces before the user submits a real prompt against it.
3. **Status reconcile**: if no `llm_trace` and no message parts are written within a wall-clock budget, the `session_status` watchdog should release the `busy` lock so the composer doesn't trap the user in "思考中" ("Thinking").
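
As a rough illustration of fix 1, the sketch below races the first stream event against a timer and always releases the busy state afterwards. It is not based on the actual code in `llm.ts`: `withFirstEventTimeout`, `consume`, and `SessionState` are hypothetical names, the stream is assumed to be an `AsyncIterable`, and a plain `Error` stands in for the suggested `APIError`.

```typescript
// Yields events from `source`, but rejects if the FIRST event takes longer
// than `ms`. Later gaps between events are not policed here; that is the job
// of a separate stall timer such as SILENT_STREAM_TIMEOUT_MS.
async function* withFirstEventTimeout<T>(source: AsyncIterable<T>, ms: number): AsyncGenerator<T> {
  const it = source[Symbol.asyncIterator]()
  let timer: ReturnType<typeof setTimeout> | undefined
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => {
      const err = new Error(`provider did not respond within ${ms} ms`)
      err.name = "FirstEventTimeout"
      reject(err)
    }, ms)
  })
  // Whichever settles first wins; the timer is cleared either way.
  const first = await Promise.race([it.next(), timeout]).finally(() => clearTimeout(timer))
  if (first.done) return
  yield first.value
  // Once the first event has arrived, pass the rest of the stream through.
  while (true) {
    const next = await it.next()
    if (next.done) return
    yield next.value
  }
}

// Hypothetical session state, reduced to what this issue discusses.
interface SessionState { status: "busy" | "idle"; error?: string; events: string[] }

async function consume(stream: AsyncIterable<string>, ms: number): Promise<SessionState> {
  const state: SessionState = { status: "busy", events: [] }
  try {
    for await (const event of withFirstEventTimeout(stream, ms)) state.events.push(event)
  } catch (err) {
    state.error = err instanceof Error ? err.message : String(err)
  } finally {
    state.status = "idle" // always release the busy lock
  }
  return state
}

// Simulates a gateway that accepts the request but never emits an event.
async function* neverResponds(): AsyncGenerator<string> {
  await new Promise<void>(() => {})
}

consume(neverResponds(), 50).then((s) => console.log(s.status, s.error))
```

A real implementation would presumably also abort the underlying request when the timer fires (for example through an `AbortController`), which this sketch leaves out.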

### Environment

- PawWork version: local build (session export `runtime_context.app_version: "local"`)
- OS: macOS 15.x (Darwin 25.3.0)
- Reproducibility: Yes, every time (with the invalid provider/model combo)

