[Daemon] RFC: POST /prompt should be non-blocking — decouple trigger from completion

## Summary

The daemon's `POST /session/:id/prompt` endpoint currently holds the HTTP connection open until the entire agent turn completes (model inference + tool execution + multi-step agentic loop). This synchronous blocking design conflicts with common infrastructure timeout constraints and creates reliability issues in real-world deployments.

## Current Design

```
Client                              Daemon
  |                                   |
  |--- POST /prompt ----------------->|
  |         (connection held open)    |  ← model inference
  |                                   |  ← tool execution
  |                                   |  ← more model calls...
  |                                   |  ← could take 2-10+ minutes
  |<-- 200 { stopReason } -----------|
```

Meanwhile, real-time data (assistant text chunks, tool calls, tool output) is already delivered independently via the SSE `GET /session/:id/events` stream. The `/prompt` HTTP response only carries `{ stopReason }` — effectively just a completion signal.

## Problem

In HTTP-based deployments (web IDE, remote daemon access), the request passes through standard infrastructure layers (reverse proxies, ingress controllers, load balancers). These layers commonly enforce an idle or read timeout of roughly 60s on regular HTTP requests; nginx's `proxy_read_timeout`, for example, defaults to 60s. This is a widespread default, not a misconfiguration.

When an agent turn exceeds 60s:
- The intermediate proxy returns **504 Gateway Timeout** to the client
- The daemon continues executing normally (unaware of the disconnection)
- The client loses the `stopReason` completion signal
- There is **no alternative way** to learn that the turn has finished, because no `turn_complete` event exists in the SSE protocol

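SSE connections avoid this timeout in practice: heartbeat frames keep resetting the proxy's read timer, `X-Accel-Buffering: no` stops the proxy from buffering those frames, and deployments often give the stream a dedicated proxy config. Regular HTTP POST requests get none of this, and shouldn't need it.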

## Design Issue

The `/prompt` endpoint conflates two distinct responsibilities:

1. **Trigger** — "start processing this prompt" (validation, queueing)
2. **Await completion** — "tell me when the turn is done and why it stopped"

Responsibility #2 is already better served by the SSE channel, which:
- Has built-in reconnection and heartbeat mechanisms
- Is already used for all intermediate state delivery
- Survives proxy timeouts by design

## Reference: ACP Streamable HTTP already solves this

The ACP HTTP transport (`/acp`, [PR #4472](https://github.com/QwenLM/qwen-code/pull/4472), [RFD #721](https://github.com/agentclientprotocol/agent-client-protocol/pull/721)) has already adopted the non-blocking pattern:

```
POST /acp { session/prompt } → 202 (immediate, empty body)
GET  /acp (session-scoped)   ← SSE: session/update notifications
                             ← SSE: { id, result: { stop_reason } }  (completion)
```

This works because:
- POST takes <1s (no proxy timeout risk)
- SSE has 15s heartbeat + `X-Accel-Buffering: no` (proxies don't kill it)
- Completion signal travels via SSE alongside streaming data

**However**, ACP HTTP is still a draft proposal with an incomplete implementation. It should NOT be treated as the migration target today. Instead, this issue proposes applying the **same architectural pattern** to the existing REST API surface, so both transports can run independently side by side.

## Proposed Change

Apply the ACP-consistent non-blocking pattern to the existing REST API, without changing the URL surface:

### 1. Make `POST /session/:id/prompt` non-blocking

```
Client                              Daemon
  |                                   |
  |--- POST /prompt ----------------->|
  |<-- 202 { promptId } -------------|  ← immediate (< 1s)
  |                                   |
  |  (agent turn runs asynchronously) |
```

The endpoint validates the request, confirms the prompt is accepted, and returns immediately. Errors in prompt submission (invalid session, busy, malformed input) are still returned synchronously as 4xx.
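A minimal sketch of what such a handler could look like, independent of any HTTP framework. Every name here (`handlePrompt`, `Session`, `Publish`) is hypothetical and not taken from the daemon's code, and the specific status codes (404, 400, 409) are assumptions; the proposal only says "4xx". Including `promptId` in the published event is also an assumption.

```typescript
// Hypothetical types for illustration; not the daemon's actual API.
type HttpResponse = { status: number; body: Record<string, unknown> };

interface Session {
  busy: boolean;
  // Runs the full agent turn and resolves with the stop reason.
  runTurn(prompt: string): Promise<string>;
}

// Stand-in for publishing to the session's event stream.
type Publish = (sessionId: string, event: Record<string, unknown>) => void;

let nextPromptId = 0;

function handlePrompt(
  sessions: Map<string, Session>,
  sessionId: string,
  body: { prompt?: unknown },
  publish: Publish,
): HttpResponse {
  // Submission errors are still reported synchronously.
  const session = sessions.get(sessionId);
  if (!session) return { status: 404, body: { error: "unknown session" } };
  if (typeof body.prompt !== "string" || body.prompt.length === 0) {
    return { status: 400, body: { error: "malformed input" } };
  }
  if (session.busy) return { status: 409, body: { error: "session busy" } };

  const promptId = `prompt-${++nextPromptId}`;
  session.busy = true;

  // Fire and forget: the HTTP response does not wait for this promise.
  session
    .runTurn(body.prompt)
    .then(
      (stopReason) => publish(sessionId, { type: "turn_complete", promptId, stopReason }),
      (err) => publish(sessionId, { type: "turn_error", promptId, message: String(err) }),
    )
    .finally(() => {
      session.busy = false;
    });

  return { status: 202, body: { promptId } };
}
```

The response time of the handler is bounded by validation alone, so it stays well under any proxy timeout regardless of how long the turn runs.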

### 2. Add `turn_complete` event to existing `GET /session/:id/events` SSE stream

```
Client (SSE)                        Daemon
  |                                   |
  | ... session_update events ...     |
  |<-- turn_complete { stopReason } --|  ← agent turn finished
```

All SSE subscribers (prompt sender + passive observers) receive this event, providing a single authoritative completion signal. This also eliminates the current 3-second inactivity heuristic that passive observers use as a workaround.
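For illustration, the event could be typed and serialized roughly as follows. This is a sketch under assumptions: the `promptId` field, the use of a named SSE `event:` line, and the helper `toSseFrame` are not specified by this proposal, and `"end_turn"` is only an example value.

```typescript
// Assumed event shape: the proposal specifies `stopReason`; `promptId` is an
// illustrative addition so that a sender can match the event to its own prompt.
type TurnCompleteEvent = {
  type: "turn_complete";
  promptId: string;
  stopReason: string;
};

// Serializes an event as one Server-Sent Events frame:
// an `event:` line, a single-line JSON `data:` line, and a blank line.
function toSseFrame(event: { type: string }): string {
  return `event: ${event.type}\ndata: ${JSON.stringify(event)}\n\n`;
}

const example: TurnCompleteEvent = {
  type: "turn_complete",
  promptId: "p1",
  stopReason: "end_turn",
};
const frame = toSseFrame(example);
```

Because the frame goes to every subscriber of the session stream, the prompt sender and passive observers see the same completion signal.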

### 3. SDK backward compatibility

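`DaemonClient.prompt()` retains its `Promise<PromptResult>` signature. Internally, it becomes: POST (fire) → await matching `turn_complete` event on SSE → resolve. Matching presumably keys on the `promptId` returned in the 202 response, which means `turn_complete` would need to carry it. SDK callers see no breaking change.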
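The flow above can be sketched as follows. The `Transport` interface and the standalone `prompt` function are hypothetical stand-ins for the SDK's internals, and matching on `promptId` assumes the event carries that field. The sketch subscribes before POSTing and buffers events, so a turn that finishes before the 202 arrives is not missed.

```typescript
// Hypothetical shapes; the real SDK types may differ.
type DaemonEvent = { type: string; promptId?: string; stopReason?: string; message?: string };
type PromptResult = { stopReason: string };

interface Transport {
  // Subscribes to the session event stream; returns an unsubscribe function.
  subscribe(listener: (event: DaemonEvent) => void): () => void;
  // Sends the POST and resolves with the promptId from the 202 body.
  postPrompt(text: string): Promise<{ promptId: string }>;
}

async function prompt(transport: Transport, text: string): Promise<PromptResult> {
  const seen: DaemonEvent[] = [];
  let onEvent: (e: DaemonEvent) => void = (e) => {
    seen.push(e); // buffer until the promptId is known
  };
  // Subscribe first so a fast turn cannot complete unobserved.
  const unsubscribe = transport.subscribe((e) => onEvent(e));
  try {
    const { promptId } = await transport.postPrompt(text);
    return await new Promise<PromptResult>((resolve, reject) => {
      const check = (e: DaemonEvent) => {
        if (e.promptId !== promptId) return;
        if (e.type === "turn_complete") {
          resolve({ stopReason: e.stopReason ?? "unknown" });
        } else if (e.type === "turn_error") {
          reject(new Error(e.message ?? "turn failed"));
        }
      };
      seen.forEach(check); // replay events that arrived before the 202
      onEvent = check;
    });
  } finally {
    unsubscribe();
  }
}
```

A production version would also need a policy for an SSE disconnect mid-turn, which is one of the open discussion points.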

### 4. Coexistence with `/acp`

Both transports share the same `Bridge` instance and `EventBus`. The change is purely at the REST transport layer:

```
                    ┌──────────────────────────────┐
                    │       Bridge + EventBus      │
                    └──────┬──────────┬────────────┘
                           │          │
              ┌────────────▼──┐  ┌────▼─────────────┐
              │  REST API     │  │  ACP HTTP (/acp) │
              │  /session/*   │  │  (RFD #721)      │
              │  (this issue) │  │  (already done)  │
              └───────────────┘  └──────────────────┘
```

No dependency between the two; either can be enabled/disabled independently.
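To make the independence concrete, here is a small sketch in which one bus event is rendered separately by each transport. The `EventBus` class below is a simplified stand-in, not the real one, and `TurnDone`, `restView`, and `acpView` are hypothetical names; the two output shapes follow the examples given earlier in this issue.

```typescript
// Simplified stand-in for the shared bus; the real EventBus may differ.
type TurnDone = { sessionId: string; requestId: number; stopReason: string };
type Listener = (e: TurnDone) => void;

class EventBus {
  private listeners = new Set<Listener>();

  // Registers a listener and returns a function that removes it.
  subscribe(l: Listener): () => void {
    this.listeners.add(l);
    return () => {
      this.listeners.delete(l);
    };
  }

  publish(e: TurnDone): void {
    for (const l of this.listeners) l(e);
  }
}

// REST transport: shape of the proposed `turn_complete` event.
const restView = (e: TurnDone) => ({ type: "turn_complete", stopReason: e.stopReason });

// ACP transport: shape of the completion message shown for `/acp`.
const acpView = (e: TurnDone) => ({ id: e.requestId, result: { stop_reason: e.stopReason } });

const bus = new EventBus();
const restOut: unknown[] = [];
const acpOut: unknown[] = [];
const stopRest = bus.subscribe((e) => restOut.push(restView(e)));
bus.subscribe((e) => acpOut.push(acpView(e)));

bus.publish({ sessionId: "s1", requestId: 7, stopReason: "end_turn" });
stopRest(); // disabling one transport leaves the other untouched
bus.publish({ sessionId: "s1", requestId: 8, stopReason: "end_turn" });
```

After the second publish, only the ACP side has received both events, which is the enable/disable independence described above.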

## Components Affected

| Component | Change |
|-----------|--------|
| `packages/cli/src/serve/server.ts` | `/prompt` route returns 202 immediately; on turn end, publishes `turn_complete` to EventBus |
| `packages/acp-bridge/src/bridge.ts` | Emit `turn_complete` / `turn_error` event when `sendPrompt` promise settles |
| `packages/sdk-typescript/src/daemon/events.ts` | Add `turn_complete`, `turn_error` event types |
| `packages/sdk-typescript/src/daemon/DaemonClient.ts` | `prompt()` internally awaits SSE `turn_complete` event instead of HTTP response |
| `packages/webui/` | Remove 3s inactivity heuristic; use `turn_complete` event uniformly |

## Additional Evidence

The passive observer (multi-tab) scenario already reveals this gap. When a client subscribes to SSE without being the prompt sender, it has **no reliable way** to know when the turn ends. The current webui uses a 3-second inactivity heuristic (`schedulePassiveAssistantDone`) — a clear workaround for the missing completion signal.

## Discussion Points

- Should the non-blocking behavior be opt-in (header/query param) for backward compatibility during transition?
- If SSE disconnects during a turn, should there be a `GET /session/:id/prompt-status` endpoint for recovery?
- The local CLI `qwen serve` scenario has no proxy timeout issue — is there value in non-blocking there too (e.g., client disconnect tolerance)?

## Prior Art

- **ACP Streamable HTTP** (this repo, `/acp`) — already implements this exact pattern
- **OpenAI Assistants API** — create run → stream events
- **GitHub Actions API** — queue job → poll/webhook for result
- Industry standard for any async job system (Celery, Temporal, etc.)

The current blocking design made sense when the daemon was local-only. As it increasingly serves remote/web clients through standard HTTP infrastructure, the blocking model becomes a liability.
