🔨 chore: wire Gateway-mode stop via direct tRPC interrupt #13815
Frontend half of [LOBE-7142](https://linear.app/lobehub/issue/LOBE-7142). The stop button previously silently failed in Gateway mode because:

1. `stopGenerateMessage` only filtered `execAgentRuntime`, so `execServerAgentRuntime` ops (Gateway) were skipped.
2. Even if the local op got cancelled, nothing bridged the cancel to the server-side agent loop running behind the Agent Gateway WS.

## Changes

**`conversationControl.ts::stopGenerateMessage`**: extend the type filter to include both op types so both client-side and Gateway-mode runs are cancelled from the same entry point.

**`gateway.ts::executeGatewayAgent` + `reconnectToGatewayOperation`**: register an `onOperationCancel` handler on the local `gatewayOpId` that forwards the server-side operation id to `interruptGatewayAgent(...)`, which sends `{ type: 'interrupt' }` over the Agent Gateway WS. The closure cleanly resolves the "local op id vs server op id" mapping; no metadata lookup needed.

**`operation/actions.ts::cancelOperation`**: the `isAborting` flag was gated on `execAgentRuntime`. Extend to `execServerAgentRuntime` too so the UI loading state transitions out immediately on Gateway-mode stop, without waiting for the round-trip `session_complete` from the server.

## What this doesn't do (follow-ups)

- **Backend**: new `POST /api/agent/interrupt` route + Redis LPUSH (LOBE-7145). Without it, the WS interrupt reaches Agent Gateway but never gets forwarded to cloud.
- **Agent loop**: `AgentRuntimeService.executeStep` LPOP polling of the interrupt key (LOBE-7146). Without it, the state never flips to `interrupted` server-side.
- **Agent Gateway DO** (external repo): `_forwardInterrupt` HTTP POST from the WS interrupt handler (LOBE-7147).

With only this PR merged, clicking stop will clear the local UI state and send the WS frame correctly, but the server-side loop keeps running until those three are merged too.

## Tests

- `conversationControl.test.ts`: +1. `stopGenerateMessage` cancels `execServerAgentRuntime`, invokes the onCancel handler, sets `isAborting: true`.
- `gateway.test.ts`: +1. `executeGatewayAgent` registers a handler against the local opId; the handler invokes `interruptGatewayAgent` with the server opId.

All 123 touched-slice tests pass; type-check clean.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 68ed35f547
    this.#get().onOperationCancel(gatewayOpId, () => {
      this.interruptGatewayAgent(result.operationId);
    });
Retry interrupt when socket is still connecting
The cancel hook sends a single interruptGatewayAgent(...) call, but AgentStreamClient.sendInterrupt() only transmits when WebSocket.OPEN and otherwise drops the frame (sendMessage returns false). Because this return value is ignored, pressing Stop during the common connecting/authenticating window (right after executeGatewayAgent/reconnect) can silently miss the interrupt, so the server run continues despite local cancellation. Please queue or retry the interrupt until the gateway connection reaches connected (same pattern also appears in reconnectToGatewayOperation).
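The queueing suggested in this comment could look roughly like the sketch below. It is only an illustration under stated assumptions: `QueuedInterruptClient`, `markOpen`, and `markClosed` are hypothetical names, and the real `AgentStreamClient` internals are not shown in this PR. The idea is that an interrupt requested while the socket is still connecting is remembered and flushed once the connection opens, instead of being dropped.

```typescript
// Hypothetical minimal socket model; not the real AgentStreamClient.
type SocketState = 'connecting' | 'open' | 'closed';

interface InterruptFrame {
  type: 'interrupt';
}

class QueuedInterruptClient {
  private state: SocketState = 'connecting';
  private pendingInterrupt = false;
  // Frames that were actually "sent"; stands in for the WebSocket.
  readonly sent: InterruptFrame[] = [];

  /** Returns true if the frame was sent immediately, false if queued or dropped. */
  sendInterrupt(): boolean {
    if (this.state === 'open') {
      this.sent.push({ type: 'interrupt' });
      return true;
    }
    if (this.state === 'connecting') {
      // Remember the request instead of silently dropping it.
      this.pendingInterrupt = true;
    }
    return false;
  }

  /** Called by the transport once the connection is usable. */
  markOpen(): void {
    this.state = 'open';
    if (this.pendingInterrupt) {
      this.pendingInterrupt = false;
      this.sent.push({ type: 'interrupt' });
    }
  }

  /** Called when the connection is gone; a queued interrupt is discarded. */
  markClosed(): void {
    this.state = 'closed';
    this.pendingInterrupt = false;
  }
}

const client = new QueuedInterruptClient();
const sentImmediately = client.sendInterrupt(); // socket is still connecting
client.markOpen(); // the queued interrupt is flushed here
```

A single pending flag is enough here because repeated interrupts for the same run are idempotent; a general-purpose message queue would only be needed for frames whose count or order matters.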
Codecov Report

❌ Patch coverage is

Additional details and impacted files

Coverage Diff: canary vs. #13815

| Metric | canary | #13815 | +/- |
| --- | --- | --- | --- |
| Coverage | 66.57% | 66.60% | +0.02% |
| Files | 2027 | 2028 | +1 |
| Lines | 172038 | 172239 | +201 |
| Branches | 16763 | 17568 | +805 |
| Hits | 114532 | 114715 | +183 |
| Misses | 57382 | 57400 | +18 |
| Partials | 124 | 124 | |
Rewiring only; no new behaviour on top of the previous commit. See the discussion in PR #13815 for the full reasoning.

TL;DR: the WS-based path (client → Agent Gateway WS → DO forwards HTTP → cloud route → Redis LPUSH → loop LPOP) has the same end-effect as the tRPC-direct path (client → tRPC → AgentRuntimeService.interruptOperation → DB state flip), except:

- the tRPC path is one hop instead of three
- the tRPC path reuses infrastructure that's *already on canary* (`aiAgentService.interruptTask` → `AiAgentService.interruptTask` → `AgentRuntimeService.interruptOperation` → `coordinator.saveAgentState` with status='interrupted'), and the existing step-boundary polling in `executeStep` (AgentRuntimeService.ts:474, 565) already picks it up
- zero new server code required; zero Agent Gateway (external repo) coordination required

The only reason the WS path was in the original spec (LOBE-7142) was symmetry with the Phase 6.4 tool_execute/tool_result path, but `interrupt` is a one-shot control signal, not stream data, so there's no actual benefit to routing it through the same channel. Mid-step abort would require threading an AbortSignal into `runtime.step(...)`, which WS doesn't help with either.

Closes out the need for LOBE-7145 / LOBE-7146 / LOBE-7147.

Changes:

- `gateway.ts`: both `executeGatewayAgent` and `reconnectToGatewayOperation` register the cancel handler against the local op id, but the handler body now calls `aiAgentService.interruptTask({ operationId: serverOpId })` via tRPC instead of `this.interruptGatewayAgent(serverOpId)` (which sent the WS interrupt frame).
- `gateway.test.ts`: adjust the one new test case to verify the tRPC call rather than the WS-path spy; add `interruptTask` to the `aiAgentService` mock.

`AgentStreamClient.sendInterrupt()` and `interruptGatewayAgent()` are kept as-is: public API, might be useful elsewhere. Just not called from the cancel handler anymore.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
# 🚀 LobeHub v2.1.50 (20260416)

**Release Date:** April 16, 2026\
**Since v2.1.49:** 107 commits · 101 merged PRs · 13 contributors

> This weekly release focuses on improving runtime stability and gateway execution consistency, while making Home/Recents workflows faster to navigate and easier to manage in daily use.

---

## ✨ Highlights

- **Server-side Human Approval Flow**: Agent runtime now supports more reliable approve/reject/reject-continue handling in gateway mode, reducing stalled execution paths in long-running tasks. (#13829, #13863, #13873)
- **Message Gateway End-to-End Hardening**: Gateway message flow, queue handling, tool callback routing, and stop interruption behavior were strengthened for better execution continuity. (#13761, #13816, #13820, #13815)
- **Client Tool Execution in Gateway Mode**: Client-executor tools now run more predictably across gateway and desktop callers, with improved executor dispatch behavior. (#13792, #13790)
- **Home / Recents / Sidebar Upgrade**: Sidebar layout, custom sort, recents operations, and profile actions were improved to reduce navigation friction in active sessions. (#13719, #13812, #13723, #13739, #13878, #13734)
- **Agent Workspace and Documents Expansion**: Working panel and agent document workflows were expanded and polished for better day-to-day agent operations. (#13766, #13857)
- **Provider and Model Compatibility Improvements**: Added GLM-5.1 support and refined model/provider edge-case handling, including schema and error-path fixes. (#13757, #13806, #13736, #13740)

---

## 🏗️ Core Agent & Architecture

### Agent runtime and intervention lifecycle

- Added server-side human approval and improved runtime coordination across approve/reject decision paths. (#13829, #13863)
- Improved interrupted-task handling and operation lifecycle consistency to reduce half-finished runtime states. (#13714)
- Refined error classification and payload propagation so downstream surfaces receive clearer actionable errors. (#13736, #13740)

### Execution model and dispatch behavior

- Introduced executor-aware runtime behavior to better separate client/server tool execution semantics. (#13758)
- Improved tool/plugin resolution and manifest handling to avoid runtime failures on malformed inputs. (#13856, #13840, #13807)

---

## 📱 Gateway & Platform Integrations

- Added message gateway support and strengthened queue/error behavior for more stable cross-channel execution. (#13761, #13816, #13820)
- Improved gateway callback pipeline with protocol and API additions for `tool_execute` / `tool_result`. (#13762, #13764, #13765)
- Improved bot/channel reliability and DM/slash handling in Discord-related paths. (#13805, #13724)

---

## 🖥️ CLI & User Experience

- Improved CLI reliability across message/topic operations and build/minify-related paths. (#13731, #13888)
- Added image-to-video options and improved command behavior for generation workflows. (#13788)
- Improved desktop runtime behavior for remote fetch and Linux notification urgency handling. (#13789, #13782)

---

## 🔧 Tooling

- Extracted gateway stream client into `@lobechat/agent-gateway-client` to centralize protocol usage and reduce duplication. (#13866)
- Improved built-in tool coverage and runtime support, including GTD server runtime and missing lobe-kb tools. (#13854, #13876)
- Updated skill and frontmatter consistency in workflow tooling. (#13730)

---

## 🔒 Security & Reliability

- **Security:** Strengthened API key WS auth behavior and safer serverUrl forwarding in gateway-related auth paths. (#13824)
- **Reliability:** Reduced runtime stalls by improving gateway stop/interrupt and approval-state routing behavior. (#13815, #13863, #13873)
- **Reliability:** Added defensive guards for malformed tool manifests and non-string content edge cases. (#13856, #13753)

---

## 👥 Contributors

**101 merged PRs** from **13 contributors** across **107 commits**.

### Community Contributors

- @arvinxx - Runtime, gateway, and execution reliability improvements
- @Innei - Navigation, workflow UX, and desktop/CLI refinements
- @rdmclin2 - Sidebar, recents, and channel behavior updates
- @ONLY-yours - Tooling/runtime fixes and model execution compatibility
- @tjx666 - Model support and release/tooling maintenance
- @nekomeowww - Memory and search-path stability fixes
- @cy948 - CLI indexing and command flow fixes
- @octo-patch - Local system runtime edge-case fixes
- @djthread - Desktop runtime request reliability improvements
- @rivertwilight - Documentation and changelog updates
- @sudongyuer - Subscription/mobile support improvements
- @Zhouguanyang - Provider/model configuration correctness fixes
- @lobehubbot - Translation and maintenance automation support

---

**Full Changelog**: v2.1.49...v2.1.50
## Background
Client-side fix for the parent issue LOBE-7142 (Gateway-mode Stop / Interrupt implementation).
Before this PR, clicking UI stop during a Gateway-mode (`execServerAgentRuntime`) run silently did nothing — the local op filter was `type: 'execAgentRuntime'` only, and there was no bridge from local cancellation to the server-side agent loop.
## Root cause
All the server plumbing is already on canary. The only thing missing was calling `aiAgentService.interruptTask({ operationId })` from the client at the right moment.
## What this PR does
### `conversationControl.ts::stopGenerateMessage`
Extend the type filter so both `execAgentRuntime` (client-side) and `execServerAgentRuntime` (Gateway) ops are cancelled by a single call.
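A minimal sketch of what such a widened filter might look like. The `Operation` shape, the `running` status check, and the `selectOperationsToStop` helper are assumptions made for illustration; the store's real types and selector are not shown in this PR.

```typescript
// Hypothetical operation shape; only the two agent-runtime type names come from the PR.
type OperationType = 'execAgentRuntime' | 'execServerAgentRuntime' | 'sendMessage';

interface Operation {
  id: string;
  type: OperationType;
  status: 'running' | 'cancelled' | 'completed';
}

// Both client-side and Gateway-mode runs count as stoppable.
const STOPPABLE_TYPES: ReadonlySet<OperationType> = new Set<OperationType>([
  'execAgentRuntime',
  'execServerAgentRuntime',
]);

// Assumption: only running operations are worth cancelling.
function selectOperationsToStop(ops: readonly Operation[]): Operation[] {
  return ops.filter((op) => op.status === 'running' && STOPPABLE_TYPES.has(op.type));
}

const ops: Operation[] = [
  { id: 'a', type: 'execAgentRuntime', status: 'running' },
  { id: 'b', type: 'execServerAgentRuntime', status: 'running' },
  { id: 'c', type: 'sendMessage', status: 'running' },
  { id: 'd', type: 'execServerAgentRuntime', status: 'completed' },
];

const idsToStop = selectOperationsToStop(ops).map((op) => op.id);
```

With the old single-type filter, operation `b` would have been skipped, which matches the silent failure described in the background.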
### `gateway.ts::executeGatewayAgent` + `reconnectToGatewayOperation`
Register an `onOperationCancel` handler on the local `gatewayOpId`. When the local op is cancelled (e.g. user clicks stop → `cancelOperations`), the handler fires `aiAgentService.interruptTask({ operationId: result.operationId })`, passing the server-side operation id captured in the closure.
The tRPC round-trip triggers `AgentRuntimeService.interruptOperation`, which flips the DB state to `'interrupted'`. The running agent loop's existing step-boundary polling picks it up on the next boundary and short-circuits. No new server code, no new routes, no Agent Gateway changes needed.
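The closure-based bridge between the local and server operation ids can be sketched as follows. `OperationRegistry`, `aiAgentServiceStub`, and `startGatewayRun` are hypothetical stand-ins: the stub records calls synchronously instead of making a real tRPC request, and the at-most-once handler behaviour is an assumption, not something the PR states.

```typescript
type CancelHandler = () => void;

// Hypothetical stand-in for the operation store's cancel-handler registry.
class OperationRegistry {
  private handlers = new Map<string, CancelHandler[]>();

  onOperationCancel(opId: string, handler: CancelHandler): void {
    const list = this.handlers.get(opId) ?? [];
    list.push(handler);
    this.handlers.set(opId, list);
  }

  cancelOperation(opId: string): void {
    const list = this.handlers.get(opId) ?? [];
    this.handlers.delete(opId); // assumption: each handler fires at most once
    for (const handler of list) handler();
  }
}

// Records calls instead of performing a real tRPC request.
const interruptCalls: { operationId: string }[] = [];
const aiAgentServiceStub = {
  interruptTask(params: { operationId: string }): void {
    interruptCalls.push(params);
  },
};

function startGatewayRun(
  registry: OperationRegistry,
  gatewayOpId: string,
  serverOpId: string,
): void {
  // The closure captures the server-side id, so cancelling the local op
  // needs no metadata lookup to find the matching server operation.
  registry.onOperationCancel(gatewayOpId, () => {
    aiAgentServiceStub.interruptTask({ operationId: serverOpId });
  });
}

const registry = new OperationRegistry();
startGatewayRun(registry, 'local-op-1', 'server-op-9');
registry.cancelOperation('local-op-1');
registry.cancelOperation('local-op-1'); // second cancel finds no handlers
```

The handler is keyed by the local id but sends the server id, which is the mapping the PR relies on.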
### `operation/actions.ts::cancelOperation`
The existing `isAborting` metadata flag was only set for `execAgentRuntime`. Extend to `execServerAgentRuntime` so the UI loading state transitions out immediately on stop, without waiting for the tRPC round-trip to resolve or for the server to emit `session_complete`.
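As a rough illustration, the flag could be set synchronously at cancel time like this. The `OperationState` shape and the pure-function style are assumptions; only the two agent-runtime type names and the `isAborting` flag come from the PR.

```typescript
// Hypothetical operation state; 'other' stands for any non-agent operation type.
type OperationType = 'execAgentRuntime' | 'execServerAgentRuntime' | 'other';

interface OperationState {
  type: OperationType;
  status: 'running' | 'cancelled';
  metadata: { isAborting?: boolean };
}

function cancelOperation(op: OperationState): OperationState {
  // Both client-side and Gateway-mode agent runs get the flag right away,
  // so the UI does not wait for any server round-trip.
  const isAgentRun = op.type === 'execAgentRuntime' || op.type === 'execServerAgentRuntime';
  return {
    ...op,
    status: 'cancelled',
    metadata: isAgentRun ? { ...op.metadata, isAborting: true } : op.metadata,
  };
}

const gatewayRun = cancelOperation({ type: 'execServerAgentRuntime', status: 'running', metadata: {} });
const otherOp = cancelOperation({ type: 'other', status: 'running', metadata: {} });
```

Because the flag is derived purely from local state, the loading indicator can clear even if the interrupt request is still in flight.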
## Why direct tRPC instead of WS interrupt (as the parent spec initially proposed)
The original LOBE-7142 spec mirrored the LOBE-7134 `tool_result` callback pattern: UI → WS → Agent Gateway DO → HTTP → cloud route → Redis LPUSH → agent loop LPOP.
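That turned out to be the wrong pattern for `interrupt`: the tRPC path is one hop instead of three, reuses infrastructure that is already on canary, and requires no new server code and no Agent Gateway (external repo) coordination.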
The only argument for WS was "symmetry with tool_execute/tool_result", but those are stream-like payloads mid-execution while `interrupt` is a one-shot control signal — there's no benefit to routing it through the same channel. Mid-step abort (e.g. closing an in-flight LLM HTTP stream) would require threading an AbortSignal into `runtime.step(...)`, which WS doesn't help with either.
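The difference between step-boundary interruption and mid-step abort can be sketched with a toy loop. Everything here is hypothetical (`stateStore`, `interruptOperation`, `runLoop`); it is not the real `AgentRuntimeService`, only an in-memory model of a loop that reads the interrupt state between steps.

```typescript
type RunStatus = 'running' | 'interrupted' | 'done';

// In-memory stand-in for the persisted agent state.
const stateStore = new Map<string, RunStatus>();

function interruptOperation(operationId: string): void {
  stateStore.set(operationId, 'interrupted');
}

function runLoop(
  operationId: string,
  steps: Array<() => void>,
): { completedSteps: number; status: RunStatus } {
  stateStore.set(operationId, 'running');
  let completedSteps = 0;
  for (const step of steps) {
    // Step-boundary check: the state is only read between steps.
    if (stateStore.get(operationId) === 'interrupted') {
      return { completedSteps, status: 'interrupted' };
    }
    step(); // a step that has started always runs to completion
    completedSteps += 1;
  }
  stateStore.set(operationId, 'done');
  return { completedSteps, status: 'done' };
}

const log: string[] = [];
const result = runLoop('op-1', [
  () => log.push('step 1'),
  () => {
    log.push('step 2 start');
    interruptOperation('op-1'); // interrupt arrives while step 2 is running
    log.push('step 2 end');
  },
  () => log.push('step 3'),
]);
```

In this model the interrupt that arrives during step 2 does not cut the step short; it only prevents step 3 from starting. Stopping inside a step would need a cancellation signal passed into the step itself, regardless of which transport delivered the interrupt.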
This removes the need for sub-issues LOBE-7145 (new route), LOBE-7146 (Redis LPOP), and LOBE-7147 (Agent Gateway DO forwarding) — all being closed as not-planned.
`AgentStreamClient.sendInterrupt()` and `interruptGatewayAgent()` are kept as public API but no longer called from the cancel flow. Dead-code removal is out of scope here; can be a separate cleanup if desired.
## Test plan
🤖 Generated with Claude Code