🔨 chore: wire Gateway-mode stop via direct tRPC interrupt #13815

Merged: arvinxx merged 2 commits into `canary` from `feat/lobe-7142-gateway-stop-interrupt` on Apr 14, 2026
@arvinxx arvinxx commented Apr 14, 2026


## Background

Client-side fix for the parent issue LOBE-7142 (Gateway-mode Stop / Interrupt implementation).

Before this PR, clicking UI stop during a Gateway-mode (`execServerAgentRuntime`) run silently did nothing — the local op filter was `type: 'execAgentRuntime'` only, and there was no bridge from local cancellation to the server-side agent loop.

## Root cause

| Layer | State |
| --- | --- |
| `stopGenerateMessage` filter | only matched `execAgentRuntime` |
| Server-side `interruptTask` tRPC endpoint | ✅ already exists: `AiAgentService.interruptTask` → `AgentRuntimeService.interruptOperation` → `coordinator.saveAgentState` (`status='interrupted'`) |
| Server-side step-boundary polling | ✅ already exists: `executeStep` at `AgentRuntimeService.ts:474` and `:565` checks state and early-returns on `'interrupted'` |
| Client-side bridge | ❌ missing |

All the server plumbing is already on canary. The only thing missing was calling `aiAgentService.interruptTask({ operationId })` from the client at the right moment.
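The step-boundary polling described in the table can be pictured with a simplified sketch. The `InMemoryStateStore` class and `runAgentLoop` function below are hypothetical stand-ins, not the actual `AgentRuntimeService` code, and the real `executeStep` may be structured differently (for example, one invocation per step rather than a single loop).

```typescript
type AgentStatus = 'running' | 'interrupted';

// Hypothetical in-memory stand-in for the persisted agent state (the DB in the PR).
class InMemoryStateStore {
  private statuses = new Map<string, AgentStatus>();

  load(operationId: string): AgentStatus {
    return this.statuses.get(operationId) ?? 'running';
  }

  save(operationId: string, status: AgentStatus): void {
    this.statuses.set(operationId, status);
  }
}

// Checks the stored state before every step and returns early once it reads
// 'interrupted'. Returns how many steps actually ran.
function runAgentLoop(
  store: InMemoryStateStore,
  operationId: string,
  maxSteps: number,
  onStep: (step: number) => void,
): number {
  let executed = 0;
  for (let step = 0; step < maxSteps; step++) {
    if (store.load(operationId) === 'interrupted') break; // step-boundary check
    onStep(step);
    executed++;
  }
  return executed;
}

// Simulate an interrupt arriving while step 1 is in flight.
const stateStore = new InMemoryStateStore();
const executedSteps = runAgentLoop(stateStore, 'server-op-1', 5, (step) => {
  if (step === 1) stateStore.save('server-op-1', 'interrupted');
});
console.log(executedSteps); // 2: steps 0 and 1 finish, step 2 never starts
```

The step that is already in flight still runs to completion, which is the "step boundary" granularity the PR refers to.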

## What this PR does

### `conversationControl.ts::stopGenerateMessage`

Extend the type filter so both `execAgentRuntime` (client-side) and `execServerAgentRuntime` (Gateway) ops are cancelled by a single call.
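As a rough illustration of the widened filter, the sketch below selects every running operation of either type. The `RunningOp` shape, the `sendMessage` type, and `selectStoppableOpIds` are assumptions made for the example; the real store slice is not shown in this PR description.

```typescript
// Hypothetical, simplified operation shape; the real store type carries more fields.
type OpType = 'execAgentRuntime' | 'execServerAgentRuntime' | 'sendMessage';

interface RunningOp {
  id: string;
  type: OpType;
  status: 'running' | 'completed';
}

// Both client-side and Gateway-mode agent runs should react to the stop button.
const STOPPABLE_TYPES: ReadonlySet<OpType> = new Set<OpType>([
  'execAgentRuntime',
  'execServerAgentRuntime',
]);

// Collect the ids of running operations that a single stop call should cancel.
function selectStoppableOpIds(ops: RunningOp[]): string[] {
  return ops
    .filter((op) => op.status === 'running' && STOPPABLE_TYPES.has(op.type))
    .map((op) => op.id);
}

const stoppableIds = selectStoppableOpIds([
  { id: 'local-1', type: 'execAgentRuntime', status: 'running' },
  { id: 'gateway-1', type: 'execServerAgentRuntime', status: 'running' },
  { id: 'msg-1', type: 'sendMessage', status: 'running' },
  { id: 'gateway-0', type: 'execServerAgentRuntime', status: 'completed' },
]);
console.log(stoppableIds); // the two running agent ops: local-1 and gateway-1
```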

### `gateway.ts::executeGatewayAgent` + `reconnectToGatewayOperation`

Register an `onOperationCancel` handler on the local `gatewayOpId`. When the local op is cancelled (e.g. user clicks stop → `cancelOperations`), the handler fires `aiAgentService.interruptTask({ operationId: result.operationId })` — passing the server-side operation id captured in closure.

The tRPC round-trip triggers `AgentRuntimeService.interruptOperation`, which flips the DB state to `'interrupted'`. The running agent loop's existing step-boundary polling picks it up on the next boundary and short-circuits. No new server code, no new routes, no Agent Gateway changes needed.
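The closure-based bridge can be sketched as follows. `OperationRegistry`, `InterruptClient`, and `bridgeCancelToServer` are hypothetical stand-ins for the store's operation slice and the tRPC client; only the `onOperationCancel` and `interruptTask({ operationId })` names come from the PR. Treating the call as fire-and-forget is also an assumption.

```typescript
type CancelHandler = () => void;

// Hypothetical minimal registry standing in for the store's operation slice.
class OperationRegistry {
  private handlers = new Map<string, CancelHandler[]>();

  onOperationCancel(operationId: string, handler: CancelHandler): void {
    const list = this.handlers.get(operationId) ?? [];
    list.push(handler);
    this.handlers.set(operationId, list);
  }

  cancelOperation(operationId: string): void {
    for (const handler of this.handlers.get(operationId) ?? []) handler();
    this.handlers.delete(operationId);
  }
}

// Hypothetical stand-in for the tRPC-backed service.
interface InterruptClient {
  interruptTask(params: { operationId: string }): Promise<void>;
}

// Register a handler on the LOCAL op id; the SERVER op id is captured in the
// closure, so no metadata lookup is needed when the cancel fires.
function bridgeCancelToServer(
  registry: OperationRegistry,
  client: InterruptClient,
  localOpId: string,
  serverOpId: string,
): void {
  registry.onOperationCancel(localOpId, () => {
    // Assumed fire-and-forget: the UI does not wait for the round-trip.
    void client.interruptTask({ operationId: serverOpId }).catch(() => {
      // Errors are ignored in this sketch.
    });
  });
}

const interruptedOpIds: string[] = [];
const fakeClient: InterruptClient = {
  interruptTask: async ({ operationId }) => {
    interruptedOpIds.push(operationId);
  },
};

const registry = new OperationRegistry();
bridgeCancelToServer(registry, fakeClient, 'local-gateway-op', 'server-op-42');
registry.cancelOperation('local-gateway-op');
console.log(interruptedOpIds); // contains the server op id, not the local one
```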

### `operation/actions.ts::cancelOperation`

The existing `isAborting` metadata flag was only set for `execAgentRuntime`. Extend to `execServerAgentRuntime` so the UI loading state transitions out immediately on stop, without waiting for the tRPC round-trip to resolve or for the server to emit `session_complete`.
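A minimal sketch of the flag change, assuming a simplified operation shape: `CancellableOp`, `markCancelled`, the `'cancelled'` status, and the `sendMessage` type are all illustrative names, and only the `isAborting` metadata flag and the two agent op types come from the PR.

```typescript
// Hypothetical, simplified operation shape used only for this example.
type CancelOpType = 'execAgentRuntime' | 'execServerAgentRuntime' | 'sendMessage';

interface CancellableOp {
  id: string;
  type: CancelOpType;
  status: 'running' | 'cancelled';
  metadata: { isAborting?: boolean };
}

// Mark an operation as cancelled. Agent runs of either kind also get
// `isAborting: true` so the UI can leave its loading state right away.
function markCancelled(op: CancellableOp): CancellableOp {
  const isAgentRun =
    op.type === 'execAgentRuntime' || op.type === 'execServerAgentRuntime';
  return {
    ...op,
    status: 'cancelled',
    metadata: isAgentRun ? { ...op.metadata, isAborting: true } : op.metadata,
  };
}

const cancelledGatewayOp = markCancelled({
  id: 'gateway-1',
  type: 'execServerAgentRuntime',
  status: 'running',
  metadata: {},
});
const cancelledOtherOp = markCancelled({
  id: 'msg-1',
  type: 'sendMessage',
  status: 'running',
  metadata: {},
});
console.log(cancelledGatewayOp.metadata, cancelledOtherOp.metadata);
```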

## Why direct tRPC instead of WS interrupt (as the parent spec initially proposed)

The original LOBE-7142 spec mirrored the LOBE-7134 `tool_result` callback pattern: UI → WS → Agent Gateway DO → HTTP → cloud route → Redis LPUSH → agent loop LPOP.

That turned out to be the wrong pattern for `interrupt`:

| Dimension | Via Agent Gateway WS (spec) | Direct tRPC (this PR) |
| --- | --- | --- |
| Hops | 3 (client → Gateway → cloud) | 1 |
| New server code | new route + Redis LPUSH/LPOP in `executeStep` | zero |
| External repo coord | Agent Gateway DO `_forwardInterrupt` | zero |
| Cross-replica safety | Redis key | DB state (already cross-replica) |
| Abort granularity | step boundary | step boundary (identical) |

The only argument for WS was "symmetry with tool_execute/tool_result", but those are stream-like payloads mid-execution while `interrupt` is a one-shot control signal — there's no benefit to routing it through the same channel. Mid-step abort (e.g. closing an in-flight LLM HTTP stream) would require threading an AbortSignal into `runtime.step(...)`, which WS doesn't help with either.
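For contrast, a mid-step abort would look roughly like the sketch below, where a standard `AbortSignal` is threaded into the step itself. `runStep` and its chunk callback are hypothetical; this PR does not add anything like it, and `runtime.step(...)` is not shown in the source.

```typescript
// Hypothetical step function: consumes chunks (e.g. from an LLM stream) and
// checks the AbortSignal between chunks instead of only at step boundaries.
function runStep(
  chunks: string[],
  signal: AbortSignal,
  onChunk?: (index: number) => void,
): string[] {
  const received: string[] = [];
  for (let i = 0; i < chunks.length; i++) {
    if (signal.aborted) break; // mid-step check
    received.push(chunks[i]);
    onChunk?.(i);
  }
  return received;
}

// Simulate the user pressing stop while the second chunk is being handled.
const controller = new AbortController();
const receivedChunks = runStep(['a', 'b', 'c', 'd'], controller.signal, (index) => {
  if (index === 1) controller.abort();
});
console.log(receivedChunks); // only the first two chunks were consumed
```

The abort signal has to reach the code that owns the in-flight work, which is independent of whether the stop request arrives over WS or tRPC.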

This removes the need for sub-issues LOBE-7145 (new route), LOBE-7146 (Redis LPOP), and LOBE-7147 (Agent Gateway DO forwarding) — all being closed as not-planned.

`AgentStreamClient.sendInterrupt()` and `interruptGatewayAgent()` are kept as public API but no longer called from the cancel flow. Dead-code removal is out of scope here; can be a separate cleanup if desired.

## Test plan

  • `conversationControl.test.ts` — new case: `stopGenerateMessage` cancels `execServerAgentRuntime`, invokes the registered handler, sets `isAborting: true`.
  • `gateway.test.ts` — new case: `executeGatewayAgent` registers the handler against the local opId, and when invoked, the handler calls `aiAgentService.interruptTask` with the server opId.
  • 123 / 123 touched-slice tests pass; type-check clean.
  • Live canary E2E: trigger a long Gateway-mode conversation, click stop, verify tRPC call fires + server operation flips to `interrupted` + UI loading clears (running right after this update).

🤖 Generated with Claude Code

Frontend half of [LOBE-7142](https://linear.app/lobehub/issue/LOBE-7142)
— the stop button previously silently failed in Gateway mode because:

1. `stopGenerateMessage` only filtered `execAgentRuntime`, so
   `execServerAgentRuntime` ops (Gateway) were skipped.
2. Even if the local op got cancelled, nothing bridged the cancel to
   the server-side agent loop running behind the Agent Gateway WS.

## Changes

**`conversationControl.ts::stopGenerateMessage`** — extend the type
filter to include both op types so both client-side and Gateway-mode
runs are cancelled from the same entry point.

**`gateway.ts::executeGatewayAgent` + `reconnectToGatewayOperation`** —
register an `onOperationCancel` handler on the local `gatewayOpId` that
forwards the server-side operation id to `interruptGatewayAgent(...)`,
which sends `{ type: 'interrupt' }` over the Agent Gateway WS. The
closure cleanly resolves the "local op id vs server op id" mapping —
no metadata lookup needed.

**`operation/actions.ts::cancelOperation`** — `isAborting` flag was
gated on `execAgentRuntime`. Extend to `execServerAgentRuntime` too so
the UI loading state transitions out immediately on Gateway-mode stop,
without waiting for the round-trip `session_complete` from the server.

## What this doesn't do (follow-ups)

- **Backend**: new `POST /api/agent/interrupt` route + Redis LPUSH
  (LOBE-7145). Without it, the WS interrupt reaches Agent Gateway but
  never gets forwarded to cloud.
- **Agent loop**: `AgentRuntimeService.executeStep` LPOP polling of the
  interrupt key (LOBE-7146). Without it, the state never flips to
  `interrupted` server-side.
- **Agent Gateway DO** (external repo): `_forwardInterrupt` HTTP POST
  from the WS interrupt handler (LOBE-7147).

With only this PR merged, clicking stop will clear the local UI state
and send the WS frame correctly — the server-side loop keeps running
until those three are merged too.

## Tests

- `conversationControl.test.ts`: +1 — stopGenerateMessage cancels
  `execServerAgentRuntime`, invokes the onCancel handler, sets
  `isAborting: true`.
- `gateway.test.ts`: +1 — `executeGatewayAgent` registers a handler
  against the local opId, handler invokes `interruptGatewayAgent`
  with the server opId.

All 123 touched-slice tests pass; type-check clean.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@sourcery-ai sourcery-ai Bot left a comment


We've reviewed this pull request using the Sourcery rules engine

@arvinxx changed the title from "✨ feat: wire Gateway-mode stop button to WS interrupt (client side)" to "🔨 chore: wire Gateway-mode stop button to WS interrupt (client side)" on Apr 14, 2026

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 68ed35f547


Comment on lines +282 to +284
this.#get().onOperationCancel(gatewayOpId, () => {
  this.interruptGatewayAgent(result.operationId);
});


**P1:** Retry interrupt when socket is still connecting

The cancel hook sends a single `interruptGatewayAgent(...)` call, but `AgentStreamClient.sendInterrupt()` only transmits when the socket is in the `WebSocket.OPEN` state and otherwise drops the frame (`sendMessage` returns `false`). Because this return value is ignored, pressing Stop during the common connecting/authenticating window (right after `executeGatewayAgent` or a reconnect) can silently miss the interrupt, so the server run continues despite local cancellation. Please queue or retry the interrupt until the gateway connection reaches `connected` (the same pattern also appears in `reconnectToGatewayOperation`).
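One way to picture the queueing this review asks for is the sketch below. `QueuedInterruptSender` is a hypothetical class, not part of `AgentStreamClient`; it only assumes that the interrupt frame is `{ type: 'interrupt' }`, as stated in the first commit message.

```typescript
type GatewaySocketState = 'connecting' | 'open' | 'closed';

// Hypothetical sender that remembers an interrupt requested while the socket
// is still connecting and flushes it once the connection opens.
class QueuedInterruptSender {
  private state: GatewaySocketState = 'connecting';
  private pendingInterrupt = false;
  private readonly sendFrame: (frame: string) => void;

  constructor(sendFrame: (frame: string) => void) {
    this.sendFrame = sendFrame;
  }

  // Returns true if the frame went out immediately, false if it was queued or dropped.
  sendInterrupt(): boolean {
    if (this.state === 'open') {
      this.sendFrame(JSON.stringify({ type: 'interrupt' }));
      return true;
    }
    if (this.state === 'connecting') this.pendingInterrupt = true;
    return false;
  }

  // Call when the socket reaches the open/connected state.
  markOpen(): void {
    this.state = 'open';
    if (this.pendingInterrupt) {
      this.pendingInterrupt = false;
      this.sendFrame(JSON.stringify({ type: 'interrupt' }));
    }
  }
}

const sentFrames: string[] = [];
const sender = new QueuedInterruptSender((frame) => {
  sentFrames.push(frame);
});

const sentImmediately = sender.sendInterrupt(); // socket still connecting
const framesBeforeOpen = sentFrames.length;
sender.markOpen(); // queued interrupt is flushed here
console.log(sentImmediately, framesBeforeOpen, sentFrames.length);
```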


@codecov

codecov Bot commented Apr 14, 2026


Codecov Report

❌ Patch coverage is 58.33333% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 66.60%. Comparing base (116495b) to head (c1330e9).
⚠️ Report is 3 commits behind head on canary.

Additional details and impacted files
@@            Coverage Diff             @@
##           canary   #13815      +/-   ##
==========================================
+ Coverage   66.57%   66.60%   +0.02%     
==========================================
  Files        2027     2028       +1     
  Lines      172038   172239     +201     
  Branches    16763    17568     +805     
==========================================
+ Hits       114532   114715     +183     
- Misses      57382    57400      +18     
  Partials      124      124              
| Flag | Coverage Δ |
| --- | --- |
| app | 58.72% <58.33%> (+0.05%) ⬆️ |
| database | 92.46% <ø> (ø) |
| packages/agent-runtime | 79.72% <ø> (ø) |
| packages/context-engine | 83.38% <ø> (ø) |
| packages/conversation-flow | 92.36% <ø> (ø) |
| packages/file-loaders | 87.02% <ø> (ø) |
| packages/memory-user-memory | 74.74% <ø> (ø) |
| packages/model-bank | 99.86% <ø> (ø) |
| packages/model-runtime | 84.20% <ø> (ø) |
| packages/prompts | 69.24% <ø> (ø) |
| packages/python-interpreter | 92.90% <ø> (ø) |
| packages/ssrf-safe-fetch | 0.00% <ø> (ø) |
| packages/utils | 90.14% <ø> (ø) |
| packages/web-crawler | 88.66% <ø> (ø) |

Flags with carried forward coverage won't be shown.

| Components | Coverage Δ |
| --- | --- |
| Store | 65.90% <90.90%> (+0.16%) ⬆️ |
| Services | 52.19% <ø> (ø) |
| Server | 66.27% <ø> (ø) |
| Libs | 52.89% <100.00%> (+0.05%) ⬆️ |
| Utils | 91.12% <ø> (ø) |

Rewiring only — no new behaviour on top of the previous commit. See
the discussion in PR #13815 for the full reasoning.

TL;DR the WS-based path (client → Agent Gateway WS → DO forwards
HTTP → cloud route → Redis LPUSH → loop LPOP) has the same end-effect
as the tRPC-direct path (client → tRPC → AgentRuntimeService
.interruptOperation → DB state flip), except:

- the tRPC path is one hop instead of three
- the tRPC path reuses infrastructure that's *already on canary* —
  `aiAgentService.interruptTask` → `AiAgentService.interruptTask` →
  `AgentRuntimeService.interruptOperation` → `coordinator.saveAgentState`
  with status='interrupted' — and the existing step-boundary polling
  in `executeStep` (AgentRuntimeService.ts:474, 565) already picks it up
- zero new server code required; zero Agent Gateway (external repo)
  coordination required

The only reason the WS path was in the original spec (LOBE-7142) was
symmetry with the Phase 6.4 tool_execute/tool_result path, but
`interrupt` is a one-shot control signal, not stream data — there's
no actual benefit to routing it through the same channel. Mid-step
abort would require threading an AbortSignal into `runtime.step(...)`,
which WS doesn't help with either.

Closes out the need for LOBE-7145 / LOBE-7146 / LOBE-7147.

Changes:
- `gateway.ts`: both `executeGatewayAgent` and
  `reconnectToGatewayOperation` register the cancel handler against
  the local op id, but the handler body now calls
  `aiAgentService.interruptTask({ operationId: serverOpId })` via
  tRPC instead of `this.interruptGatewayAgent(serverOpId)` (which sent
  the WS interrupt frame).
- `gateway.test.ts`: adjust the one new test case to verify the
  tRPC call rather than the WS-path spy; add `interruptTask` to the
  `aiAgentService` mock.

`AgentStreamClient.sendInterrupt()` and `interruptGatewayAgent()` are
kept as-is — public API, might be useful elsewhere. Just not called
from the cancel handler anymore.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@arvinxx changed the title from "🔨 chore: wire Gateway-mode stop button to WS interrupt (client side)" to "🔨 chore: wire Gateway-mode stop via direct tRPC interrupt" on Apr 14, 2026
@arvinxx arvinxx merged commit 18bc271 into canary Apr 14, 2026
32 of 33 checks passed
@arvinxx arvinxx deleted the feat/lobe-7142-gateway-stop-interrupt branch April 14, 2026 14:41
canisminor1990 added a commit that referenced this pull request Apr 16, 2026
# 🚀 LobeHub v2.1.50 (20260416)

**Release Date:** April 16, 2026\
**Since v2.1.49:** 107 commits · 101 merged PRs · 13 contributors

> This weekly release focuses on improving runtime stability and gateway
execution consistency, while making Home/Recents workflows faster to
navigate and easier to manage in daily use.

---

## ✨ Highlights

- **Server-side Human Approval Flow** — Agent runtime now supports more
reliable approve/reject/reject-continue handling in gateway mode,
reducing stalled execution paths in long-running tasks. (#13829, #13863,
#13873)

- **Message Gateway End-to-End Hardening** — Gateway message flow, queue
handling, tool callback routing, and stop interruption behavior were
strengthened for better execution continuity. (#13761, #13816, #13820,
#13815)

- **Client Tool Execution in Gateway Mode** — Client-executor tools now
run more predictably across gateway and desktop callers, with improved
executor dispatch behavior. (#13792, #13790)

- **Home / Recents / Sidebar Upgrade** — Sidebar layout, custom sort,
recents operations, and profile actions were improved to reduce
navigation friction in active sessions. (#13719, #13812, #13723, #13739,
#13878, #13734)

- **Agent Workspace and Documents Expansion** — Working panel and agent
document workflows were expanded and polished for better day-to-day
agent operations. (#13766, #13857)

- **Provider and Model Compatibility Improvements** — Added GLM-5.1
support and refined model/provider edge-case handling, including schema
and error-path fixes. (#13757, #13806, #13736, #13740)

---

## 🏗️ Core Agent & Architecture

### Agent runtime and intervention lifecycle

- Added server-side human approval and improved runtime coordination
across approve/reject decision paths. (#13829, #13863)
- Improved interrupted-task handling and operation lifecycle consistency
to reduce half-finished runtime states. (#13714)
- Refined error classification and payload propagation so downstream
surfaces receive clearer actionable errors. (#13736, #13740)

### Execution model and dispatch behavior

- Introduced executor-aware runtime behavior to better separate
client/server tool execution semantics. (#13758)
- Improved tool/plugin resolution and manifest handling to avoid runtime
failures on malformed inputs. (#13856, #13840, #13807)

---

## 📱 Gateway & Platform Integrations

- Added message gateway support and strengthened queue/error behavior
for more stable cross-channel execution. (#13761, #13816, #13820)
- Improved gateway callback pipeline with protocol and API additions for
`tool_execute` / `tool_result`. (#13762, #13764, #13765)
- Improved bot/channel reliability and DM/slash handling in
Discord-related paths. (#13805, #13724)

---

## 🖥️ CLI & User Experience

- Improved CLI reliability across message/topic operations and
build/minify-related paths. (#13731, #13888)
- Added image-to-video options and improved command behavior for
generation workflows. (#13788)
- Improved desktop runtime behavior for remote fetch and Linux
notification urgency handling. (#13789, #13782)

---

## 🔧 Tooling

- Extracted gateway stream client into `@lobechat/agent-gateway-client`
to centralize protocol usage and reduce duplication. (#13866)
- Improved built-in tool coverage and runtime support, including GTD
server runtime and missing lobe-kb tools. (#13854, #13876)
- Updated skill and frontmatter consistency in workflow tooling.
(#13730)

---

## 🔒 Security & Reliability

- **Security:** Strengthened API key WS auth behavior and safer
serverUrl forwarding in gateway-related auth paths. (#13824)
- **Reliability:** Reduced runtime stalls by improving gateway
stop/interrupt and approval-state routing behavior. (#13815, #13863,
#13873)
- **Reliability:** Added defensive guards for malformed tool manifests
and non-string content edge cases. (#13856, #13753)

---

## 👥 Contributors

**101 merged PRs** from **13 contributors** across **107 commits**.

### Community Contributors

- @arvinxx - Runtime, gateway, and execution reliability improvements
- @Innei - Navigation, workflow UX, and desktop/CLI refinements
- @rdmclin2 - Sidebar, recents, and channel behavior updates
- @ONLY-yours - Tooling/runtime fixes and model execution compatibility
- @tjx666 - Model support and release/tooling maintenance
- @nekomeowww - Memory and search-path stability fixes
- @cy948 - CLI indexing and command flow fixes
- @octo-patch - Local system runtime edge-case fixes
- @djthread - Desktop runtime request reliability improvements
- @rivertwilight - Documentation and changelog updates
- @sudongyuer - Subscription/mobile support improvements
- @Zhouguanyang - Provider/model configuration correctness fixes
- @lobehubbot - Translation and maintenance automation support

---

**Full Changelog**: v2.1.49...v2.1.50