✨ feat(agent-runtime): server callSubAgent async suspend/resume #15481

Merged
arvinxx merged 10 commits into canary from feat/lobe-9763-subagent-bridge on Jun 6, 2026

@arvinxx arvinxx commented Jun 4, 2026

Summary

Implements server-side callSubAgent as a real async suspend/resume loop (LOBE-9763). Previously the server sub-agent path was fire-and-forget — the parent dispatched a child op and continued immediately with a "dispatched" acknowledgement, never seeing the sub-agent's actual answer. Now the parent parks (waiting_for_async_tool, no request held), the child op runs independently (QStash/local), and a completion bridge backfills the parent's tool message and resumes the parent so the LLM continues with the real result.

parent LLM → callSubAgent
  └─ lobeAgent executor returns { deferred: true } via injected ctx.subAgent runner
  └─ runner creates the pending placeholder tool message (anchors isolation thread) + forks child op
  └─ parent op → waiting_for_async_tool  (parked, no request held)
       ⋮ child op runs independently
  child op done → onComplete bridge:
       a. backfill the placeholder tool message (result / error + pluginState)
       b. barrier: every pendingToolsCalling tool fulfilled? (+ child-op reconcile)
       c. tryResumeFromAsyncTool() atomic CAS → single winner schedules resume
  parent resume step: refresh messages from DB → user_input → LLM continues → done
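
The barrier in step b can be pictured roughly as follows. This is a minimal sketch, not the code from this PR: `ToolMessage`, `isFulfilled`, and `barrierPassed` are hypothetical names, and it assumes a pending call counts as fulfilled once its placeholder tool message carries either a result or an error.

```typescript
// Hypothetical message shape, for illustration only.
interface ToolMessage {
  toolCallId: string;
  content: string | null; // null while the placeholder is still pending
  error?: string;
}

// ASSUMPTION: a placeholder is fulfilled once it has a result or an error.
const isFulfilled = (m: ToolMessage): boolean =>
  m.content !== null || m.error !== undefined;

// The barrier passes only when every pending tool call has a fulfilled message.
function barrierPassed(pendingToolCallIds: string[], messages: ToolMessage[]): boolean {
  const byId = new Map(messages.map((m) => [m.toolCallId, m] as const));
  return pendingToolCallIds.every((id) => {
    const message = byId.get(id);
    return message !== undefined && isFulfilled(message);
  });
}
```

With K parallel sub-agents, each child's bridge would run this check after its own backfill, and only the last one to finish would see it pass.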

Key changes

  • waiting_for_async_tool parked state (Phase 1 foundation): new non-terminal status + isParkedStatus/isBlockedStatus; client_tool_execution migrated off interrupted.
  • Deferred-tool park rails in RuntimeExecutors (call_tool / call_tools_batch).
  • lobeAgent server runtime callSubAgent executor → returns deferred.
  • ctx.subAgent runner injected per tool-call (owns the parent-message anchor + child-op kickoff).
  • Completion bridge (aiAgent.execSubAgentTask onComplete) → backfill + AgentRuntimeService.tryResumeParentFromAsyncTool (barrier + single-fire CAS + schedule) + resumeAsyncTool re-entry in executeStep.
  • Bug fix: CompletionLifecycle.dispatchHooks previously fired onComplete + unregistered all hooks on the park; since the async resume reuses the same operationId, the real done never re-notified consumers. Now waiting_for_async_tool persists the parked status but keeps hooks registered and skips the premature onComplete.
  • Local queue now forwards payload to the execution callback (parity with the QStash body path).
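
To illustrate the last bullet, here is a rough sketch of a local queue that forwards `payload` to its execution callback. `createLocalQueue`, `StepMessage`, and `StepParams` are hypothetical names, and the real queue is presumably asynchronous; the sketch calls the callback synchronously to keep the idea visible.

```typescript
// Hypothetical message shape: routing fields plus an optional payload bag.
interface StepMessage {
  operationId: string;
  stepIndex: number;
  payload?: Record<string, unknown>;
}

type StepParams = { operationId: string; stepIndex: number } & Record<string, unknown>;

function createLocalQueue(onExecute: (params: StepParams) => void) {
  return {
    // Spread payload next to the routing fields so flags such as
    // resumeAsyncTool reach the step executor instead of being dropped.
    scheduleMessage({ operationId, stepIndex, payload }: StepMessage): void {
      onExecute({ ...payload, operationId, stepIndex });
    },
  };
}
```

Listing `operationId` and `stepIndex` after the spread means a stray payload key cannot overwrite the routing fields, which is a design choice of this sketch rather than something the PR states.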

Verification

  • Unit: lobeAgent.callSubAgent (deferred contract), tryResumeParentFromAsyncTool (barrier / CAS / no-double-resume), CompletionLifecycle async-park hook behavior.
  • Integration: serverSubAgent.integration.test.ts (in-memory runtime; runs in submodule CI).
  • End-to-end with a real LLM via cloud agent-evals (real tool exec, in-memory runtime): parent parks → child runs → bridge CAS resume → done, asserted on the backfilled tool message. Real-QStash cross-process run pending (separate cloud tooling PR adds an eval --queue mode for it).

Linear: LOBE-9763

vercel Bot commented Jun 4, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| lobehub | Ready | Preview, Comment | Jun 6, 2026 11:11am |

@dosubot dosubot Bot added labels on Jun 4, 2026: size:XL (This PR changes 500-999 lines, ignoring generated files), feature:agent (Assistant/Agent configuration and behavior), feature:tool (Tool calling and function execution)

@sourcery-ai sourcery-ai Bot left a comment

We've reviewed this pull request using the Sourcery rules engine

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2e35104d51

approvedToolCall,
rejectionReason,
rejectAndContinue,
resumeAsyncTool,

P1: Read async resume flags from queued payload

In the QStash path I checked (src/server/services/queue/impls/qstash.ts), scheduleMessage publishes extra fields under body.payload, but this handler only destructures resumeAsyncTool from the top-level body. The new sub-agent bridge schedules the parent resume as payload: { resumeAsyncTool: true }, so in production/QStash the resumed step receives resumeAsyncTool === undefined and never runs the new DB-refresh/clear-pending branch for a waiting_for_async_tool parent. Local queue was updated to spread payload, which masks this in local tests; the HTTP handler needs to unwrap/spread body.payload too.

codecov Bot commented Jun 4, 2026

Codecov Report

❌ Patch coverage is 83.79121% with 59 lines in your changes missing coverage. Please review.
✅ Project coverage is 70.69%. Comparing base (6f5a633) to head (5413206).
⚠️ Report is 2 commits behind head on canary.

Additional details and impacted files
@@            Coverage Diff             @@
##           canary   #15481      +/-   ##
==========================================
+ Coverage   70.64%   70.69%   +0.05%     
==========================================
  Files        3274     3275       +1     
  Lines      322959   323290     +331     
  Branches    29419    34247    +4828     
==========================================
+ Hits       228155   228563     +408     
+ Misses      94621    94544      -77     
  Partials      183      183              
| Flag | Coverage Δ | |
| --- | --- | --- |
| app | 61.40% <86.58%> (+0.10%) | ⬆️ |
| database | 92.47% <7.14%> (-0.08%) | ⬇️ |
| packages/agent-manager-runtime | 49.69% <ø> (ø) | |
| packages/agent-runtime | 81.06% <100.00%> (+0.01%) | ⬆️ |
| packages/builtin-tool-lobe-agent | 18.52% <ø> (ø) | |
| packages/context-engine | 84.19% <ø> (ø) | |
| packages/conversation-flow | 91.29% <ø> (ø) | |
| packages/device-gateway-client | 90.51% <ø> (ø) | |
| packages/eval-dataset-parser | 95.15% <ø> (ø) | |
| packages/eval-rubric | 76.11% <ø> (ø) | |
| packages/fetch-sse | 85.57% <ø> (-1.72%) | ⬇️ |
| packages/file-loaders | 87.89% <ø> (ø) | |
| packages/memory-user-memory | 74.99% <ø> (ø) | |
| packages/model-bank | 99.99% <ø> (ø) | |
| packages/model-runtime | 84.22% <ø> (ø) | |
| packages/prompts | 72.51% <ø> (ø) | |
| packages/python-interpreter | 92.90% <ø> (ø) | |
| packages/ssrf-safe-fetch | 0.00% <ø> (ø) | |
| packages/types | 35.38% <ø> (ø) | |
| packages/utils | 84.98% <ø> (ø) | |
| packages/web-crawler | 88.08% <ø> (ø) | |

| Components | Coverage Δ | |
| --- | --- | --- |
| Store | 68.40% <100.00%> (+<0.01%) | ⬆️ |
| Services | 54.77% <ø> (ø) | |
| Server | 72.02% <86.43%> (+0.19%) | ⬆️ |
| Libs | 54.62% <ø> (+0.41%) | ⬆️ |
| Utils | 81.71% <ø> (ø) | |
arvinxx and others added 9 commits June 6, 2026 14:17
…ferred tools

Add a dedicated `waiting_for_async_tool` operation status that mirrors
`waiting_for_human` as a non-terminal, resumable pause, and migrate the
client-tool execution pause off `interrupted` onto it — so `interrupted`
once again means only user-initiated cancellation.

Also add the AgentOperationModel primitives the upcoming server sub-agent
bridge needs: queryByParentOperationId (reconcile child ops) and
tryResumeFromAsyncTool (atomic single-fire CAS).

Foundation for the server sub-agent suspend/resume mechanism (LOBE-9763).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
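
As a rough illustration of what "atomic single-fire CAS" means here, the sketch below uses an in-memory map in place of the operation table. `casResume` and the `running` status are hypothetical; in a real database the same idea would presumably be a single conditional update that only matches rows still in `waiting_for_async_tool`, with the affected-row count deciding the winner.

```typescript
type OperationStatus = 'running' | 'waiting_for_async_tool' | 'done';

// In-memory stand-in for the operation table (hypothetical).
type OperationStore = Map<string, OperationStatus>;

// Compare-and-set: flip waiting_for_async_tool -> running and report whether
// this caller performed the flip. Only the first caller can win.
function casResume(store: OperationStore, operationId: string): boolean {
  if (store.get(operationId) !== 'waiting_for_async_tool') return false;
  store.set(operationId, 'running');
  return true;
}
```

If two child completions race, both may pass the barrier, but only the one whose `casResume` returns `true` should schedule the parent's resume step.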
…predicates

Replace the repeated `status === 'waiting_for_human' || ... === 'waiting_for_async_tool' || ... === 'interrupted'`
chains with named predicates so the parked/blocked semantics live in one place
(runtime step-loop break, completion lifecycle completedAt, executeSync pause,
operation isActive).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
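
For readers who have not opened the diff, the predicates might look something like the sketch below. The commit does not spell out which statuses each predicate covers, so the grouping here is an assumption: "parked" is taken to mean the two resumable waits, and "blocked" to mean parked or `interrupted`. The `running` and `error` statuses are also placeholders.

```typescript
type OperationStatus =
  | 'running'
  | 'waiting_for_human'
  | 'waiting_for_async_tool'
  | 'interrupted'
  | 'done'
  | 'error';

// ASSUMPTION: "parked" means a non-terminal, resumable pause.
const PARKED_STATUSES: readonly OperationStatus[] = [
  'waiting_for_human',
  'waiting_for_async_tool',
];

const isParkedStatus = (status: OperationStatus): boolean =>
  PARKED_STATUSES.includes(status);

// ASSUMPTION: "blocked" means parked, or cancelled by the user.
const isBlockedStatus = (status: OperationStatus): boolean =>
  isParkedStatus(status) || status === 'interrupted';
```

Call sites such as the step-loop break can then read `if (isBlockedStatus(status)) break;` instead of repeating the three-way comparison.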
Full rename of the service method, its `ExecSubAgentTaskParams`/`ExecSubAgentTaskResult`
types, the tRPC endpoint, the injected `RuntimeExecutorContext`/`AgentRuntimeServiceOptions`
callback, and tests. Group-mode `execGroupSubAgent*` identifiers are intentionally left
untouched. Prep for the server sub-agent suspend/resume work (LOBE-9763).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Introduce a generic `deferred` result flag (BuiltinServerRuntimeOutput /
ToolExecutionResult). When a tool returns deferred, call_tool parks the
operation (waiting_for_async_tool + pendingToolsCalling) without writing a
tool_result — mirroring the client-tool pause — so the result can be
delivered out-of-band later by a completion bridge. Thread the existing
execSubAgentTask DI seam into ToolExecutionContext so async tools can spawn
a child op without a circular import.

Part of the server sub-agent suspend/resume mechanism (LOBE-9763).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
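
The deferred-park rule described in this commit can be sketched as a small state transition. Everything below is illustrative: `applyToolResult` and the trimmed-down `OperationState` are hypothetical, and only the `deferred` flag, `waiting_for_async_tool`, and `pendingToolsCalling` come from the commit message.

```typescript
interface ToolCall {
  id: string;
  name: string;
}

interface ToolExecutionResult {
  content?: string;
  deferred?: boolean;
}

interface OperationState {
  status: 'running' | 'waiting_for_async_tool';
  pendingToolsCalling: ToolCall[];
  toolResults: { toolCallId: string; content: string }[];
}

function applyToolResult(
  state: OperationState,
  call: ToolCall,
  result: ToolExecutionResult,
): OperationState {
  if (result.deferred) {
    // Park: remember the pending call and write no tool_result yet.
    return {
      ...state,
      status: 'waiting_for_async_tool',
      pendingToolsCalling: [...state.pendingToolsCalling, call],
    };
  }
  // Ordinary tool: record the result inline and keep running.
  return {
    ...state,
    toolResults: [...state.toolResults, { toolCallId: call.id, content: result.content ?? '' }],
  };
}
```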
Mirror the call_tool deferred-park on the parallel path: deferred (async)
tools are collected during the concurrent batch and, once server tools
settle, the operation parks (waiting_for_async_tool + pendingToolsCalling)
alongside any client tools — so K parallel sub-agents in one round all
resolve before the parent resumes.

Part of the server sub-agent suspend/resume mechanism (LOBE-9763).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
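
On the batch path, the same idea amounts to partitioning the settled outcomes. The sketch below is an assumption-laden outline rather than the PR's code: `settleBatch`, `ToolOutcome`, and `BatchSettlement` are hypothetical names, and client tools are left out for brevity.

```typescript
interface ToolOutcome {
  toolCallId: string;
  deferred: boolean;
  content?: string;
}

interface BatchSettlement {
  status: 'running' | 'waiting_for_async_tool';
  pendingToolsCalling: string[];
  toolResults: { toolCallId: string; content: string }[];
}

// After the concurrent batch settles, inline results are written immediately
// and every deferred call is parked together under one pending list.
function settleBatch(outcomes: ToolOutcome[]): BatchSettlement {
  const pendingToolsCalling = outcomes.filter((o) => o.deferred).map((o) => o.toolCallId);
  const toolResults = outcomes
    .filter((o) => !o.deferred)
    .map((o) => ({ toolCallId: o.toolCallId, content: o.content ?? '' }));
  return {
    status: pendingToolsCalling.length > 0 ? 'waiting_for_async_tool' : 'running',
    pendingToolsCalling,
    toolResults,
  };
}
```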
Turn the server `callSubAgent` path from fire-and-forget into a real
deferred-tool suspend/resume loop (LOBE-9763 Phase 2):

- lobeAgent server runtime: add `callSubAgent` executor returning a
  `deferred` result via an injected `ctx.subAgent` runner
- RuntimeExecutors: build a per-tool-call server sub-agent runner that
  creates the pending placeholder tool message (anchoring the isolation
  thread) and kicks off the child op
- aiAgent.execSubAgentTask: register an onComplete bridge hook that
  backfills the placeholder and resumes the parent
- AgentRuntimeService: `tryResumeParentFromAsyncTool` (barrier over
  pendingToolsCalling + single-fire CAS + schedule), `refreshMessagesFromDB`,
  and the `resumeAsyncTool` branch in executeStep
- queue/local: forward `payload` to the execution callback so local/in-memory
  resumes (and human-approval) no longer drop their signal

Tests: callSubAgent executor unit tests, tryResumeParentFromAsyncTool
barrier/CAS unit tests, and a server suspend/resume integration test.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
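
The `resumeAsyncTool` branch is described only in prose, so the following is a guess at its shape: reload the conversation (which now includes the backfilled tool message), clear the pending list, and let the step loop continue. `resumeFromAsyncTool` and the synchronous `loadMessages` callback are hypothetical; the real refresh reads from the database and is presumably asynchronous.

```typescript
interface Message {
  role: 'user' | 'assistant' | 'tool';
  content: string;
}

interface OperationState {
  status: 'running' | 'waiting_for_async_tool';
  messages: Message[];
  pendingToolsCalling: string[];
}

// Rebuild the parent's state for the resumed step.
function resumeFromAsyncTool(
  state: OperationState,
  loadMessages: () => Message[],
): OperationState {
  return {
    ...state,
    messages: loadMessages(), // pick up the backfilled tool result
    pendingToolsCalling: [], // nothing is outstanding any more
    status: 'running',
  };
}
```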
The async sub-agent resume reuses the SAME operationId, but dispatchHooks
fired onComplete and unregistered all hooks on every non-continue step —
including the waiting_for_async_tool park. That made completion consumers
(webhooks, bot promises, eval snapshots) fire prematurely on the park and
miss the real terminal state after resume.

For waiting_for_async_tool, persist the parked status (the resume CAS reads
it) but skip onComplete and keep hooks registered, so the eventual resume
under the same op still notifies consumers. waiting_for_human is unchanged
(its resume runs under a new operationId).

Found via the server-subagent agent-eval (real LLM, in-memory runtime):
parent now correctly reaches `done` after the sub-op completes.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
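
The hook behavior after this fix can be sketched as below. `HookRegistry` is a hypothetical stand-in for the hook bookkeeping in CompletionLifecycle; persisting the parked status is omitted, and only the keep-registered and skip-onComplete parts are shown.

```typescript
type StepStatus = 'done' | 'error' | 'waiting_for_human' | 'waiting_for_async_tool';
type CompletionHook = (status: StepStatus) => void;

class HookRegistry {
  hooks = new Map<string, CompletionHook[]>();

  register(operationId: string, hook: CompletionHook): void {
    const existing = this.hooks.get(operationId) ?? [];
    this.hooks.set(operationId, [...existing, hook]);
  }

  dispatch(operationId: string, status: StepStatus): void {
    // Async park: the same operationId resumes later, so stay silent
    // and keep the hooks registered for the real terminal state.
    if (status === 'waiting_for_async_tool') return;

    const hooks = this.hooks.get(operationId) ?? [];
    this.hooks.delete(operationId); // fire once, then unregister
    for (const hook of hooks) hook(status);
  }
}
```

Under the old behavior described above, the first `dispatch` on the park would have consumed the hooks, so a later `done` under the same operationId would have found none left to notify.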
QStashQueueServiceImpl nests resume/intervention fields under `body.payload`
(operationId/stepIndex/context stay top-level), but the runStep handler
destructured them from the top level. In production/QStash the resumed step
therefore saw `resumeAsyncTool` (and approvedToolCall/toolMessageId/…) as
undefined and never ran the waiting_for_async_tool DB-refresh/clear-pending
branch — the parent op would stay parked forever. The local queue spreads
payload itself, which masked this in local/eval runs.

Merge `body.payload` over the top-level body so both shapes work. Adds a
handler test asserting the QStash-nested payload reaches executeStep.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
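
A minimal sketch of the merge, assuming the body shapes described in this commit message. `normalizeRunStepBody` and `RunStepBody` are hypothetical names, not the handler's actual identifiers.

```typescript
interface RunStepBody {
  operationId: string;
  stepIndex: number;
  payload?: Record<string, unknown>;
  [key: string]: unknown;
}

// Merge body.payload over the top-level body so both the QStash-nested
// shape and the flat local shape yield the same flags.
function normalizeRunStepBody(body: RunStepBody): Record<string, unknown> {
  const { payload, ...topLevel } = body;
  return { ...topLevel, ...(payload ?? {}) };
}
```

Both `{ operationId, stepIndex, payload: { resumeAsyncTool: true } }` and `{ operationId, stepIndex, resumeAsyncTool: true }` then expose `resumeAsyncTool` at the top level, so the handler can destructure it the same way in either case.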
When a server callSubAgent child op fails to start, no completion bridge
ever fires, so the parent stayed parked in `waiting_for_async_tool`
forever. The runner now drops the placeholder and signals `started:false`
so callSubAgent surfaces an inline tool error instead of parking the
parent — the batch continues (or parks only for genuinely-deferred
siblings, whose barrier already counts this error result).

Also:
- add isParkedStatus/isBlockedStatus to the @lobechat/agent-runtime test
  mock — persistCompletion/getOperationStatus call isParkedStatus, so the
  missing export crashed dispatchHooks (swallowing onComplete) and
  getOperationStatus, failing 3 AgentRuntimeService tests.
- fix completion-bridge totalToolCalls path (finalState.session.toolCalls
  → finalState.usage.tools.totalCalls; the former never existed).
- remove dead AgentOperationModel.queryByParentOperationId (zero callers).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
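
The failed-start handling can be outlined as follows. This is a sketch under stated assumptions: `SubAgentRunner`, `ToolOutput`, and the error wording are hypothetical, the runner is synchronous here for brevity, and only the `started: false` signal and the `deferred` result come from the commit message.

```typescript
interface SubAgentRunResult {
  started: boolean;
  error?: string;
}

type SubAgentRunner = (instruction: string) => SubAgentRunResult;

interface ToolOutput {
  deferred?: boolean;
  success?: boolean;
  content?: string;
}

function callSubAgent(instruction: string, runner: SubAgentRunner): ToolOutput {
  const run = runner(instruction);
  if (!run.started) {
    // No child op exists, so no completion bridge will ever fire:
    // return an inline tool error instead of parking the parent.
    return {
      success: false,
      content: `Sub-agent failed to start: ${run.error ?? 'unknown error'}`,
    };
  }
  // Child op is running; its completion bridge will deliver the result later.
  return { deferred: true };
}
```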
@arvinxx arvinxx merged commit 04700be into canary Jun 6, 2026
33 of 35 checks passed
@arvinxx arvinxx deleted the feat/lobe-9763-subagent-bridge branch June 6, 2026 14:46
arvinxx added a commit that referenced this pull request Jun 10, 2026
# 🚀 LobeHub Release (20260610)

**Release Date:** June 10, 2026  
**Since v2.2.2:** 131 merged PRs · 13 contributors

> This weekly release strengthens agent collaboration across cloud,
desktop, CLI, and workspace flows, with steadier runtime behavior and a
broader foundation for workspace-scoped data.

---

## ✨ Highlights

- **Agent execution across devices** — Unifies per-device working
directories, project skill discovery, and sub-agent suspend/resume
behavior across server, QStash, and device RPC flows. (#15543, #15566,
#15481, #15620, #15591)
- **Connector and sandbox platform** — Expands connector permissions,
custom OAuth MCP connector onboarding, sandbox provider support, and
user-uploaded file sync into cloud sandbox runs. (#15463, #15546,
#15184, #15550)
- **Desktop and CLI reliability** — Fixes desktop cold-start,
auto-update, Windows build, CLI skill discovery, and `lh connect` agent
dispatch paths. (#15547, #15525, #15527, #15562, #15632, #15634)
- **Pages and sharing** — Refreshes topic sharing, improves Page Editor
layout behavior, and routes Page Agent tool execution through the
server-side editor path. (#15581, #15556, #15588, #15023, #15610)
- **Model availability and provider updates** — Adds user-scoped LobeHub
model availability, Claude Fable 5, Qwen thinking preservation, and
MiniMax M3 updates. (#15590, #15639, #13494, #15376)

---

## 🏗️ Core Product & Architecture

### Agent Runtime & Heterogeneous Agents

- Improves sub-agent lifecycle handling, including async suspend/resume,
queue-mode QStash resume delivery, and blocking nested sub-agent calls.
(#15481, #15620, #15575)
- Stabilizes heterogeneous agent ingestion and streaming with raw stream
dumps, per-turn usage, image forwarding on regenerate, and
duplicate-text fixes. (#15602, #15577, #15592, #15585)
- Adds execution-device and working-directory controls across device
RPC, legacy defaults, and remote-spawned Claude Code sessions. (#15543,
#15566, #15591, #15572)
- Improves runtime diagnostics and compatibility, including Gemini
multimodal output capture, abort stream semantics, and trace quality
analysis. (#15535, #13677, #15508)

---

## 📱 Platforms, Integrations & UX

### Connectors, Sandbox & Tools

- Ships API-level connector tool permissions, custom OAuth MCP connector
onboarding, and connector-first runtime execution. (#15463, #15546)
- Adds sandbox provider support, cloud sandbox file sync, and safer
external URL file input handling with SSRF validation. (#15184, #15550,
#12657)
- Improves tool visibility and execution with pinned app-fixed tools,
ANSI output rendering, gateway-tunneled MCP calls, and automatic
headless tool runs. (#15509, #15516, #15469, #15492)

### Desktop, CLI & Web UX

- Restores desktop startup and reload behavior, preserves IPC error
causes, and keeps the tab bar new-tab action visible across routes.
(#15547, #15597, #15638)
- Fixes desktop update and build stability for browser quit guards,
macOS update signing, and Windows Visual Studio detection. (#15525,
#15527, #15562)
- Shows the plan-limit upgrade UI on desktop builds. (#15628)
- Adds the Agent Run delivery checker and fixes CLI device dispatch plus
skill list/search output. (#15489, #15634, #15632)
- Refreshes onboarding, auth source preservation, topic UI states,
referral/Fable campaign copy, and chat-input control bar behavior.
(#15629, #15544, #15573, #15614, #15616, #15617, #15622, #15643)

---

## 🔒 Security, Reliability & Rollout Notes

- External URL file input now includes SSRF validation for safer Google
file handling. (#12657)
- Database workspace-scope migrations are part of this release;
self-hosted operators should run the normal migration path before
serving the updated app. (#15446, #15465, #15468, #15472)
- The release branch was re-cut from `canary` and includes the latest
`main` release-version commit so `v2.2.2` is the verified compare base.

---

## 👥 Contributors

@ONLY-yours, @sxjeru, @hardy-one, @xujingli, @hezhijie0327, @Coooolfan,
@arvinxx, @tjx666, @Innei, @rivertwilight, @rdmclin2, @cy948,
@AmAzing129

**Full Changelog**:
v2.2.2...release/weekly-20260610-recut-3