✨ feat(agent-runtime): server callSubAgent async suspend/resume #15481
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 2e35104d51
approvedToolCall,
rejectionReason,
rejectAndContinue,
resumeAsyncTool,
Read async resume flags from queued payload
In the QStash path I checked (src/server/services/queue/impls/qstash.ts), scheduleMessage publishes extra fields under body.payload, but this handler only destructures resumeAsyncTool from the top-level body. The new sub-agent bridge schedules the parent resume as payload: { resumeAsyncTool: true }, so in production/QStash the resumed step receives resumeAsyncTool === undefined and never runs the new DB-refresh/clear-pending branch for a waiting_for_async_tool parent. Local queue was updated to spread payload, which masks this in local tests; the HTTP handler needs to unwrap/spread body.payload too.
Codecov Report ❌
@@ Coverage Diff @@
## canary #15481 +/- ##
==========================================
+ Coverage 70.64% 70.69% +0.05%
==========================================
Files 3274 3275 +1
Lines 322959 323290 +331
Branches 29419 34247 +4828
==========================================
+ Hits 228155 228563 +408
+ Misses 94621 94544 -77
Partials 183 183
…ferred tools Add a dedicated `waiting_for_async_tool` operation status that mirrors `waiting_for_human` as a non-terminal, resumable pause, and migrate the client-tool execution pause off `interrupted` onto it — so `interrupted` once again means only user-initiated cancellation. Also add the AgentOperationModel primitives the upcoming server sub-agent bridge needs: queryByParentOperationId (reconcile child ops) and tryResumeFromAsyncTool (atomic single-fire CAS). Foundation for the server sub-agent suspend/resume mechanism (LOBE-9763). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
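The "atomic single-fire CAS" named in this commit can be sketched as a compare-and-set on the operation status. This is a minimal in-memory illustration, not the real `AgentOperationModel`: the `InMemoryOperationStore` class and the status list are assumptions, and a database-backed version would more likely use a conditional update and check the affected row count.

```typescript
// Hypothetical status union; only the names mentioned in the commit are certain.
type OperationStatus =
  | 'running'
  | 'waiting_for_human'
  | 'waiting_for_async_tool'
  | 'interrupted'
  | 'done';

// Hypothetical in-memory stand-in for the operation table.
class InMemoryOperationStore {
  private statuses: Record<string, OperationStatus> = {};

  set(operationId: string, status: OperationStatus): void {
    this.statuses[operationId] = status;
  }

  get(operationId: string): OperationStatus | undefined {
    return this.statuses[operationId];
  }

  // Compare-and-set: flip to `running` only while the op is still parked.
  // The first caller wins; every later caller sees `running` and gets false.
  tryResumeFromAsyncTool(operationId: string): boolean {
    if (this.statuses[operationId] !== 'waiting_for_async_tool') return false;
    this.statuses[operationId] = 'running';
    return true;
  }
}
```

Only the caller that receives `true` should schedule the resume step, which is what keeps the resume single-fire when several completions race.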
…predicates Replace the repeated `status === 'waiting_for_human' || ... === 'waiting_for_async_tool' || ... === 'interrupted'` chains with named predicates so the parked/blocked semantics live in one place (runtime step-loop break, completion lifecycle completedAt, executeSync pause, operation isActive). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
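A rough sketch of what such named predicates could look like follows. The exact grouping is an assumption inferred from the commit message (parked means a resumable pause; blocked means parked or `interrupted`); the real definitions in `@lobechat/agent-runtime` may differ.

```typescript
// Hypothetical status union for illustration.
type OpStatus =
  | 'running'
  | 'waiting_for_human'
  | 'waiting_for_async_tool'
  | 'interrupted'
  | 'done'
  | 'error';

// Assumed meaning: a non-terminal, resumable pause.
const isParkedStatus = (status: OpStatus): boolean =>
  status === 'waiting_for_human' || status === 'waiting_for_async_tool';

// Assumed meaning: the step loop must stop, either parked or user-cancelled.
const isBlockedStatus = (status: OpStatus): boolean =>
  isParkedStatus(status) || status === 'interrupted';
```

Call sites such as the step-loop break can then test `isBlockedStatus(status)` instead of repeating the three-way comparison.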
Full rename of the service method, its `ExecSubAgentTaskParams`/`ExecSubAgentTaskResult` types, the tRPC endpoint, the injected `RuntimeExecutorContext`/`AgentRuntimeServiceOptions` callback, and tests. Group-mode `execGroupSubAgent*` identifiers are intentionally left untouched. Prep for the server sub-agent suspend/resume work (LOBE-9763). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
This reverts commit f1ea407.
Introduce a generic `deferred` result flag (BuiltinServerRuntimeOutput / ToolExecutionResult). When a tool returns deferred, call_tool parks the operation (waiting_for_async_tool + pendingToolsCalling) without writing a tool_result — mirroring the client-tool pause — so the result can be delivered out-of-band later by a completion bridge. Thread the existing execSubAgentTask DI seam into ToolExecutionContext so async tools can spawn a child op without a circular import. Part of the server sub-agent suspend/resume mechanism (LOBE-9763). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
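The deferred-park behaviour described here can be illustrated with a small reducer. All type and function names below (`SketchToolResult`, `SketchOpState`, `applyToolResult`) are hypothetical; only the `deferred` flag, `waiting_for_async_tool`, and `pendingToolsCalling` come from the commit message.

```typescript
interface SketchToolResult {
  content: string;
  deferred?: boolean; // true when the result will arrive out-of-band
}

interface SketchToolMessage {
  role: 'tool';
  toolCallId: string;
  content: string;
}

interface SketchOpState {
  status: 'running' | 'waiting_for_async_tool';
  pendingToolsCalling: string[];
  messages: SketchToolMessage[];
}

function applyToolResult(
  state: SketchOpState,
  toolCallId: string,
  result: SketchToolResult,
): SketchOpState {
  if (result.deferred) {
    // Park: record the pending call and write no tool_result message.
    return {
      ...state,
      status: 'waiting_for_async_tool',
      pendingToolsCalling: state.pendingToolsCalling.concat(toolCallId),
    };
  }
  // Normal path: append the tool_result and keep running.
  const message: SketchToolMessage = { role: 'tool', toolCallId, content: result.content };
  return { ...state, messages: state.messages.concat(message) };
}
```

The key property is that a deferred call leaves no tool message behind, so a later completion bridge is the only writer of that result.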
Mirror the call_tool deferred-park on the parallel path: deferred (async) tools are collected during the concurrent batch and, once server tools settle, the operation parks (waiting_for_async_tool + pendingToolsCalling) alongside any client tools — so K parallel sub-agents in one round all resolve before the parent resumes. Part of the server sub-agent suspend/resume mechanism (LOBE-9763). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Turn the server `callSubAgent` path from fire-and-forget into a real deferred-tool suspend/resume loop (LOBE-9763 Phase 2):
- lobeAgent server runtime: add `callSubAgent` executor returning a `deferred` result via an injected `ctx.subAgent` runner
- RuntimeExecutors: build a per-tool-call server sub-agent runner that creates the pending placeholder tool message (anchoring the isolation thread) and kicks off the child op
- aiAgent.execSubAgentTask: register an onComplete bridge hook that backfills the placeholder and resumes the parent
- AgentRuntimeService: `tryResumeParentFromAsyncTool` (barrier over pendingToolsCalling + single-fire CAS + schedule), `refreshMessagesFromDB`, and the `resumeAsyncTool` branch in executeStep
- queue/local: forward `payload` to the execution callback so local/in-memory resumes (and human-approval) no longer drop their signal
Tests: callSubAgent executor unit tests, tryResumeParentFromAsyncTool barrier/CAS unit tests, and a server suspend/resume integration test.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
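The "barrier over pendingToolsCalling + single-fire CAS + schedule" sequence can be sketched as follows. This is an in-memory approximation under stated assumptions: `ParentOpSketch`, `tryResumeParentSketch`, and the `schedule` callback are hypothetical names, and the real service reads state from the database rather than from a mutable object.

```typescript
interface ParentOpSketch {
  status: 'running' | 'waiting_for_async_tool' | 'done';
  pendingToolsCalling: string[];
}

function tryResumeParentSketch(
  parent: ParentOpSketch,
  backfilledToolCallIds: string[],
  schedule: (payload: { resumeAsyncTool: true }) => void,
): boolean {
  // Barrier: every pending tool call needs a backfilled result first.
  const allSettled = parent.pendingToolsCalling.every(
    (id) => backfilledToolCallIds.indexOf(id) !== -1,
  );
  if (!allSettled) return false;

  // Single-fire CAS: only the caller that observes the parked status proceeds.
  if (parent.status !== 'waiting_for_async_tool') return false;
  parent.status = 'running';

  // Schedule the resume step with the flag the step handler looks for.
  schedule({ resumeAsyncTool: true });
  return true;
}
```

With two parallel sub-agents, the first completion fails the barrier and returns `false`; the second passes it and schedules exactly one resume.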
The async sub-agent resume reuses the SAME operationId, but dispatchHooks fired onComplete and unregistered all hooks on every non-continue step — including the waiting_for_async_tool park. That made completion consumers (webhooks, bot promises, eval snapshots) fire prematurely on the park and miss the real terminal state after resume. For waiting_for_async_tool, persist the parked status (the resume CAS reads it) but skip onComplete and keep hooks registered, so the eventual resume under the same op still notifies consumers. waiting_for_human is unchanged (its resume runs under a new operationId). Found via the server-subagent agent-eval (real LLM, in-memory runtime): parent now correctly reaches `done` after the sub-op completes. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
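The hook behaviour this commit describes can be approximated with a small registry. The class and type names are hypothetical, and the sketch collapses persistence into an in-memory map; it only shows the ordering: persist always, but skip `onComplete` and keep hooks when the status is `waiting_for_async_tool`.

```typescript
type SettledStatus =
  | 'done'
  | 'error'
  | 'interrupted'
  | 'waiting_for_human'
  | 'waiting_for_async_tool';

type CompletionHook = (status: SettledStatus) => void;

class CompletionHookRegistry {
  private hooks: Record<string, CompletionHook[]> = {};
  readonly persisted: Record<string, SettledStatus> = {};

  register(operationId: string, hook: CompletionHook): void {
    this.hooks[operationId] = (this.hooks[operationId] || []).concat(hook);
  }

  dispatch(operationId: string, status: SettledStatus): void {
    // Always persist: the resume CAS reads the parked status.
    this.persisted[operationId] = status;

    // Async park reuses the same operationId later, so keep hooks registered
    // and do not notify completion consumers yet.
    if (status === 'waiting_for_async_tool') return;

    const hooks = this.hooks[operationId] || [];
    delete this.hooks[operationId];
    hooks.forEach((hook) => hook(status));
  }
}
```

Because the hooks survive the park, the eventual `done` under the same operationId still reaches webhooks and other consumers exactly once.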
QStashQueueServiceImpl nests resume/intervention fields under `body.payload` (operationId/stepIndex/context stay top-level), but the runStep handler destructured them from the top level. In production/QStash the resumed step therefore saw `resumeAsyncTool` (and approvedToolCall/toolMessageId/…) as undefined and never ran the waiting_for_async_tool DB-refresh/clear-pending branch — the parent op would stay parked forever. The local queue spreads payload itself, which masked this in local/eval runs. Merge `body.payload` over the top-level body so both shapes work. Adds a handler test asserting the QStash-nested payload reaches executeStep. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
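The merge described above, with `body.payload` spread over the top-level body, might look like the sketch below. `RunStepBodySketch` and `normalizeRunStepBody` are hypothetical names; the actual handler and its body type are not shown in this PR text.

```typescript
interface RunStepBodySketch {
  operationId: string;
  stepIndex: number;
  payload?: Record<string, unknown>;
  [key: string]: unknown;
}

// Accepts both the flat shape and the QStash-nested shape.
function normalizeRunStepBody(body: RunStepBodySketch): Record<string, unknown> {
  const { payload, ...rest } = body;
  // Payload fields are merged over the top-level fields, so nested values win.
  return { ...rest, ...(payload || {}) };
}
```

Usage: a QStash-style body `{ operationId, stepIndex, payload: { resumeAsyncTool: true } }` and a flat body `{ operationId, stepIndex, resumeAsyncTool: true }` both normalize to an object with a top-level `resumeAsyncTool`.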
d2a227f to 2335c4e
When a server callSubAgent child op fails to start, no completion bridge ever fires, so the parent stayed parked in `waiting_for_async_tool` forever. The runner now drops the placeholder and signals `started:false` so callSubAgent surfaces an inline tool error instead of parking the parent. The batch continues, or parks only for genuinely deferred siblings, whose barrier already counts this error result.
Also:
- add isParkedStatus/isBlockedStatus to the @lobechat/agent-runtime test mock. persistCompletion/getOperationStatus call isParkedStatus, so the missing export crashed dispatchHooks (swallowing onComplete) and getOperationStatus, failing 3 AgentRuntimeService tests.
- fix completion-bridge totalToolCalls path (finalState.session.toolCalls → finalState.usage.tools.totalCalls; the former never existed).
- remove dead AgentOperationModel.queryByParentOperationId (zero callers).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
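The start-failure path can be illustrated as a branch on the runner's outcome. The shapes below (`SubAgentStartSketch`, `CallSubAgentOutputSketch`, the `error` field, and the error wording) are assumptions for illustration; only `started:false` and the `deferred` flag come from the commit message.

```typescript
interface SubAgentStartSketch {
  started: boolean;
  error?: string; // hypothetical field carrying the failure reason
}

interface CallSubAgentOutputSketch {
  success: boolean;
  content: string;
  deferred?: boolean;
}

function callSubAgentSketch(run: () => SubAgentStartSketch): CallSubAgentOutputSketch {
  const outcome = run();
  if (!outcome.started) {
    // No child op exists, so no bridge will ever resume the parent:
    // return an inline error instead of parking.
    return {
      success: false,
      content: `Failed to start sub-agent: ${outcome.error ?? 'unknown error'}`,
    };
  }
  // Child op is running; the result is delivered later by the completion bridge.
  return { success: true, content: '', deferred: true };
}
```

Returning a plain, non-deferred error keeps the failed call out of the pending set, so it cannot hold the resume barrier open.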
# 🚀 LobeHub Release (20260610)

**Release Date:** June 10, 2026
**Since v2.2.2:** 131 merged PRs · 13 contributors

> This weekly release strengthens agent collaboration across cloud, desktop, CLI, and workspace flows, with steadier runtime behavior and a broader foundation for workspace-scoped data.

---

## ✨ Highlights

- **Agent execution across devices**: Unifies per-device working directories, project skill discovery, and sub-agent suspend/resume behavior across server, QStash, and device RPC flows. (#15543, #15566, #15481, #15620, #15591)
- **Connector and sandbox platform**: Expands connector permissions, custom OAuth MCP connector onboarding, sandbox provider support, and user-uploaded file sync into cloud sandbox runs. (#15463, #15546, #15184, #15550)
- **Desktop and CLI reliability**: Fixes desktop cold-start, auto-update, Windows build, CLI skill discovery, and `lh connect` agent dispatch paths. (#15547, #15525, #15527, #15562, #15632, #15634)
- **Pages and sharing**: Refreshes topic sharing, improves Page Editor layout behavior, and routes Page Agent tool execution through the server-side editor path. (#15581, #15556, #15588, #15023, #15610)
- **Model availability and provider updates**: Adds user-scoped LobeHub model availability, Claude Fable 5, Qwen thinking preservation, and MiniMax M3 updates. (#15590, #15639, #13494, #15376)

---

## 🏗️ Core Product & Architecture

### Agent Runtime & Heterogeneous Agents

- Improves sub-agent lifecycle handling, including async suspend/resume, queue-mode QStash resume delivery, and blocking nested sub-agent calls. (#15481, #15620, #15575)
- Stabilizes heterogeneous agent ingestion and streaming with raw stream dumps, per-turn usage, image forwarding on regenerate, and duplicate-text fixes. (#15602, #15577, #15592, #15585)
- Adds execution-device and working-directory controls across device RPC, legacy defaults, and remote-spawned Claude Code sessions. (#15543, #15566, #15591, #15572)
- Improves runtime diagnostics and compatibility, including Gemini multimodal output capture, abort stream semantics, and trace quality analysis. (#15535, #13677, #15508)

---

## 📱 Platforms, Integrations & UX

### Connectors, Sandbox & Tools

- Ships API-level connector tool permissions, custom OAuth MCP connector onboarding, and connector-first runtime execution. (#15463, #15546)
- Adds sandbox provider support, cloud sandbox file sync, and safer external URL file input handling with SSRF validation. (#15184, #15550, #12657)
- Improves tool visibility and execution with pinned app-fixed tools, ANSI output rendering, gateway-tunneled MCP calls, and automatic headless tool runs. (#15509, #15516, #15469, #15492)

### Desktop, CLI & Web UX

- Restores desktop startup and reload behavior, preserves IPC error causes, and keeps the tab bar new-tab action visible across routes. (#15547, #15597, #15638)
- Fixes desktop update and build stability for browser quit guards, macOS update signing, and Windows Visual Studio detection. (#15525, #15527, #15562)
- Shows the plan-limit upgrade UI on desktop builds. (#15628)
- Adds the Agent Run delivery checker and fixes CLI device dispatch plus skill list/search output. (#15489, #15634, #15632)
- Refreshes onboarding, auth source preservation, topic UI states, referral/Fable campaign copy, and chat-input control bar behavior. (#15629, #15544, #15573, #15614, #15616, #15617, #15622, #15643)

---

## 🔒 Security, Reliability & Rollout Notes

- External URL file input now includes SSRF validation for safer Google file handling. (#12657)
- Database workspace-scope migrations are part of this release; self-hosted operators should run the normal migration path before serving the updated app. (#15446, #15465, #15468, #15472)
- The release branch was re-cut from `canary` and includes the latest `main` release-version commit so `v2.2.2` is the verified compare base.

---

## 👥 Contributors

@ONLY-yours, @sxjeru, @hardy-one, @xujingli, @hezhijie0327, @Coooolfan, @arvinxx, @tjx666, @Innei, @rivertwilight, @rdmclin2, @cy948, @AmAzing129

**Full Changelog**: v2.2.2...release/weekly-20260610-recut-3
Summary
Implements server-side `callSubAgent` as a real async suspend/resume loop (LOBE-9763). Previously the server sub-agent path was fire-and-forget: the parent dispatched a child op and continued immediately with a "dispatched" acknowledgement, never seeing the sub-agent's actual answer. Now the parent parks (`waiting_for_async_tool`, no request held), the child op runs independently (QStash/local), and a completion bridge backfills the parent's tool message and resumes the parent so the LLM continues with the real result.

Key changes
- `waiting_for_async_tool` parked state (Phase 1 foundation): new non-terminal status plus `isParkedStatus`/`isBlockedStatus`; `client_tool_execution` migrated off `interrupted`.
- Deferred-tool parking in `RuntimeExecutors` (`call_tool`/`call_tools_batch`).
- `lobeAgent` server runtime `callSubAgent` executor → returns `deferred`.
- `ctx.subAgent` runner injected per tool call (owns the parent-message anchor and the child-op kickoff).
- Completion bridge (`aiAgent.execSubAgentTask` `onComplete`) → backfill + `AgentRuntimeService.tryResumeParentFromAsyncTool` (barrier + single-fire CAS + schedule) + `resumeAsyncTool` re-entry in `executeStep`.
- `CompletionLifecycle.dispatchHooks` previously fired `onComplete` and unregistered all hooks on the park; since the async resume reuses the same operationId, the real `done` never re-notified consumers. Now `waiting_for_async_tool` persists the parked status but keeps hooks registered and skips the premature `onComplete`.
- Local queue forwards `payload` to the execution callback (parity with the QStash body path).

Verification
- Unit tests: `lobeAgent.callSubAgent` (deferred contract), `tryResumeParentFromAsyncTool` (barrier / CAS / no-double-resume), `CompletionLifecycle` async-park hook behavior.
- Integration: `serverSubAgent.integration.test.ts` (in-memory runtime; runs in submodule CI).
- `agent-evals` (real tool exec, in-memory runtime): parent parks → child runs → bridge CAS resume → `done`, asserted on the backfilled tool message. Real-QStash cross-process run pending (separate cloud tooling PR adds an eval `--queue` mode for it).

Linear: LOBE-9763