✨ feat(agent-runtime): server-side human approval flow (resume approve / reject / reject_continue) #13829

Merged: arvinxx merged 4 commits into canary from feat/server-human-approval on Apr 15, 2026

@arvinxx arvinxx commented Apr 14, 2026

Summary

Implements the server-side human approval flow so execServerAgentRuntime can correctly pause on waiting_for_human and resume on approve / reject / reject_continue. Mirrors the client-mode executors one-to-one — no new design.

Fixes LOBE-7151

What changes

  • request_human_approve executor (src/server/modules/AgentRuntime/RuntimeExecutors.ts):
    • Creates one role='tool' message per pending tool call with pluginIntervention: { status: 'pending' }
    • Honors skipCreateToolMessage (resumption path): looks up existing tool messages by tool_call_id instead
    • Ships the { toolCallId → toolMessageId } map on the tools_calling stream chunk so clients can render approval UI without waiting for agent_runtime_end
  • call_tool executor (same file):
    • Adds a skipCreateToolMessage branch → updateToolMessage on the pre-existing pending row instead of inserting a new one
    • Prevents duplicate tool_call_id rows and parent_id FK violations on approve (directly eliminates the LOBE-7154 class of errors on the approval path)
  • AgentRuntimeService.handleHumanIntervention (src/server/services/agentRuntime/AgentRuntimeService.ts):
    • approve → persist intervention=approved + return phase: 'human_approved_tool' with skipCreateToolMessage: true
    • reject (pure) → persist intervention=rejected + transition to interrupted with reason='human_rejected'
    • reject_continue → persist intervention=rejected + return phase: 'user_input' (LLM sees the rejection as user feedback)
    • Batch pending support: keeps waiting_for_human until the last tool is resolved
  • Wire-through: toolMessageId and rejectAndContinue now flow from the tRPC schema (aiAgent.processHumanIntervention) through the QStash payload, the /run route body, AgentExecutionParams, and executeStep into handleHumanIntervention. A new reject_continue action cleanly separates reject-and-continue from a pure reject.
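
The two executor changes above can be illustrated with a minimal sketch. It is not the code from RuntimeExecutors.ts: `InMemoryMessageStore`, `requestHumanApprove`, and `callTool` are hypothetical stand-ins for the real message model and executors, and the empty `content` on pending rows is an assumption. The sketch only shows the shape of the idea: one pending tool message per tool call, a `{ toolCallId → toolMessageId }` map, and an in-place update on resume instead of a second insert.

```typescript
// Hypothetical in-memory stand-in for the messages table (not the real model).
type InterventionStatus = 'pending' | 'approved' | 'rejected';

interface ToolMessage {
  id: string;
  role: 'tool';
  tool_call_id: string;
  content: string;
  pluginIntervention?: { status: InterventionStatus };
}

class InMemoryMessageStore {
  private rows: ToolMessage[] = [];
  private nextId = 1;

  create(toolCallId: string, content: string, status?: InterventionStatus): ToolMessage {
    const row: ToolMessage = {
      id: `msg_${this.nextId++}`,
      role: 'tool',
      tool_call_id: toolCallId,
      content,
      ...(status ? { pluginIntervention: { status } } : {}),
    };
    this.rows.push(row);
    return row;
  }

  findByToolCallId(toolCallId: string): ToolMessage | undefined {
    return this.rows.find((row) => row.tool_call_id === toolCallId);
  }

  update(id: string, content: string): void {
    const row = this.rows.find((candidate) => candidate.id === id);
    if (!row) throw new Error(`tool message ${id} not found`);
    row.content = content;
  }

  count(): number {
    return this.rows.length;
  }
}

// request_human_approve (sketch): create one pending row per tool call,
// or look up the existing rows when skipCreateToolMessage is set.
function requestHumanApprove(
  store: InMemoryMessageStore,
  toolCallIds: string[],
  skipCreateToolMessage = false,
): Record<string, string> {
  const toolMessageIds: Record<string, string> = {};
  for (const toolCallId of toolCallIds) {
    const row = skipCreateToolMessage
      ? store.findByToolCallId(toolCallId)
      : store.create(toolCallId, '', 'pending'); // empty content is an assumption
    if (row) toolMessageIds[toolCallId] = row.id;
  }
  return toolMessageIds;
}

// call_tool (sketch): on the resume path, update the pending row in place
// so no second row with the same tool_call_id is inserted.
function callTool(
  store: InMemoryMessageStore,
  toolCallId: string,
  result: string,
  skipCreateToolMessage = false,
): string {
  if (skipCreateToolMessage) {
    const existing = store.findByToolCallId(toolCallId);
    if (existing) {
      store.update(existing.id, result);
      return existing.id;
    }
  }
  return store.create(toolCallId, result).id;
}
```

In this sketch, calling `callTool(store, id, result, true)` after `requestHumanApprove` leaves the row count unchanged, which is the property that avoids duplicate `tool_call_id` rows on the approve path.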

What's NOT in this PR

The frontend pieces (conversationControl.ts server-mode branches, gatewayEventHandler chunk consumption, gateway.ts operation metadata) will land in a follow-up PR so this change can be reviewed and shipped as a self-contained backend increment. The backend wire is backward compatible — legacy callers that don't send toolMessageId no-op cleanly inside the handler.

Test plan

  • bunx vitest run --silent=passed-only src/server/modules/AgentRuntime/__tests__/RuntimeExecutors.test.ts — 82 tests pass (12 new across request_human_approve + call_tool skip branch)
  • bunx vitest run --silent=passed-only src/server/services/agentRuntime/__tests__/handleHumanIntervention.test.ts — 18 new tests pass (approve / pure reject / reject_continue / no-op paths)
  • bunx vitest run --silent=passed-only src/server/services/agentRuntime/__tests__/executeStep.test.ts — 18 tests pass (regression check on transit)
  • bun run type-check — clean for this diff (remaining errors are pre-existing: @dnd-kit, hono, larksuite, workflow memory routes)
  • Integration smoke: trigger a tool requiring approval in a server-mode session, confirm pending row created in DB, approve via tRPC, confirm tool executes and call_tool updates rather than inserts a second row

🤖 Generated with Claude Code

vercel Bot commented Apr 14, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| ------- | ---------- | ------- | ------------- |
| lobehub | Ready | Preview, Comment | Apr 15, 2026 2:43am |

@sourcery-ai sourcery-ai Bot left a comment

We've reviewed this pull request using the Sourcery rules engine

codecov Bot commented Apr 14, 2026

Codecov Report

❌ Patch coverage is 84.16290% with 35 lines in your changes missing coverage. Please review.
✅ Project coverage is 66.76%. Comparing base (1a98e1b) to head (50cec0c).
⚠️ Report is 2 commits behind head on canary.

Additional details and impacted files
@@            Coverage Diff             @@
##           canary   #13829      +/-   ##
==========================================
+ Coverage   66.70%   66.76%   +0.06%     
==========================================
  Files        2031     2032       +1     
  Lines      172508   172782     +274     
  Branches    20197    17734    -2463     
==========================================
+ Hits       115064   115360     +296     
+ Misses      57320    57298      -22     
  Partials      124      124              
Flag Coverage Δ
app 58.98% <84.16%> (+0.10%) ⬆️
database 92.46% <ø> (ø)
packages/agent-runtime 79.72% <ø> (ø)
packages/context-engine 83.38% <ø> (ø)
packages/conversation-flow 92.36% <ø> (ø)
packages/file-loaders 87.02% <ø> (ø)
packages/memory-user-memory 74.74% <ø> (ø)
packages/model-bank 99.86% <ø> (ø)
packages/model-runtime 84.20% <ø> (ø)
packages/prompts 69.24% <ø> (ø)
packages/python-interpreter 92.90% <ø> (ø)
packages/ssrf-safe-fetch 0.00% <ø> (ø)
packages/utils 90.34% <ø> (ø)
packages/web-crawler 88.66% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
Store 66.00% <ø> (ø)
Services 52.19% <ø> (ø)
Server 66.78% <87.12%> (+0.24%) ⬆️
Libs 52.89% <ø> (ø)
Utils 91.12% <ø> (ø)

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 517504e360

Comment thread src/server/services/agentRuntime/AgentRuntimeService.ts Outdated
Comment thread src/server/modules/AgentRuntime/RuntimeExecutors.ts Outdated
return { newState: state, nextContext: undefined };
}

const rejectionContent = rejectionReason
arvinxx and others added 2 commits April 15, 2026 10:03
Port the client-mode human approval executors (request_human_approve,
call_tool resumption, handleHumanIntervention) to the server agent
runtime so that execServerAgentRuntime can correctly pause on
waiting_for_human and resume on approve / reject / reject_continue.

- request_human_approve now creates one `role='tool'` message per pending
  tool call with `pluginIntervention: { status: 'pending' }` and ships
  the `{ toolCallId → toolMessageId }` mapping on the `tools_calling`
  stream chunk.
- call_tool gains a `skipCreateToolMessage` branch that updates the
  pre-existing tool message in-place (prevents duplicate rows / parent_id
  FK violations that show up as LOBE-7154 errors).
- AgentRuntimeService.handleHumanIntervention implements all three
  paths: approve → `phase: 'human_approved_tool'`; reject → interrupted
  with `reason: 'human_rejected'`; reject_continue → `phase: 'user_input'`.
- ProcessHumanIntervention schema carries `toolMessageId` and a new
  `reject_continue` action; schema remains permissive (handler no-ops on
  missing toolMessageId) to keep legacy callers working.
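
As a rough illustration of the three paths listed above, the single-tool case can be written as a mapping from action to outcome. This is a sketch under assumptions, not the body of `handleHumanIntervention`: the `InterventionOutcome` shape and the `outcomeFor` helper are hypothetical names, and only the phase, reason, and status values come from the description.

```typescript
type InterventionAction = 'approve' | 'reject' | 'reject_continue';

// Hypothetical result shape; field names are assumptions for illustration.
interface InterventionOutcome {
  intervention: 'approved' | 'rejected';
  status?: 'interrupted';
  reason?: 'human_rejected';
  nextContext?:
    | { phase: 'human_approved_tool'; skipCreateToolMessage: true }
    | { phase: 'user_input' };
}

function outcomeFor(action: InterventionAction): InterventionOutcome {
  switch (action) {
    case 'approve':
      // The approved tool runs next and reuses the pending tool message.
      return {
        intervention: 'approved',
        nextContext: { phase: 'human_approved_tool', skipCreateToolMessage: true },
      };
    case 'reject':
      // Pure reject: the operation stops without resuming the LLM.
      return { intervention: 'rejected', status: 'interrupted', reason: 'human_rejected' };
    case 'reject_continue':
      // The rejection is handed to the LLM as user feedback.
      return { intervention: 'rejected', nextContext: { phase: 'user_input' } };
  }
}
```

Keeping `reject` and `reject_continue` as separate actions means the handler does not need an extra flag to decide whether the LLM should resume.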

Fixes LOBE-7151

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…2 duplicate tool msg)

P1 — reject_continue with remaining pending tools must NOT resume the LLM.
Previously `handleHumanIntervention` kept `status='waiting_for_human'` but
returned `nextContext: { phase: 'user_input' }`, which `executeStep` would
hand to `runtime.step` immediately, breaking batch semantics. Now when
other tools are still pending, the rejection is persisted but no context
is returned; the `user_input` continuation only fires when this is the
last pending tool.
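
The batch rule described in P1 can be sketched as follows. The names `PendingTool`, `RejectContinueResult`, and `rejectAndContinue` are hypothetical and the real handler works on persisted messages and runtime state; the sketch only captures the ordering: persist the rejection first, then return a `user_input` context only if nothing is left pending. The early return for a missing `toolMessageId` mirrors the legacy no-op mentioned in the first commit.

```typescript
type InterventionStatus = 'pending' | 'approved' | 'rejected';

interface PendingTool {
  toolMessageId: string;
  status: InterventionStatus;
}

// Hypothetical result shape for illustration only.
interface RejectContinueResult {
  stillWaiting: boolean;
  nextContext?: { phase: 'user_input' };
}

function rejectAndContinue(tools: PendingTool[], toolMessageId?: string): RejectContinueResult {
  const hasPending = () => tools.some((tool) => tool.status === 'pending');

  // Legacy callers that send no toolMessageId: change nothing.
  if (!toolMessageId) return { stillWaiting: hasPending() };

  // Persist the rejection for the targeted tool.
  const target = tools.find((tool) => tool.toolMessageId === toolMessageId);
  if (target) target.status = 'rejected';

  // Other tools still pending: stay in waiting_for_human, return no context.
  if (hasPending()) return { stillWaiting: true };

  // Last pending tool resolved: let the LLM continue with the rejection as input.
  return { stillWaiting: false, nextContext: { phase: 'user_input' } };
}
```

With two pending tools, the first call returns no context and the second returns `{ phase: 'user_input' }`, so the LLM resumes once per batch rather than once per rejection.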

P2 — request_human_approve was pushing an empty placeholder
`{ role: 'tool', tool_call_id, content: '' }` into `newState.messages`
to "reflect" the newly-created pending DB row. On resume, the `call_tool`
skip-create path appends the real tool result, leaving two entries for
the same `tool_call_id` in runtime state. The downstream short-circuit
(`phase=human_approved_tool` → `call_tool`) doesn't consult
state.messages, so the placeholder served no purpose. Removed.

Also fixes a TS 2339 in the skipCreateToolMessage test where
`nextContext.payload` is typed `{}` and needed an explicit cast.

Tests: 99 pass (82 RuntimeExecutors + 17 handleHumanIntervention), type-check clean.
Verified end-to-end via the human-approval eval — it now exercises a
multi-turn retry path (LLM calls the gated tool twice) and both
approvals resolve cleanly through to `completionReason=done`.

Relates to LOBE-7151

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…-pdf/svg

@react-pdf/image@3.1.0 (auto-resolved via layout@4.6.0 ← renderer@4.4.1)
declares `@react-pdf/svg@^1.1.0` as a dependency, but the svg package was
unpublished/made private on npm (returns 404). CI installs blow up with
ERR_PNPM_FETCH_404.

Upstream issue: diegomura/react-pdf#3377

Pin image to 3.0.4 (the last release before the broken svg dep was
introduced) via pnpm.overrides until react-pdf publishes a fix.
@arvinxx arvinxx merged commit 9f61b58 into canary Apr 15, 2026
34 checks passed
@arvinxx arvinxx deleted the feat/server-human-approval branch April 15, 2026 03:07
canisminor1990 added a commit that referenced this pull request Apr 16, 2026
# 🚀 LobeHub v2.1.50 (20260416)

**Release Date:** April 16, 2026\
**Since v2.1.49:** 107 commits · 101 merged PRs · 13 contributors

> This weekly release focuses on improving runtime stability and gateway
execution consistency, while making Home/Recents workflows faster to
navigate and easier to manage in daily use.

---

## ✨ Highlights

- **Server-side Human Approval Flow** — Agent runtime now supports more
reliable approve/reject/reject-continue handling in gateway mode,
reducing stalled execution paths in long-running tasks. (#13829, #13863,
#13873)

- **Message Gateway End-to-End Hardening** — Gateway message flow, queue
handling, tool callback routing, and stop interruption behavior were
strengthened for better execution continuity. (#13761, #13816, #13820,
#13815)

- **Client Tool Execution in Gateway Mode** — Client-executor tools now
run more predictably across gateway and desktop callers, with improved
executor dispatch behavior. (#13792, #13790)

- **Home / Recents / Sidebar Upgrade** — Sidebar layout, custom sort,
recents operations, and profile actions were improved to reduce
navigation friction in active sessions. (#13719, #13812, #13723, #13739,
#13878, #13734)

- **Agent Workspace and Documents Expansion** — Working panel and agent
document workflows were expanded and polished for better day-to-day
agent operations. (#13766, #13857)

- **Provider and Model Compatibility Improvements** — Added GLM-5.1
support and refined model/provider edge-case handling, including schema
and error-path fixes. (#13757, #13806, #13736, #13740)

---

## 🏗️ Core Agent & Architecture

### Agent runtime and intervention lifecycle

- Added server-side human approval and improved runtime coordination
across approve/reject decision paths. (#13829, #13863)
- Improved interrupted-task handling and operation lifecycle consistency
to reduce half-finished runtime states. (#13714)
- Refined error classification and payload propagation so downstream
surfaces receive clearer actionable errors. (#13736, #13740)

### Execution model and dispatch behavior

- Introduced executor-aware runtime behavior to better separate
client/server tool execution semantics. (#13758)
- Improved tool/plugin resolution and manifest handling to avoid runtime
failures on malformed inputs. (#13856, #13840, #13807)

---

## 📱 Gateway & Platform Integrations

- Added message gateway support and strengthened queue/error behavior
for more stable cross-channel execution. (#13761, #13816, #13820)
- Improved gateway callback pipeline with protocol and API additions for
`tool_execute` / `tool_result`. (#13762, #13764, #13765)
- Improved bot/channel reliability and DM/slash handling in
Discord-related paths. (#13805, #13724)

---

## 🖥️ CLI & User Experience

- Improved CLI reliability across message/topic operations and
build/minify-related paths. (#13731, #13888)
- Added image-to-video options and improved command behavior for
generation workflows. (#13788)
- Improved desktop runtime behavior for remote fetch and Linux
notification urgency handling. (#13789, #13782)

---

## 🔧 Tooling

- Extracted gateway stream client into `@lobechat/agent-gateway-client`
to centralize protocol usage and reduce duplication. (#13866)
- Improved built-in tool coverage and runtime support, including GTD
server runtime and missing lobe-kb tools. (#13854, #13876)
- Updated skill and frontmatter consistency in workflow tooling.
(#13730)

---

## 🔒 Security & Reliability

- **Security:** Strengthened API key WS auth behavior and safer
serverUrl forwarding in gateway-related auth paths. (#13824)
- **Reliability:** Reduced runtime stalls by improving gateway
stop/interrupt and approval-state routing behavior. (#13815, #13863,
#13873)
- **Reliability:** Added defensive guards for malformed tool manifests
and non-string content edge cases. (#13856, #13753)

---

## 👥 Contributors

**101 merged PRs** from **13 contributors** across **107 commits**.

### Community Contributors

- @arvinxx - Runtime, gateway, and execution reliability improvements
- @Innei - Navigation, workflow UX, and desktop/CLI refinements
- @rdmclin2 - Sidebar, recents, and channel behavior updates
- @ONLY-yours - Tooling/runtime fixes and model execution compatibility
- @tjx666 - Model support and release/tooling maintenance
- @nekomeowww - Memory and search-path stability fixes
- @cy948 - CLI indexing and command flow fixes
- @octo-patch - Local system runtime edge-case fixes
- @djthread - Desktop runtime request reliability improvements
- @rivertwilight - Documentation and changelog updates
- @sudongyuer - Subscription/mobile support improvements
- @Zhouguanyang - Provider/model configuration correctness fixes
- @lobehubbot - Translation and maintenance automation support

---

**Full Changelog**: v2.1.49...v2.1.50