🐛 fix(agent-runtime): sanitize invalid tool_call arguments to unbreak strict providers by arvinxx · Pull Request #14033 · lobehub/lobehub

arvinxx · 2026-04-22T05:24:05Z

Summary

Prevents a class of op terminations in which a single malformed `tool_calls[].arguments` JSON string (e.g. Qwen emitting `{, "description": ...}`) is persisted to `messages.tools[]`. Strict providers such as NVIDIA NIM then reject the entire history-bearing request with a 400 on every subsequent turn, wasting all accumulated tokens.

A new shared helper sanitizeToolCallArguments in @lobechat/utils is wired in at three layers:

Server entry — src/server/modules/AgentRuntime/RuntimeExecutors.ts onToolsCalling. Mirrors the frontend's internal_transformToolCalls so nothing new gets persisted with invalid JSON.
Outbound context build — packages/context-engine/src/processors/ToolCall.ts. Last line of defense for messages already persisted in DB before this fix.
Agent-runtime core — packages/agent-runtime/src/core/runtime.ts's call_tools_batch normalization path. Covers the old-format ToolsCalling[] branch.
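
As a rough illustration of the outbound context build layer listed above, the sketch below walks a message history and rewrites only the tool-call arguments that fail to parse. The `HistoryMessage` and `ToolCallRecord` shapes, and the names `toSafeArguments` and `sanitizeOutboundHistory`, are assumptions made for this example and are not taken from `ToolCall.ts`. The fallback here is reduced to the `"{}"` case only.

```typescript
// Hypothetical shapes: the real message and tool types in the repo may differ.
interface ToolCallRecord {
  id: string;
  apiName: string;
  arguments: string;
}

interface HistoryMessage {
  role: "user" | "assistant" | "tool";
  content: string;
  tools?: ToolCallRecord[];
}

// Reduced stand-in for the shared helper: valid JSON is returned untouched,
// anything else collapses to an empty object.
const toSafeArguments = (raw: string): string => {
  try {
    JSON.parse(raw);
    return raw;
  } catch {
    return "{}";
  }
};

// Returns the same object references for messages that need no change, so
// untouched history stays identical from turn to turn.
function sanitizeOutboundHistory(messages: HistoryMessage[]): HistoryMessage[] {
  return messages.map((message) => {
    if (!message.tools || message.tools.length === 0) return message;

    let changed = false;
    const tools = message.tools.map((tool) => {
      const safe = toSafeArguments(tool.arguments);
      if (safe === tool.arguments) return tool;
      changed = true;
      return { ...tool, arguments: safe };
    });

    return changed ? { ...message, tools } : message;
  });
}
```

Returning the original object when nothing changed is a design choice of this sketch. It keeps already-valid history identical, which matches the stated goal of not disturbing prompt-cache keys.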

Behavior

Input	Output	Rationale
Valid JSON	unchanged	Preserves prompt-cache keys
Truncated JSON (recoverable via `partial-json`)	re-stringified	Recovers as many fields as possible
Unrecoverable garbage	`"{}"`	Keeps `tool_call` structure; lets next turn replan (Option A from the original issue)
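
The three rows above can be sketched as a single function. This is a minimal illustration under stated assumptions, not the helper shipped in `@lobechat/utils`: the name `sanitizeToolCallArgumentsSketch` is hypothetical, and `recoverTruncatedJson` is a simplified stand-in for the `partial-json` recovery step so that the example has no dependencies. The stand-in only closes an unterminated string and any unclosed brackets, so it recovers fewer inputs than a real partial parser would.

```typescript
const EMPTY_ARGUMENTS = "{}";

// Simplified stand-in for `partial-json` (assumption): close an open string
// and any unclosed brackets, then check whether the result parses.
function recoverTruncatedJson(raw: string): string | undefined {
  const closers: string[] = [];
  let inString = false;
  let escaped = false;

  for (let i = 0; i < raw.length; i++) {
    const ch = raw[i];
    if (inString) {
      if (escaped) escaped = false;
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"') inString = true;
    else if (ch === "{") closers.push("}");
    else if (ch === "[") closers.push("]");
    else if (ch === "}" || ch === "]") closers.pop();
  }

  const candidate = raw + (inString ? '"' : "") + closers.reverse().join("");
  try {
    // Re-stringify so the stored value is canonical JSON.
    return JSON.stringify(JSON.parse(candidate));
  } catch {
    return undefined;
  }
}

function sanitizeToolCallArgumentsSketch(raw: string): string {
  try {
    JSON.parse(raw);
    return raw; // valid JSON: return the original string unchanged
  } catch {
    // truncated: try to recover; otherwise fall back to an empty object
    return recoverTruncatedJson(raw) ?? EMPTY_ARGUMENTS;
  }
}

// Usage
sanitizeToolCallArgumentsSketch('{ "a": 1 }');              // '{ "a": 1 }'
sanitizeToolCallArgumentsSketch('{"query": "hello wor');    // '{"query":"hello wor"}'
sanitizeToolCallArgumentsSketch('{, "description": "x"}');  // '{}'
```

Because every output is itself valid JSON, running the function on its own output returns that output unchanged, which is the idempotence property the test plan lists.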

Fixes LOBE-7761
Related LOBE-7763 (research sub-issue documenting the frontend-vs-backend gap)

Test plan

New unit tests for the helper (9 cases): valid passthrough, whitespace preservation, LOBE-7761 exact shape, truncated stream recovery, idempotence, scalar JSON handling
ToolCallProcessor integration tests: poisoned tools[].arguments → "{}" on outbound; valid args pass through unchanged
AgentRuntime normalization test: old-format ToolsCalling[] with invalid JSON gets normalized to "{}" before reaching the executor
bun run type-check — zero new errors introduced (44 pre-existing, 44 after)
packages/utils tests: 9/9 pass
packages/context-engine/ToolCall tests: 25/25 pass (+2 new)
packages/agent-runtime/runtime tests: 50/50 pass (+1 new)
Manual: replay LOBE-7761 trace op_1776214671582_agt_Sg0mZYSZga5P_tpc_6OEij9a8ZxC2_Nv3Vme7z end-to-end against NVIDIA NIM once deployed

🤖 Generated with Claude Code

… history poisoning

When a model emits malformed JSON as `tool_calls[].arguments` (e.g. Qwen producing `{, "description": ...}`), the raw string was persisted to `messages.tools[].arguments` and replayed verbatim on every subsequent turn. Strict providers (NVIDIA NIM) validate the full history and 400 the whole request, terminating the op and wasting all accumulated tokens.

Add a shared `sanitizeToolCallArguments` helper in @lobechat/utils and wire it in at three layers so both new captures and already-poisoned DB history are safe:

- Server entry (RuntimeExecutors onToolsCalling): mirrors the frontend's `internal_transformToolCalls` pattern; prevents new poisoning.
- Outbound context build (ToolCallProcessor): last line of defense for historical messages that were persisted before this fix.
- Agent-runtime core (call_tools_batch normalization): covers the old-format ToolsCalling[] path.

Behavior: valid JSON passes through unchanged (prompt cache stable); partial-json recovers truncated streams; unrecoverable payloads fall back to "{}" so the tool_call structure survives and the model can replan on the next turn.

Fixes LOBE-7761

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

vercel · 2026-04-22T05:24:10Z

Project	Deployment	Actions	Updated (UTC)
lobehub	Ready	Preview, Comment	Apr 22, 2026 6:47am

codecov · 2026-04-22T05:29:08Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 66.93%. Comparing base (b02b727) to head (683f831).
⚠️ Report is 11 commits behind head on canary.

Additional details and impacted files

@@            Coverage Diff             @@
##           canary   #14033      +/-   ##
==========================================
+ Coverage   66.86%   66.93%   +0.07%     
==========================================
  Files        2100     2104       +4     
  Lines      179535   179854     +319     
  Branches    21192    21254      +62     
==========================================
+ Hits       120038   120392     +354     
+ Misses      59373    59338      -35     
  Partials      124      124

Flag	Coverage Δ
app	`59.60% <100.00%> (+0.11%)`	⬆️
database	`92.27% <ø> (ø)`
packages/agent-runtime	`79.72% <ø> (ø)`
packages/context-engine	`83.18% <100.00%> (+<0.01%)`	⬆️
packages/conversation-flow	`92.40% <ø> (ø)`
packages/file-loaders	`87.02% <ø> (ø)`
packages/memory-user-memory	`74.74% <ø> (ø)`
packages/model-bank	`99.86% <ø> (ø)`
packages/model-runtime	`84.23% <100.00%> (+0.01%)`	⬆️
packages/prompts	`69.08% <ø> (ø)`
packages/python-interpreter	`92.90% <ø> (ø)`
packages/ssrf-safe-fetch	`0.00% <ø> (ø)`
packages/utils	`88.40% <100.00%> (+0.44%)`	⬆️
packages/web-crawler	`88.66% <ø> (ø)`

Components	Coverage Δ
Store	`66.60% <95.28%> (+0.22%)`	⬆️
Services	`51.71% <ø> (ø)`
Server	`66.82% <81.66%> (+0.04%)`	⬆️
Libs	`52.57% <ø> (ø)`
Utils	`80.59% <ø> (ø)`

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c07f4b90cf

…anitizing

Sanitizing `tool_calls[].arguments` at capture (onToolsCalling) was too early: the normalized "{}" reached `BuiltinToolsExecutor.execute` and bypassed the `INVALID_JSON_ARGUMENTS` branch, so the model got a generic "missing required field" error instead of the precise "your JSON syntax was broken, fix it" feedback. That regressed the self-reflection signal.

Move sanitization to the persist boundaries only:

- DB write via `messageModel.update({tools: ...})`
- `state.messages` push for the assistant message's `tool_calls`

The execution path keeps the raw `arguments` string so the executor can still emit its `INVALID_JSON_ARGUMENTS` tool-result with the original malformed payload echoed back, which is exactly the frontend-symmetric self-reflection flow.

Add a regression test pinning the LOBE-7761 Qwen shape so future changes can't silently drop the feedback again.

Fixes LOBE-7761

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
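
The split described in this commit message, raw arguments on the execution path and sanitized arguments at the persist boundary, might look roughly like the sketch below. Only the `INVALID_JSON_ARGUMENTS` label comes from the commit message. The `ToolCall` and `ToolResult` shapes, the function names, and the error text are assumptions for illustration, and the sanitizer is reduced to the `"{}"` fallback.

```typescript
// Hypothetical shapes for illustration only.
interface ToolCall {
  id: string;
  name: string;
  arguments: string;
}

interface ToolResult {
  toolCallId: string;
  ok: boolean;
  content: string;
}

// Reduced sanitizer: valid JSON passes through, anything else becomes "{}".
const argumentsForHistory = (raw: string): string => {
  try {
    JSON.parse(raw);
    return raw;
  } catch {
    return "{}";
  }
};

// The executor sees the raw string, so it can tell the model exactly what
// was wrong with its JSON instead of reporting a missing field.
function executeTool(call: ToolCall): ToolResult {
  let args: unknown;
  try {
    args = JSON.parse(call.arguments);
  } catch {
    return {
      toolCallId: call.id,
      ok: false,
      content: `INVALID_JSON_ARGUMENTS: could not parse ${call.arguments}`,
    };
  }
  return {
    toolCallId: call.id,
    ok: true,
    content: `ran ${call.name} with ${JSON.stringify(args)}`,
  };
}

function handleToolCall(call: ToolCall): { persisted: ToolCall; result: ToolResult } {
  const result = executeTool(call); // execution path: raw arguments
  const persisted = { ...call, arguments: argumentsForHistory(call.arguments) }; // persist boundary: sanitized copy
  return { persisted, result };
}
```

With this ordering, the malformed payload is echoed back to the model once in the tool result, while the copy stored in history always parses.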

…id undeclared @lobechat/utils dep

Review flagged that `runtime.ts` imported `sanitizeToolCallArguments` from `@lobechat/utils` while `agent-runtime/package.json` doesn't list utils as a runtime dependency. In strict/hermetic installs this resolves to MODULE_NOT_FOUND before the runtime can start.

Rather than add a new dep just for a belt-and-suspenders path, drop the sanitize on the old-format `call_tools_batch` normalization. The actual LOBE-7761 bug is server-side history poisoning; that's fully covered by:

- RuntimeExecutors persist-boundary sanitize (DB write + state.messages)
- context-engine ToolCallProcessor outbound sanitize (handles any DB history that was persisted before this fix)

Old-format agents in agent-runtime don't persist or replay to providers on their own. Sanitization is the consuming application's responsibility and can live closer to its persistence layer. Drops the dep-cycle-free path.

Related LOBE-7761

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The assistant→Anthropic conversion was swallowing `JSON.parse` errors silently and falling back to empty `input: {}`. Combined with the LOBE-7761 fix, bad arguments should always be sanitized upstream in context-engine, so hitting this catch means something bypassed the defense and we're about to send a tool_use with empty input to Claude. That's worth knowing about.

Match the `console.error('parse tool call arguments error:', ...)` pattern already used in openaiCompatibleFactory so logs are greppable.

Related LOBE-7761

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
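
A minimal sketch of the logging change described in this commit message follows. The log prefix is the one quoted there. The function name `toToolUseBlock`, the input shape, and the injectable `log` parameter (added here only so the behavior can be observed) are assumptions and do not reflect the actual conversion code.

```typescript
// Hypothetical input shape, modeled on an OpenAI-style assistant tool call.
interface AssistantToolCall {
  id: string;
  function: { name: string; arguments: string };
}

interface ToolUseBlock {
  type: "tool_use";
  id: string;
  name: string;
  input: unknown;
}

function toToolUseBlock(
  call: AssistantToolCall,
  log: (...args: unknown[]) => void = console.error,
): ToolUseBlock {
  let input: unknown = {};
  try {
    input = JSON.parse(call.function.arguments);
  } catch (error) {
    // Keep the empty-input fallback, but make the bypass visible in logs.
    log("parse tool call arguments error:", error);
  }
  return { type: "tool_use", id: call.id, name: call.function.name, input };
}
```

The fallback value is unchanged from the silent version. The only difference is that reaching the catch now leaves a greppable trace.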

@hardy

# 🚀 LobeHub v2.1.53 (20260427)

**Release Date:** April 27, 2026
**Since v2.1.52:** 194 merged PRs · 17 contributors

> Introduce Heterogeneous Agent: Claude Code and Codex run as first-class desktop runtimes, paired with a new Agent Signal package, sharper desktop UX, and a wave of flagship model additions.

---

## ✨ Highlights

- **Introduce Heterogeneous Agent**: Claude Code and Codex run as first-class desktop agents, with subagent rendering, partial-message streaming, multi-turn resume, terminal error surfacing, rich tool inspectors, and runtime polish. (#14162, #13754, #14067, #14001, #13970, #13942)
- **Screen capture & Quick Chat tray**: New desktop screen capture overlay (macOS permission-gated) with Quick Chat tray and upload pipeline improvements; chat input auto-focuses on overlay mount. (#13818, #14097, #14105)
- **Desktop topic & tab UX**: Dedicated topic popup window with cross-window sync, Cmd+W/Cmd+T tab shortcuts, TabBar polish, recent working directories expanded to 20, and human approval notifications. (#13957, #13983, #13972, #14036, #14092)
- **Git workflow built-in**: One-click pull/push from the branch chip, ahead/behind badge, and submodule/worktree repo detection. (#14041, #13980, #13978)
- **Agent Signal package**: New `@lobechat/agent-signal` runtime for dynamic memory feedback signals, with OTel metrics and self-iteration in Lab. (#14157, #14170, #14159, #14169, #14187)
- **New models**: Claude Opus 4.7 with `xhigh` effort tier, GPT-5.5, DeepSeek V4 Flash/Pro with reasoning slider, Kimi K2.6, MiMo-V2.5/Pro, gpt-image-2, Qwen3.6 Flash/Plus, and Pixverse-c1. (#13903, #14147, #14114, #14004, #14089, #14039, #13923)
- **New providers**: OpenCode Zen, OpenCode Go, and Azure OpenAI Router runtime. (#13943, #14064, #13823)
- **Mobile settings overhaul**: Full settings menu and responsive profile layout for mobile. (#14019)

---

## 🏗️ Heterogeneous Agent

- Claude Code runtime, working-directory awareness, and sidebar polish. (#13970)
- CC subagent rendering with persistent streamed text; parallel-tool orphan fix. (#14001, #13968, #14024)
- Per-step usage persisted to each step assistant message. (#13964)
- Per-phase workflow expand defaults; full-expand toggle with three-level expansion. (#14171, #13906)
- Hetero-mode actions bar; tool inspector polish. (#13963, #14034, #14030)
- Codex desktop integration with rich tool rendering and devtools preview. (#14067, #14100)
- Codex terminal error surfacing and CLI output tracing. (#14166)
- Tighten `isCanUseVision` default and add aggregator fallback. (#14172)
- Persist `ccSessionId` in topic metadata for CC multi-turn resume. (#13902)
- CC account card, topic filter, and integration polish. (#13955, #13942, #13950)
- Token-level deltas streamed via `--include-partial-messages`. (#13929)

---

## 🧠 Agent Signal & Self-Iteration

- New `@lobechat/agent-signal` package with dynamic feedback signals. (#14157)
- AgentSignalRuntime wired through agent-tracing and observability-otel metrics. (#14170, #14159)
- Self-iteration feature flag added to Lab; front-side flag check. (#14169, #14186)
- Signal policy for receiving memory feedback dynamically. (#14187)

---

## 💬 Conversation

- Queue follow-up sends during running CC turns. (#14179)
- Persist per-topic chat scroll position; pin user message + fold long messages. (#14191, #14056)
- Inline resend when editing last user message. (#14080)
- Disable first-block markdown streaming to prevent flicker. (#14193, #13904)
- Prevent Markdown stream replay when vlist remounts streaming items. (#14086)
- Stop repinning after manual scroll; unify scroll-to-user + spacer hooks. (#14099, #14132)

---

## 📱 Platforms & Integrations

### Desktop / Electron

- Screen capture overlay, Quick Chat tray, and upload pipeline improvements. (#13818)
- macOS permission gate for screen capture; auto-focus chat panel input. (#14097, #14105)
- Dedicated topic popup window with cross-window sync. (#13957)
- TabBar polish: `+` button for new topic, dark theme blend, close icon by default. (#13972, #14203, #13973)
- Recent working directories expanded from 5 to 20; submodule/worktree repo detection. (#14036, #13978)
- Cmd+W / Cmd+T tab shortcuts and global shortcut consolidation. (#13983, #13880)
- Linux icon configuration; human approval desktop notifications. (#14042, #14092)

### Git Workflow

- One-click pull/push from branch chip; ahead/behind badge with refactored GitCtr. (#14041, #13980)

### Mobile

- Full settings menu and responsive profile layout. (#14019)
- Agent route added to mobile router; mobile agent topic route registered. (#14103, #14158)
- Session list skeleton row layout corrected. (#14040)

### Bot / Messaging

- DM strategy support; bot emoji and markdown render optimization. (#14201, #14091, #14140)
- Slack webhook fix; bot platform setup guide reference. (#14052, #14121)

---

## 🤖 Models & Providers

### New models

- **Claude Opus 4.7** with `xhigh` effort tier; strip temperature/top_p. (#13903, #13909)
- **GPT-5.5**. (#14147)
- **DeepSeek V4** Flash/Pro cards with reasoning slider; cache-hit and Pro discount pricing. (#14114, #14209, #14196, #14131)
- **Kimi K2.6** model with LobeHub-hosted card. (#14004, #14006)
- **MiMo-V2.5 / V2.5-Pro**. (#14089)
- **gpt-image-2**, **Qwen3.6 Flash/Plus**, **Pixverse-c1**. (#14039, #13923)

### New providers

- **OpenCode Zen** and **OpenCode Go** with env-var support. (#13943, #14064)
- **Azure OpenAI Router** runtime support. (#13823)
- Model alias mapping for image and video runtimes. (#13896)
- Seedance video models migrated to Dreamina. (#14144)

### Runtime reliability

- Sanitize invalid tool_call arguments to unbreak strict providers. (#14033)
- Tolerate null `function.name` in streaming tool_call deltas. (#14139)
- Preserve Gemini 3 `thoughtSignature` in `call_tools_batch` normalization. (#14032)
- Downgrade `image_url` parts when target model lacks vision. (#14029)
- Preserve Cloudflare provider error context. (#14136)
- Use `safety_identifier` for OpenAI Responses API. (#14148)
- Unwrap underlying PG error in `formatErrorEventData`. (#14038)

---

## 🖥️ User Experience

- **Onboarding**: Preset agent naming suggestions, structured hunk ops for `updateDocument`, persona analytics snapshot, footer promotion pipeline, wrap-up button. (#13931, #13989, #13930, #13853, #13934)
- **Document workflow**: Agent documents promoted as primary workspace panel; history management and compare workflow; web-crawl docs associated with agent documents. (#13924, #13725, #13893)
- **cmdk**: Agent identity surfaced on topic search results; topic/message search scoped to current agent. (#14204, #13960)
- **Floating chat panel** and workspace improvements. (#13887)
- **Topic completion status** with dropdown action and filter. (#14005)

---

## 🔧 Tooling

- Redis-backed feature flag provider for runtime config. (#14098)
- Vite upgraded to 8.0.0 with Rolldown strict execution order. (#12720, #14058)
- `@lobechat/model-bank` automated npm release with provenance. (#14015, #14017, #14018)
- Skill activation fallback when `activateTools` cannot find identifier. (#14010)
- Cron tool: timezone and existing jobs injected into system prompt; clarified `lobe-gtd` and `lobe-cron` descriptions. (#14012, #14013)

---

## 🔒 Security & Reliability

- **Security:** uuid bumped to v14 (advisory). (#14083)
- **Security:** validate avatar URL and scope old-avatar deletion to owner. (#13982)
- **Security:** clear OIDC sessions on better-auth signout; return 401 (not 500) for expired OIDC JWT. (#13916, #14014)
- **Reliability:** scope pending-approval check to current assistant turn. (#14182)
- **Reliability:** sanitize heterogeneous-agent attachment cache filenames. (#13937)
- **Reliability:** reduce subagent task status error noise. (#14026)

---

## 👥 Contributors

Huge thanks to **17 contributors** who shipped **194 merged PRs** this week.

@hardy · @shaun0927 · @hezhijie0327 · @sxjeru · @arvinxx · @Innei · @tjx666 · @lijian · @neko · @rdmclin2 · @AmAzing129 · @sudongyuer · @CanisMinor · @rivertwilight

Plus @lobehubbot and renovate[bot] for maintenance.

---

**Full Changelog**: v2.1.52...v2.1.53

sourcery-ai Bot reviewed Apr 22, 2026

chatgpt-codex-connector Bot reviewed Apr 22, 2026

Comment thread packages/agent-runtime/src/core/runtime.ts Outdated

vercel Bot deployed to Preview April 22, 2026 05:42 View deployment

vercel Bot deployed to Preview April 22, 2026 06:23 View deployment

vercel Bot deployed to Preview April 22, 2026 06:47 View deployment

arvinxx merged commit 6d339d6 into canary Apr 22, 2026
35 checks passed

arvinxx deleted the fix/lobe-7761-agent-gateway-invalid-tool-call-args branch April 22, 2026 08:09

obsidianstudiosX mentioned this pull request Apr 23, 2026

chore: upstream sync 2026-04-23 — 18 commits from lobehub/canary obsidianstudiosX/pantheon#2

Open

arvinxx mentioned this pull request Apr 27, 2026

🚀 release: 20260427 #14217

Merged

arvinxx merged 4 commits into canary from fix/lobe-7761-agent-gateway-invalid-tool-call-args
