🐛 fix(agent-runtime): capture Gemini multimodal content_part/reasoning_part output #15535
Merged
Codecov Report ❌

Coverage diff (canary vs. #15535):

| | canary | #15535 | +/- |
|---|---|---|---|
| Coverage | 70.71% | 70.71% | |
| Files | 3285 | 3285 | |
| Lines | 324605 | 324692 | +87 |
| Branches | 34484 | 34501 | +17 |
| Hits | 229549 | 229621 | +72 |
| Misses | 94873 | 94888 | +15 |
| Partials | 183 | 183 | |
…g_part output

Gemini 2.5+/3 thinking streams deliver assistant text and reasoning as content_part/reasoning_part events instead of plain text/reasoning. The runtime registered no onContentPart/onReasoningPart handlers, so the text was silently dropped: onCompletion still reported usage tokens, the empty-completion guard saw outputTokens > 0, and the turn finalized to a blank `done` (lost in DB, client stream and trace alike).

Add the two handlers, mirroring onText/onThinking for text parts so streaming, persistence and tracing all capture the content. Image parts are uploaded to object storage and serialized as multimodal content (text + image URLs, in order), never persisting raw base64.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
b5ec8a5 to a59f248
This was referenced Jun 9, 2026
arvinxx added a commit that referenced this pull request Jun 10, 2026
# 🚀 LobeHub Release (20260610)

**Release Date:** June 10, 2026
**Since v2.2.2:** 131 merged PRs · 13 contributors

> This weekly release strengthens agent collaboration across cloud, desktop, CLI, and workspace flows, with steadier runtime behavior and a broader foundation for workspace-scoped data.

---

## ✨ Highlights

- **Agent execution across devices**: Unifies per-device working directories, project skill discovery, and sub-agent suspend/resume behavior across server, QStash, and device RPC flows. (#15543, #15566, #15481, #15620, #15591)
- **Connector and sandbox platform**: Expands connector permissions, custom OAuth MCP connector onboarding, sandbox provider support, and user-uploaded file sync into cloud sandbox runs. (#15463, #15546, #15184, #15550)
- **Desktop and CLI reliability**: Fixes desktop cold-start, auto-update, Windows build, CLI skill discovery, and `lh connect` agent dispatch paths. (#15547, #15525, #15527, #15562, #15632, #15634)
- **Pages and sharing**: Refreshes topic sharing, improves Page Editor layout behavior, and routes Page Agent tool execution through the server-side editor path. (#15581, #15556, #15588, #15023, #15610)
- **Model availability and provider updates**: Adds user-scoped LobeHub model availability, Claude Fable 5, Qwen thinking preservation, and MiniMax M3 updates. (#15590, #15639, #13494, #15376)

---

## 🏗️ Core Product & Architecture

### Agent Runtime & Heterogeneous Agents

- Improves sub-agent lifecycle handling, including async suspend/resume, queue-mode QStash resume delivery, and blocking nested sub-agent calls. (#15481, #15620, #15575)
- Stabilizes heterogeneous agent ingestion and streaming with raw stream dumps, per-turn usage, image forwarding on regenerate, and duplicate-text fixes. (#15602, #15577, #15592, #15585)
- Adds execution-device and working-directory controls across device RPC, legacy defaults, and remote-spawned Claude Code sessions. (#15543, #15566, #15591, #15572)
- Improves runtime diagnostics and compatibility, including Gemini multimodal output capture, abort stream semantics, and trace quality analysis. (#15535, #13677, #15508)

---

## 📱 Platforms, Integrations & UX

### Connectors, Sandbox & Tools

- Ships API-level connector tool permissions, custom OAuth MCP connector onboarding, and connector-first runtime execution. (#15463, #15546)
- Adds sandbox provider support, cloud sandbox file sync, and safer external URL file input handling with SSRF validation. (#15184, #15550, #12657)
- Improves tool visibility and execution with pinned app-fixed tools, ANSI output rendering, gateway-tunneled MCP calls, and automatic headless tool runs. (#15509, #15516, #15469, #15492)

### Desktop, CLI & Web UX

- Restores desktop startup and reload behavior, preserves IPC error causes, and keeps the tab bar new-tab action visible across routes. (#15547, #15597, #15638)
- Fixes desktop update and build stability for browser quit guards, macOS update signing, and Windows Visual Studio detection. (#15525, #15527, #15562)
- Shows the plan-limit upgrade UI on desktop builds. (#15628)
- Adds the Agent Run delivery checker and fixes CLI device dispatch plus skill list/search output. (#15489, #15634, #15632)
- Refreshes onboarding, auth source preservation, topic UI states, referral/Fable campaign copy, and chat-input control bar behavior. (#15629, #15544, #15573, #15614, #15616, #15617, #15622, #15643)

---

## 🔒 Security, Reliability & Rollout Notes

- External URL file input now includes SSRF validation for safer Google file handling. (#12657)
- Database workspace-scope migrations are part of this release; self-hosted operators should run the normal migration path before serving the updated app. (#15446, #15465, #15468, #15472)
- The release branch was re-cut from `canary` and includes the latest `main` release-version commit so `v2.2.2` is the verified compare base.

---

## 👥 Contributors

@ONLY-yours, @sxjeru, @hardy-one, @xujingli, @hezhijie0327, @Coooolfan, @arvinxx, @tjx666, @Innei, @rivertwilight, @rdmclin2, @cy948, @AmAzing129

**Full Changelog**: v2.2.2...release/weekly-20260610-recut-3
💻 Change Type
🔗 Related Issue
🔀 Description of Change
Gemini 2.5+/3 thinking streams deliver assistant text and reasoning as `content_part`/`reasoning_part` events instead of plain `text`/`reasoning` (driven by `thought` parts / `thoughtSignature` in the Google stream transformer).

The agent runtime registered no `onContentPart`/`onReasoningPart` handlers, so the text was silently dropped: `onCompletion` still reported usage tokens, the empty-completion guard saw `outputTokens > 0`, and the turn finalized to a blank `done`. The answer was lost in the DB message, the client stream, and the trace snapshot alike.
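
The failure mode described above can be illustrated with a minimal sketch. Everything here is a simplified, hypothetical model: `runTurn`, the event shapes, and the `guardTripped` check are assumptions made for illustration, and only the event and handler names (`content_part`, `onText`, `onContentPart`, `onCompletion`) come from the PR text.

```typescript
// Hypothetical, simplified stream events. The real transformer output has more fields.
type StreamEvent =
  | { type: 'text'; text: string }
  | { type: 'content_part'; text: string }
  | { type: 'usage'; outputTokens: number };

interface Handlers {
  onText?: (text: string) => void;
  onContentPart?: (text: string) => void;
  onCompletion?: (usage: { outputTokens: number }) => void;
}

interface TurnResult {
  content: string;
  outputTokens: number;
  guardTripped: boolean;
}

// Runs one turn; `withContentPartHandler` toggles the handler that was missing before the fix.
function runTurn(events: StreamEvent[], withContentPartHandler: boolean): TurnResult {
  let content = '';
  let outputTokens = 0;

  const handlers: Handlers = {
    onText: (t) => { content += t; },
    onCompletion: (u) => { outputTokens = u.outputTokens; },
  };
  if (withContentPartHandler) {
    handlers.onContentPart = (t) => { content += t; };
  }

  for (const e of events) {
    switch (e.type) {
      case 'text':
        handlers.onText?.(e.text);
        break;
      case 'content_part':
        // With no handler registered, optional chaining makes this a silent no-op.
        handlers.onContentPart?.(e.text);
        break;
      case 'usage':
        handlers.onCompletion?.({ outputTokens: e.outputTokens });
        break;
    }
  }

  // Assumed guard: a turn counts as an empty completion only if no tokens were reported.
  const guardTripped = content === '' && outputTokens === 0;
  return { content, outputTokens, guardTripped };
}

const geminiStream: StreamEvent[] = [
  { type: 'content_part', text: 'Hello' },
  { type: 'content_part', text: ' world' },
  { type: 'usage', outputTokens: 2 },
];

const before = runTurn(geminiStream, false); // blank content, tokens reported, guard not tripped
const after = runTurn(geminiStream, true);   // content captured
console.log(before, after);
```

In this model the pre-fix run ends with empty content but a non-zero token count, so a guard keyed on `outputTokens` never fires and the blank turn finalizes normally.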
This adds the two missing handlers:

- Text parts mirror `onText`/`onThinking` (accumulate + buffer + publish), so streaming, persistence and tracing all capture the content again.
- Image parts are uploaded via `FileService.uploadBase64` and the multimodal content is serialized with the resulting S3 URLs (text + images, in order); raw base64 is never persisted. Uploads run concurrently with the stream and are awaited before the message is finalized.
- `thoughtSignature` is intentionally not persisted: the existing magic-bypass token (`skip_thought_signature_validator`) keeps handling multi-turn replay, so `contextBuilders/google.ts` is untouched.

🧪 How to Test
Added 4 cases to `RuntimeExecutors.test.ts` (full file: 109 passed):

- `content_part` is captured (regression: was blank)
- `reasoning_part` is captured
- `content_part` text chunks coalesce
- `content_part` images upload to storage and serialize as URLs, asserting the persisted content contains no raw base64 and `metadata.isMultimodal: true`

`bun run type-check` passes. Logic-level (mocked `FileService` + stream callbacks); a live Gemini-3 multi-turn pass is still recommended as the final E2E check.
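
The image-part and text-coalescing behaviors covered by these cases can be sketched roughly as follows. This is not the code from the PR: `MultimodalAccumulator`, the part shapes, the `Uploader` type, the JSON-serialized `content`, and the example URL are all hypothetical, and the real `FileService.uploadBase64` signature is not shown in the source, so a stand-in async function takes its place.

```typescript
// Hypothetical part shapes for illustration only.
type IncomingPart =
  | { type: 'text'; text: string }
  | { type: 'image'; base64: string };

type PersistedPart =
  | { type: 'text'; text: string }
  | { type: 'image'; url: string };

// Stand-in for an upload call such as FileService.uploadBase64 (real signature unknown).
type Uploader = (base64: string) => Promise<string>;

type Slot =
  | { type: 'text'; text: string }
  | { type: 'image'; url: Promise<string> };

class MultimodalAccumulator {
  // Slots keep stream order; image slots hold a pending upload promise.
  private slots: Slot[] = [];
  private upload: Uploader;

  constructor(upload: Uploader) {
    this.upload = upload;
  }

  push(part: IncomingPart): void {
    if (part.type === 'text') {
      const last = this.slots[this.slots.length - 1];
      if (last && last.type === 'text') {
        last.text += part.text; // coalesce adjacent text chunks
      } else {
        this.slots.push({ type: 'text', text: part.text });
      }
    } else {
      // Start the upload immediately without awaiting, so the stream is not blocked.
      this.slots.push({ type: 'image', url: this.upload(part.base64) });
    }
  }

  // Awaits all pending uploads, then serializes text and image URLs in stream order.
  async finalize(): Promise<{ content: string; metadata: { isMultimodal: boolean } }> {
    const parts = await Promise.all(
      this.slots.map(async (s): Promise<PersistedPart> =>
        s.type === 'text'
          ? { type: 'text', text: s.text }
          : { type: 'image', url: await s.url },
      ),
    );
    return {
      content: JSON.stringify(parts), // assumed serialization format
      metadata: { isMultimodal: parts.some((p) => p.type === 'image') },
    };
  }
}

// Usage with a fake uploader that never sees the network.
async function demo() {
  const fakeUpload: Uploader = async (b64) => `https://files.example.com/${b64.length}.png`;
  const acc = new MultimodalAccumulator(fakeUpload);
  acc.push({ type: 'text', text: 'Here is ' });
  acc.push({ type: 'text', text: 'the chart:' });
  acc.push({ type: 'image', base64: 'AAAABBBB' });
  acc.push({ type: 'text', text: ' Done.' });
  return acc.finalize();
}
```

Holding the upload promise in the slot, rather than awaiting inside `push`, is what lets uploads overlap with the stream while `finalize` still guarantees that only URLs, never base64 payloads, reach the persisted content.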
📝 Additional Information
Scope is the server agent runtime (`RuntimeExecutors`). No schema/migration changes.