🐛 fix(agent-runtime): capture Gemini multimodal content_part/reasoning_part output#15535

Merged
arvinxx merged 1 commit into canary from arvinxx/fix/gemini-content-part-capture on Jun 8, 2026

arvinxx (Member) commented Jun 8, 2026

💻 Change Type

  • 🐛 fix

🔗 Related Issue

🔀 Description of Change

Gemini 2.5+/3 thinking streams deliver assistant text and reasoning as
content_part / reasoning_part events instead of plain text / reasoning
(driven by thought parts / thoughtSignature in the Google stream transformer).
The agent runtime registered no onContentPart / onReasoningPart handlers, so
the text was silently dropped: onCompletion still reported usage tokens, the
empty-completion guard saw outputTokens > 0, and the turn finalized to a blank
done — the answer was lost in the DB message, the client stream, and the trace
snapshot alike.
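
The failure mode described above can be reproduced in a small sketch. Everything here is hypothetical and simplified: the event shapes, the `dispatch` helper, and the exact guard condition are assumptions for illustration, not the real runtime types. It only shows why an unregistered handler plus a token-based guard yields a blank turn.

```typescript
// Hypothetical, simplified event and callback shapes (not the real runtime types).
type StreamEvent =
  | { type: 'text'; text: string }
  | { type: 'content_part'; text: string }
  | { type: 'usage'; outputTokens: number };

interface StreamCallbacks {
  onText?: (text: string) => void;
  onContentPart?: (text: string) => void;
  onCompletion?: (usage: { outputTokens: number }) => void;
}

// Routes each event to its callback; an unregistered callback is a silent no-op.
function dispatch(events: StreamEvent[], callbacks: StreamCallbacks): void {
  for (const event of events) {
    switch (event.type) {
      case 'text':
        callbacks.onText?.(event.text);
        break;
      case 'content_part':
        callbacks.onContentPart?.(event.text);
        break;
      case 'usage':
        callbacks.onCompletion?.({ outputTokens: event.outputTokens });
        break;
    }
  }
}

function runTurn(events: StreamEvent[], handleContentPart: boolean) {
  let content = '';
  let outputTokens = 0;
  const callbacks: StreamCallbacks = {
    onText: (text) => { content += text; },
    onCompletion: (usage) => { outputTokens = usage.outputTokens; },
  };
  if (handleContentPart) {
    // The fix: accumulate content_part text the same way onText does.
    callbacks.onContentPart = (text) => { content += text; };
  }
  dispatch(events, callbacks);
  // Assumed guard: a turn only counts as empty when no output tokens were reported.
  const flaggedAsEmpty = content === '' && outputTokens === 0;
  return { content, flaggedAsEmpty };
}

// A Gemini-like stream: text arrives only as content_part events, followed by usage.
const geminiLikeStream: StreamEvent[] = [
  { type: 'content_part', text: 'Hello' },
  { type: 'content_part', text: ' world' },
  { type: 'usage', outputTokens: 5 },
];

console.log(runTurn(geminiLikeStream, false)); // blank content, yet not flagged as empty
console.log(runTurn(geminiLikeStream, true)); // content captured
```

Without the handler the content stays empty while the guard still passes, which matches the blank `done` described in this PR.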

This adds the two missing handlers:

  • Text parts mirror onText / onThinking (accumulate + buffer + publish), so
    streaming, persistence and tracing all capture the content again.
  • Image parts are uploaded to object storage via FileService.uploadBase64
    and the multimodal content is serialized with the resulting S3 URLs (text +
    images, in order) — raw base64 is never persisted. Uploads run concurrently with
    the stream and are awaited before the message is finalized.
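
A minimal sketch of the part handling described in the two bullets above, under stated assumptions: `ContentPartCollector`, `SerializedPart`, and the `UploadFn` signature are hypothetical names, and `UploadFn` merely stands in for `FileService.uploadBase64`, whose real signature is not shown in this PR. It illustrates coalescing text chunks, starting uploads without blocking the stream, and awaiting them at finalization so order is preserved.

```typescript
// Hypothetical stand-in for FileService.uploadBase64: takes base64, resolves to a URL.
type UploadFn = (base64: string) => Promise<string>;

type PendingPart =
  | { kind: 'text'; text: string }
  | { kind: 'image'; url: Promise<string> };

type SerializedPart =
  | { type: 'text'; text: string }
  | { type: 'image'; url: string };

class ContentPartCollector {
  private parts: PendingPart[] = [];

  constructor(private readonly upload: UploadFn) {}

  // Consecutive text chunks are merged into one text part.
  onTextPart(text: string): void {
    const last = this.parts[this.parts.length - 1];
    if (last && last.kind === 'text') {
      last.text += text;
    } else {
      this.parts.push({ kind: 'text', text });
    }
  }

  // The upload starts immediately and is not awaited here, so streaming continues.
  onImagePart(base64: string): void {
    this.parts.push({ kind: 'image', url: this.upload(base64) });
  }

  // Awaits all uploads and returns parts in arrival order; no base64 is retained.
  async finalize(): Promise<SerializedPart[]> {
    return Promise.all(
      this.parts.map(async (part): Promise<SerializedPart> =>
        part.kind === 'text'
          ? { type: 'text', text: part.text }
          : { type: 'image', url: await part.url },
      ),
    );
  }
}
```

Because each image slot holds the upload promise at its original position, `Promise.all` keeps text and images in stream order regardless of which upload finishes first. A production version would also need to decide how a failed upload should affect the turn; this sketch simply lets the rejection propagate from `finalize`.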

thoughtSignature is intentionally not persisted — the existing magic-bypass
token (skip_thought_signature_validator) keeps handling multi-turn replay, so
contextBuilders/google.ts is untouched.

🧪 How to Test

  • Added/updated tests

Added 4 cases to RuntimeExecutors.test.ts (full file: 109 passed):

  1. assistant text delivered via content_part is captured (regression: was blank)
  2. reasoning delivered via reasoning_part is captured
  3. consecutive content_part text chunks coalesce
  4. content_part images upload to storage and serialize as URLs, asserting the
    persisted content contains no raw base64 and metadata.isMultimodal: true
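
The "no raw base64" assertion in case 4 could be expressed with a helper along these lines. This is a heuristic sketch, not the check used in `RuntimeExecutors.test.ts`: `containsRawBase64` is a hypothetical name, and the 200-character threshold is an arbitrary assumption.

```typescript
// Hypothetical helper: heuristically detects raw base64 payloads in serialized content.
function containsRawBase64(serialized: string): boolean {
  // A data: URL with a base64 payload is the most direct signal.
  if (/data:[\w.+-]+\/[\w.+-]+;base64,/i.test(serialized)) return true;
  // Otherwise flag any long unbroken run of base64-alphabet characters
  // (threshold of 200 is an assumption chosen for illustration).
  return /[A-Za-z0-9+/]{200,}/.test(serialized);
}

// Example: a persisted message that references an uploaded file by URL only.
const persisted = JSON.stringify([
  { type: 'text', text: 'Here is the chart:' },
  { type: 'image', url: 'https://files.example/chart.png' },
]);
console.log(containsRawBase64(persisted));
```

A heuristic like this can misfire on very long URL paths, so a real test may prefer asserting that the known base64 fixture string is absent from the persisted content.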

bun run type-check passes. The new tests are logic-level only (mocked FileService and stream callbacks);
a live Gemini-3 multi-turn pass is still recommended as the final E2E check.

📝 Additional Information

Scope is the server agent runtime (RuntimeExecutors). No schema/migration changes.

dosubot Bot added the size:L label (This PR changes 100-499 lines, ignoring generated files.) Jun 8, 2026

@sourcery-ai sourcery-ai Bot left a comment

Sorry @arvinxx, you have reached your weekly rate limit of 500000 diff characters.

Please try again later or upgrade to continue using Sourcery

vercel Bot commented Jun 8, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| lobehub | Ready | Preview, Comment | Jun 8, 2026 2:17am |

codecov Bot commented Jun 8, 2026

Codecov Report

❌ Patch coverage is 82.02247% with 16 lines in your changes missing coverage. Please review.
✅ Project coverage is 70.71%. Comparing base (fc0daa7) to head (a59f248).
⚠️ Report is 1 commit behind head on canary.

Additional details and impacted files
@@           Coverage Diff           @@
##           canary   #15535   +/-   ##
=======================================
  Coverage   70.71%   70.71%           
=======================================
  Files        3285     3285           
  Lines      324605   324692   +87     
  Branches    34484    34501   +17     
=======================================
+ Hits       229549   229621   +72     
- Misses      94873    94888   +15     
  Partials      183      183           
| Flag | Coverage Δ |
| --- | --- |
| app | 61.49% <82.02%> (+<0.01%) ⬆️ |
| database | 92.49% <ø> (ø) |
| packages/agent-manager-runtime | 49.69% <ø> (ø) |
| packages/agent-runtime | 81.06% <ø> (ø) |
| packages/builtin-tool-lobe-agent | 18.52% <ø> (ø) |
| packages/context-engine | 84.19% <ø> (ø) |
| packages/conversation-flow | 91.29% <ø> (ø) |
| packages/device-gateway-client | 90.18% <ø> (ø) |
| packages/eval-dataset-parser | 95.15% <ø> (ø) |
| packages/eval-rubric | 76.11% <ø> (ø) |
| packages/fetch-sse | 85.57% <ø> (ø) |
| packages/file-loaders | 87.89% <ø> (ø) |
| packages/memory-user-memory | 74.99% <ø> (ø) |
| packages/model-bank | 99.99% <ø> (ø) |
| packages/model-runtime | 84.22% <ø> (ø) |
| packages/prompts | 72.51% <ø> (ø) |
| packages/python-interpreter | 92.90% <ø> (ø) |
| packages/ssrf-safe-fetch | 0.00% <ø> (ø) |
| packages/types | 35.17% <ø> (ø) |
| packages/utils | 84.98% <ø> (ø) |
| packages/web-crawler | 88.08% <ø> (ø) |

Flags with carried forward coverage won't be shown. Click here to find out more.

| Components | Coverage Δ |
| --- | --- |
| Store | 68.37% <ø> (ø) |
| Services | 54.90% <ø> (ø) |
| Server | 72.08% <82.02%> (+0.01%) ⬆️ |
| Libs | 54.45% <ø> (ø) |
| Utils | 81.93% <ø> (ø) |

…g_part output

Gemini 2.5+/3 thinking streams deliver assistant text and reasoning as
content_part/reasoning_part events instead of plain text/reasoning. The
runtime registered no onContentPart/onReasoningPart handlers, so the text
was silently dropped: onCompletion still reported usage tokens, the
empty-completion guard saw outputTokens > 0, and the turn finalized to a
blank `done` (lost in DB, client stream and trace alike).

Add the two handlers, mirroring onText/onThinking for text parts so
streaming, persistence and tracing all capture the content. Image parts
are uploaded to object storage and serialized as multimodal content
(text + image URLs, in order) — never persisting raw base64.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
arvinxx force-pushed the arvinxx/fix/gemini-content-part-capture branch from b5ec8a5 to a59f248 June 8, 2026 02:07
arvinxx merged commit 0ac53b4 into canary Jun 8, 2026
35 checks passed
arvinxx deleted the arvinxx/fix/gemini-content-part-capture branch June 8, 2026 06:55
arvinxx added a commit that referenced this pull request Jun 10, 2026
# 🚀 LobeHub Release (20260610)

**Release Date:** June 10, 2026  
**Since v2.2.2:** 131 merged PRs · 13 contributors

> This weekly release strengthens agent collaboration across cloud,
desktop, CLI, and workspace flows, with steadier runtime behavior and a
broader foundation for workspace-scoped data.

---

## ✨ Highlights

- **Agent execution across devices** — Unifies per-device working
directories, project skill discovery, and sub-agent suspend/resume
behavior across server, QStash, and device RPC flows. (#15543, #15566,
#15481, #15620, #15591)
- **Connector and sandbox platform** — Expands connector permissions,
custom OAuth MCP connector onboarding, sandbox provider support, and
user-uploaded file sync into cloud sandbox runs. (#15463, #15546,
#15184, #15550)
- **Desktop and CLI reliability** — Fixes desktop cold-start,
auto-update, Windows build, CLI skill discovery, and `lh connect` agent
dispatch paths. (#15547, #15525, #15527, #15562, #15632, #15634)
- **Pages and sharing** — Refreshes topic sharing, improves Page Editor
layout behavior, and routes Page Agent tool execution through the
server-side editor path. (#15581, #15556, #15588, #15023, #15610)
- **Model availability and provider updates** — Adds user-scoped LobeHub
model availability, Claude Fable 5, Qwen thinking preservation, and
MiniMax M3 updates. (#15590, #15639, #13494, #15376)

---

## 🏗️ Core Product & Architecture

### Agent Runtime & Heterogeneous Agents

- Improves sub-agent lifecycle handling, including async suspend/resume,
queue-mode QStash resume delivery, and blocking nested sub-agent calls.
(#15481, #15620, #15575)
- Stabilizes heterogeneous agent ingestion and streaming with raw stream
dumps, per-turn usage, image forwarding on regenerate, and
duplicate-text fixes. (#15602, #15577, #15592, #15585)
- Adds execution-device and working-directory controls across device
RPC, legacy defaults, and remote-spawned Claude Code sessions. (#15543,
#15566, #15591, #15572)
- Improves runtime diagnostics and compatibility, including Gemini
multimodal output capture, abort stream semantics, and trace quality
analysis. (#15535, #13677, #15508)

---

## 📱 Platforms, Integrations & UX

### Connectors, Sandbox & Tools

- Ships API-level connector tool permissions, custom OAuth MCP connector
onboarding, and connector-first runtime execution. (#15463, #15546)
- Adds sandbox provider support, cloud sandbox file sync, and safer
external URL file input handling with SSRF validation. (#15184, #15550,
#12657)
- Improves tool visibility and execution with pinned app-fixed tools,
ANSI output rendering, gateway-tunneled MCP calls, and automatic
headless tool runs. (#15509, #15516, #15469, #15492)

### Desktop, CLI & Web UX

- Restores desktop startup and reload behavior, preserves IPC error
causes, and keeps the tab bar new-tab action visible across routes.
(#15547, #15597, #15638)
- Fixes desktop update and build stability for browser quit guards,
macOS update signing, and Windows Visual Studio detection. (#15525,
#15527, #15562)
- Shows the plan-limit upgrade UI on desktop builds. (#15628)
- Adds the Agent Run delivery checker and fixes CLI device dispatch plus
skill list/search output. (#15489, #15634, #15632)
- Refreshes onboarding, auth source preservation, topic UI states,
referral/Fable campaign copy, and chat-input control bar behavior.
(#15629, #15544, #15573, #15614, #15616, #15617, #15622, #15643)

---

## 🔒 Security, Reliability & Rollout Notes

- External URL file input now includes SSRF validation for safer Google
file handling. (#12657)
- Database workspace-scope migrations are part of this release;
self-hosted operators should run the normal migration path before
serving the updated app. (#15446, #15465, #15468, #15472)
- The release branch was re-cut from `canary` and includes the latest
`main` release-version commit so `v2.2.2` is the verified compare base.

---

## 👥 Contributors

@ONLY-yours, @sxjeru, @hardy-one, @xujingli, @hezhijie0327, @Coooolfan,
@arvinxx, @tjx666, @Innei, @rivertwilight, @rdmclin2, @cy948,
@AmAzing129

**Full Changelog**:
v2.2.2...release/weekly-20260610-recut-3
Labels

provider:gemini size:L This PR changes 100-499 lines, ignoring generated files.
