🐛 fix(context-engine): downgrade image_url parts when target model lacks vision #14029
…cks vision Historical messages persisted as multimodal parts (content is an array with `image_url` entries, or assistant messages with `metadata.isMultimodal`) bypassed the legacy `imageList` vision check and got forwarded verbatim to the provider. DeepSeek rejects the `image_url` variant outright, so any topic containing an image broke the moment the user switched to a non-vision model. Replace image parts with a textual placeholder so the conversation still carries the signal that an image was sent, without including content non-vision providers reject. Applies uniformly across user array content, assistant multimodal content, and legacy `imageList` paths. Fixes LOBE-7214. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Codecov Report
❌ Patch coverage is …
Additional details and impacted files
@@ Coverage Diff @@
## canary #14029 +/- ##
==========================================
+ Coverage 66.86% 66.92% +0.06%
==========================================
Files 2100 2103 +3
Lines 179535 179831 +296
Branches 21192 18576 -2616
==========================================
+ Hits 120038 120346 +308
+ Misses 59373 59362 -11
+ Partials 124 123 -1
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b3792d50ae
const needsArrayRewrite = contentIsArray && arrayImageUrlCount > 0 && !canUseVision;

// Fast path: nothing to transform — plain text content passes through.
if (!hasImages && !hasVideos && !hasFiles && !needsArrayRewrite) {
Preserve video parts when rewriting `image_url` array content
When `needsArrayRewrite` is true for non-vision models, user content arrays are forced through the rewrite path even if they already contain other multimodal parts. That path only rebuilds text plus image placeholders, so existing `video_url` parts are dropped from history. This is a regression from the previous fast-path behavior (which preserved the original array) and causes mixed image+video turns to silently lose video context when switching to a non-vision model.
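One way to address the concern above is to map over the original array and swap out only the image parts, so anything else keeps its place. The following is a minimal sketch under stated assumptions, not the PR's actual code: the `ContentPart` type and the `downgradeImageParts` helper are hypothetical, and the placeholder wording is borrowed from the commit message in this PR.

```typescript
// Hypothetical part shapes, loosely modeled on the array content described in this PR.
type ContentPart =
  | { type: 'text'; text: string }
  | { type: 'image_url'; image_url: { url: string } }
  | { type: 'video_url'; video_url: { url: string } };

// Placeholder wording taken from the commit message; the constant name matches the PR description.
const VISION_DOWNGRADE_PLACEHOLDER = '[image omitted: not supported by this model]';

// Replace only image parts. Text and video_url parts are returned untouched,
// in their original positions, so a mixed image+video turn keeps its video.
function downgradeImageParts(parts: ContentPart[], canUseVision: boolean): ContentPart[] {
  if (canUseVision) return parts; // nothing to rewrite for vision-capable models
  return parts.map(
    (part): ContentPart =>
      part.type === 'image_url' ? { type: 'text', text: VISION_DOWNGRADE_PLACEHOLDER } : part,
  );
}

// Example: a mixed turn sent to a non-vision model.
const mixedTurn: ContentPart[] = [
  { type: 'text', text: 'Compare these' },
  { type: 'image_url', image_url: { url: 'https://example.com/a.png' } },
  { type: 'video_url', video_url: { url: 'https://example.com/b.mp4' } },
];
const downgraded = downgradeImageParts(mixedTurn, false);
```

Because the rewrite is a one-to-one map rather than a rebuild from text plus placeholders, the array length and part order stay the same, and only the `image_url` entry changes type.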
Two tests in the app suite asserted the silent-drop behavior the MessageContentProcessor used to exhibit for `imageList` + vision-off:

- src/services/chat/chat.test.ts
- src/services/chat/mecha/contextEngineering.test.ts

After this PR the processor appends the downgrade placeholder instead of silently dropping the image, so the expected content grows by one line.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
image_url parts when target model lacks vision
…STEM CONTEXT

The placeholder stands in for an image the user actually sent, so it should sit adjacent to the user text rather than trailing after the SYSTEM CONTEXT metadata block. Reorder so the payload reads:

    <user text>
    [image omitted: not supported by this model]
    <!-- SYSTEM CONTEXT ... -->

Keeps the conversational flow intact and matches the semantic position the image occupied in the original message.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
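The ordering described in this commit can be sketched as a small string-assembly helper. This is an illustration only: `buildDowngradedText` is a hypothetical name, and joining segments with newlines and emitting one placeholder per image are assumptions rather than confirmed behavior of the processor.

```typescript
// Placeholder wording taken from the commit message above.
const VISION_DOWNGRADE_PLACEHOLDER = '[image omitted: not supported by this model]';

// Hypothetical helper: user text first, then the image placeholder(s),
// and only then the SYSTEM CONTEXT block, if there is one.
function buildDowngradedText(userText: string, imageCount: number, systemContext?: string): string {
  // Assumption: one placeholder line per omitted image.
  const placeholders = Array.from({ length: imageCount }, () => VISION_DOWNGRADE_PLACEHOLDER);
  return [userText, ...placeholders, systemContext]
    .filter((segment): segment is string => Boolean(segment)) // skip empty or missing segments
    .join('\n');
}

const payload = buildDowngradedText('What is in this picture?', 1, '<!-- SYSTEM CONTEXT ... -->');
```

With these inputs the placeholder lands on the line directly after the user text and before the SYSTEM CONTEXT comment, which mirrors the position the image held in the original message.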
# 🚀 LobeHub v2.1.53 (20260427)

**Release Date:** April 27, 2026
**Since v2.1.52:** 194 merged PRs · 17 contributors

> Introduce Heterogeneous Agent: Claude Code and Codex run as first-class desktop runtimes, paired with a new Agent Signal package, sharper desktop UX, and a wave of flagship model additions.

---

## ✨ Highlights

- **Introduce Heterogeneous Agent**: Claude Code and Codex run as first-class desktop agents: subagent rendering, partial-message streaming, multi-turn resume, terminal error surfacing, rich tool inspectors, and runtime polish. (#14162, #13754, #14067, #14001, #13970, #13942)
- **Screen capture & Quick Chat tray**: New desktop screen capture overlay (macOS permission-gated) with Quick Chat tray and upload pipeline improvements; chat input auto-focuses on overlay mount. (#13818, #14097, #14105)
- **Desktop topic & tab UX**: Dedicated topic popup window with cross-window sync, Cmd+W/Cmd+T tab shortcuts, TabBar polish, recent working directories expanded to 20, and human approval notifications. (#13957, #13983, #13972, #14036, #14092)
- **Git workflow built-in**: One-click pull/push from the branch chip, ahead/behind badge, and submodule/worktree repo detection. (#14041, #13980, #13978)
- **Agent Signal package**: New `@lobechat/agent-signal` runtime for dynamic memory feedback signals, with OTel metrics and self-iteration in Lab. (#14157, #14170, #14159, #14169, #14187)
- **New models**: Claude Opus 4.7 with `xhigh` effort tier, GPT-5.5, DeepSeek V4 Flash/Pro with reasoning slider, Kimi K2.6, MiMo-V2.5/Pro, gpt-image-2, Qwen3.6 Flash/Plus, and Pixverse-c1. (#13903, #14147, #14114, #14004, #14089, #14039, #13923)
- **New providers**: OpenCode Zen, OpenCode Go, and Azure OpenAI Router runtime. (#13943, #14064, #13823)
- **Mobile settings overhaul**: Full settings menu and responsive profile layout for mobile. (#14019)

---

## 🏗️ Heterogeneous Agent

- Claude Code runtime, working-directory awareness, and sidebar polish. (#13970)
- CC subagent rendering with persistent streamed text; parallel-tool orphan fix. (#14001, #13968, #14024)
- Per-step usage persisted to each step assistant message. (#13964)
- Per-phase workflow expand defaults; full-expand toggle with three-level expansion. (#14171, #13906)
- Hetero-mode actions bar; tool inspector polish. (#13963, #14034, #14030)
- Codex desktop integration with rich tool rendering and devtools preview. (#14067, #14100)
- Codex terminal error surfacing and CLI output tracing. (#14166)
- Tighten `isCanUseVision` default and add aggregator fallback. (#14172)
- Persist `ccSessionId` in topic metadata for CC multi-turn resume. (#13902)
- CC account card, topic filter, and integration polish. (#13955, #13942, #13950)
- Token-level deltas streamed via `--include-partial-messages`. (#13929)

---

## 🧠 Agent Signal & Self-Iteration

- New `@lobechat/agent-signal` package with dynamic feedback signals. (#14157)
- AgentSignalRuntime wired through agent-tracing and observability-otel metrics. (#14170, #14159)
- Self-iteration feature flag added to Lab; front-side flag check. (#14169, #14186)
- Signal policy for receiving memory feedback dynamically. (#14187)

---

## 💬 Conversation

- Queue follow-up sends during running CC turns. (#14179)
- Persist per-topic chat scroll position; pin user message + fold long messages. (#14191, #14056)
- Inline resend when editing last user message. (#14080)
- Disable first-block markdown streaming to prevent flicker. (#14193, #13904)
- Prevent Markdown stream replay when vlist remounts streaming items. (#14086)
- Stop repinning after manual scroll; unify scroll-to-user + spacer hooks. (#14099, #14132)

---

## 📱 Platforms & Integrations

### Desktop / Electron

- Screen capture overlay, Quick Chat tray, and upload pipeline improvements. (#13818)
- macOS permission gate for screen capture; auto-focus chat panel input. (#14097, #14105)
- Dedicated topic popup window with cross-window sync. (#13957)
- TabBar polish: `+` button for new topic, dark theme blend, close icon by default. (#13972, #14203, #13973)
- Recent working directories expanded from 5 to 20; submodule/worktree repo detection. (#14036, #13978)
- Cmd+W / Cmd+T tab shortcuts and global shortcut consolidation. (#13983, #13880)
- Linux icon configuration; human approval desktop notifications. (#14042, #14092)

### Git Workflow

- One-click pull/push from branch chip; ahead/behind badge with refactored GitCtr. (#14041, #13980)

### Mobile

- Full settings menu and responsive profile layout. (#14019)
- Agent route added to mobile router; mobile agent topic route registered. (#14103, #14158)
- Session list skeleton row layout corrected. (#14040)

### Bot / Messaging

- DM strategy support; bot emoji and markdown render optimization. (#14201, #14091, #14140)
- Slack webhook fix; bot platform setup guide reference. (#14052, #14121)

---

## 🤖 Models & Providers

### New models

- **Claude Opus 4.7** with `xhigh` effort tier; strip temperature/top_p. (#13903, #13909)
- **GPT-5.5**. (#14147)
- **DeepSeek V4** Flash/Pro cards with reasoning slider; cache-hit and Pro discount pricing. (#14114, #14209, #14196, #14131)
- **Kimi K2.6** model with LobeHub-hosted card. (#14004, #14006)
- **MiMo-V2.5 / V2.5-Pro**. (#14089)
- **gpt-image-2**, **Qwen3.6 Flash/Plus**, **Pixverse-c1**. (#14039, #13923)

### New providers

- **OpenCode Zen** and **OpenCode Go** with env-var support. (#13943, #14064)
- **Azure OpenAI Router** runtime support. (#13823)
- Model alias mapping for image and video runtimes. (#13896)
- Seedance video models migrated to Dreamina. (#14144)

### Runtime reliability

- Sanitize invalid tool_call arguments to unbreak strict providers. (#14033)
- Tolerate null `function.name` in streaming tool_call deltas. (#14139)
- Preserve Gemini 3 `thoughtSignature` in `call_tools_batch` normalization. (#14032)
- Downgrade `image_url` parts when target model lacks vision. (#14029)
- Preserve Cloudflare provider error context. (#14136)
- Use `safety_identifier` for OpenAI Responses API. (#14148)
- Unwrap underlying PG error in `formatErrorEventData`. (#14038)

---

## 🖥️ User Experience

- **Onboarding**: Preset agent naming suggestions, structured hunk ops for `updateDocument`, persona analytics snapshot, footer promotion pipeline, wrap-up button. (#13931, #13989, #13930, #13853, #13934)
- **Document workflow**: Agent documents promoted as primary workspace panel; history management and compare workflow; web-crawl docs associated with agent documents. (#13924, #13725, #13893)
- **cmdk**: Agent identity surfaced on topic search results; topic/message search scoped to current agent. (#14204, #13960)
- **Floating chat panel** and workspace improvements. (#13887)
- **Topic completion status** with dropdown action and filter. (#14005)

---

## 🔧 Tooling

- Redis-backed feature flag provider for runtime config. (#14098)
- Vite upgraded to 8.0.0 with Rolldown strict execution order. (#12720, #14058)
- `@lobechat/model-bank` automated npm release with provenance. (#14015, #14017, #14018)
- Skill activation fallback when `activateTools` cannot find identifier. (#14010)
- Cron tool: timezone and existing jobs injected into system prompt; clarified `lobe-gtd` and `lobe-cron` descriptions. (#14012, #14013)

---

## 🔒 Security & Reliability

- **Security:** uuid bumped to v14 (advisory). (#14083)
- **Security:** validate avatar URL and scope old-avatar deletion to owner. (#13982)
- **Security:** clear OIDC sessions on better-auth signout; return 401 (not 500) for expired OIDC JWT. (#13916, #14014)
- **Reliability:** scope pending-approval check to current assistant turn. (#14182)
- **Reliability:** sanitize heterogeneous-agent attachment cache filenames. (#13937)
- **Reliability:** reduce subagent task status error noise. (#14026)

---

## 👥 Contributors

Huge thanks to **17 contributors** who shipped **194 merged PRs** this week.

@hardy · @shaun0927 · @hezhijie0327 · @sxjeru · @arvinxx · @Innei · @tjx666 · @lijian · @neko · @rdmclin2 · @AmAzing129 · @sudongyuer · @CanisMinor · @rivertwilight

Plus @lobehubbot and renovate[bot] for maintenance.

---

**Full Changelog**: v2.1.52...v2.1.53
💻 Change Type
🔗 Related Issue
Fixes LOBE-7214
🔀 Description of Change
Historical messages persisted in the multimodal parts form (content is an array with `image_url` entries, or assistant messages with `metadata.isMultimodal`) bypassed the legacy `imageList` vision check and got forwarded verbatim to the provider. DeepSeek rejects the `image_url` variant outright (`unknown variant 'image_url', expected 'text'`), so any topic containing an image broke the moment the user switched to a non-vision model.

This PR introduces a `VISION_DOWNGRADE_PLACEHOLDER` and applies it uniformly across every path that can emit an `image_url` part:

- `content` is already an array of parts (the LOBE-7214 trigger).
- `convertMessagePartsToContentParts` (both `metadata.isMultimodal` and `reasoning.isMultimodal`).
- `imageList` path: previously the image was silently dropped.
- `imageList` path: also switched from silent drop to placeholder, so behavior is consistent everywhere.

Replacing (rather than dropping) image parts keeps the conversational signal that an image existed without including content that non-vision providers reject.
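To illustrate why array-form history slipped past a check that only looked at `imageList`, here is a rough sketch of the two detection styles side by side. The `HistoricalMessage` shape and all three function names are hypothetical and chosen for illustration; they are not taken from the context-engine source.

```typescript
// Hypothetical message shape covering the legacy and the array-content forms.
interface HistoricalMessage {
  content: string | Array<{ type: string; [key: string]: unknown }>;
  imageList?: Array<{ url: string }>;
}

// Legacy-style check: only sees images attached through imageList.
function hasLegacyImages(message: HistoricalMessage): boolean {
  return (message.imageList?.length ?? 0) > 0;
}

// Broader check: also counts image parts already embedded in array content.
function countArrayImageParts(message: HistoricalMessage): number {
  if (!Array.isArray(message.content)) return 0;
  return message.content.filter((part) => part.type === 'image_url' || part.type === 'image')
    .length;
}

// A message needs the placeholder treatment when the target model lacks vision
// and either form carries an image.
function needsVisionDowngrade(message: HistoricalMessage, canUseVision: boolean): boolean {
  if (canUseVision) return false;
  return hasLegacyImages(message) || countArrayImageParts(message) > 0;
}

// A message persisted in array form, with no imageList at all.
const persisted: HistoricalMessage = {
  content: [
    { type: 'text', text: 'What is this?' },
    { type: 'image_url', image_url: { url: 'https://example.com/cat.png' } },
  ],
};
```

For `persisted`, the legacy-style check reports no images while the broader check flags the message, which is the gap the description above attributes to LOBE-7214.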
🧪 How to Test
All 763 context-engine tests pass (`bunx vitest run` in `packages/context-engine`). New cases cover:

- `imageList` + non-vision model (single + multi-image).
- `content` already an array with `image_url` parts + non-vision model (direct LOBE-7214 regression).
- `metadata.isMultimodal` + non-vision model → image parts replaced by placeholder text parts.
- `imageList` + non-vision model → placeholder appended to content.

📝 Additional Information
Only the `image_url` / `image` part types are downgraded. `video_url` parts are out of scope for this issue and left to the existing `isCanUseVideo` path.