✨ feat(desktop): screen capture overlay, Quick Chat tray, and upload pipeline improvements #13818
- Implemented ScreenCaptureManager to handle screen capture sessions.
- Added ScreenCaptureCtr for IPC methods related to screen capture.
- Created overlay.html and ScreenCaptureOverlay component for user interaction.
- Integrated window enumeration and capture logic using node-screenshots and get-windows.
- Updated menu options to include screen capture actions.
- Enhanced RendererUrlManager to support overlay routing.
- Introduced drag selection for capturing specific screen areas.
- Added necessary types and events for screen capture in electron-client-ipc.

Signed-off-by: Innei <tukon479@gmail.com>
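The drag selection mentioned in the commit above can be illustrated with a small sketch. This is not the PR's code: `rectFromDrag`, `isMeaningfulSelection`, and the 4 px threshold are hypothetical names and values, chosen only to show how a selection rectangle can be derived from two pointer positions regardless of drag direction.

```typescript
interface Point {
  x: number;
  y: number;
}

interface Rect {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Build a rectangle from the drag start and the current pointer position.
// Using min/abs makes the result independent of drag direction
// (for example, dragging up and to the left).
function rectFromDrag(start: Point, end: Point): Rect {
  return {
    x: Math.min(start.x, end.x),
    y: Math.min(start.y, end.y),
    width: Math.abs(end.x - start.x),
    height: Math.abs(end.y - start.y),
  };
}

// Treat very small drags as clicks rather than region selections.
// The default threshold is an assumption for illustration only.
function isMeaningfulSelection(rect: Rect, minSize: number = 4): boolean {
  return rect.width >= minSize && rect.height >= minSize;
}
```

A caller would typically compute the rectangle on every pointer move for the highlight and check `isMeaningfulSelection` on pointer up before starting a capture.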
Sorry @Innei, your pull request is larger than the review limit of 150000 diff characters
Codecov Report
@@ Coverage Diff @@
## canary #13818 +/- ##
==========================================
+ Coverage 66.99% 67.00% +0.01%
==========================================
Files 2105 2107 +2
Lines 179853 179914 +61
Branches 18608 18622 +14
==========================================
+ Hits 120488 120550 +62
+ Misses 59242 59241 -1
Partials 123 123
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: cee1054f98
    <option key={item.id} value={item.id}>
      {item.displayName ?? item.id}
Use provider-aware model option values in overlay selector
The model dropdown uses only item.id as both option key and value, but the overlay model list is provider-scoped and can contain the same model ID from multiple providers (for example common IDs reused across OpenAI-compatible providers). When IDs collide, the browser treats options as the same value and currentModel resolution becomes ambiguous, so users cannot reliably select the intended provider/model pair.
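One way to address this review comment is to encode the provider together with the model ID in each option value. The sketch below is an assumption about how that could look, not the fix that was merged: `ModelItem`, `toOptionValue`, `fromOptionValue`, and `findSelected` are hypothetical names.

```typescript
interface ModelItem {
  id: string;
  provider: string;
  displayName?: string;
}

// Encode provider and model id into a single option value.
// JSON encoding stays unambiguous even if either part contains "/" or ":".
const toOptionValue = (item: Pick<ModelItem, "id" | "provider">): string =>
  JSON.stringify([item.provider, item.id]);

// Decode an option value back into its provider/model pair.
const fromOptionValue = (value: string): { provider: string; id: string } => {
  const [provider, id] = JSON.parse(value) as [string, string];
  return { provider, id };
};

// Resolve the selected item by matching both provider and id.
const findSelected = (items: ModelItem[], value: string): ModelItem | undefined => {
  const { provider, id } = fromOptionValue(value);
  return items.find((m) => m.provider === provider && m.id === id);
};
```

With this shape, the `<option>` would use `toOptionValue(item)` for both `key` and `value`, so two providers exposing the same model ID no longer collide.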
- Replaced `join` imports with `path` imports for consistency across files.
- Enhanced error handling in various modules to include error causes for better debugging.
- Updated test files to reflect changes in variable naming and mock implementations.

Signed-off-by: Innei <tukon479@gmail.com>
The `quickComposer` hotkey is registered only on the Electron side (DESKTOP_GLOBAL_SHORTCUT_DEFAULTS + BrowserWindowsCtr.openQuickComposer); the renderer never referenced these i18n keys, so the entries were dead. `desktop.quickComposer` covers the app-level trigger.
…tion

Overlay submit used to await screenshot upload before router.push, blocking the main window for several seconds when the user was on an unrelated page (e.g. /settings). Now we navigate immediately and run upload in a background IIFE; MessageFromUrl waits on a new `uploadStatus` field before calling sendMessage, so the chat page mount and the upload proceed in parallel.

- Add `uploadStatus: 'uploading' | 'ready' | 'failed'` to PendingOverlayDispatch; canConsumePendingOverlayDispatch blocks while `'uploading'`.
- Store gains `markDispatchUploadComplete`; on failure it clears screenshotFileNames so the prompt still delivers.
- Dispatcher drops stale prev search params on push to prevent MessageFromUrl's message-param effect from double-firing.
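The gating described in this commit can be sketched as pure functions. The identifiers `PendingOverlayDispatch`, `uploadStatus`, `screenshotFileNames`, `canConsumePendingOverlayDispatch`, and `markDispatchUploadComplete` come from the commit message; the field set, signatures, and immutable-update style below are assumptions for illustration and may differ from the actual store.

```typescript
type UploadStatus = "uploading" | "ready" | "failed";

// Assumed minimal shape; the real type likely carries more fields.
interface PendingOverlayDispatch {
  prompt: string;
  screenshotFileNames: string[];
  uploadStatus: UploadStatus;
}

// The chat page may consume the dispatch once the upload has settled,
// whether it succeeded or failed. While uploading, it must wait.
function canConsumePendingOverlayDispatch(
  dispatch: PendingOverlayDispatch | undefined,
): boolean {
  return dispatch !== undefined && dispatch.uploadStatus !== "uploading";
}

// On success keep the screenshots; on failure clear them so that the
// prompt is still delivered without attachments.
function markDispatchUploadComplete(
  dispatch: PendingOverlayDispatch,
  succeeded: boolean,
): PendingOverlayDispatch {
  return succeeded
    ? { ...dispatch, uploadStatus: "ready" }
    : { ...dispatch, screenshotFileNames: [], uploadStatus: "failed" };
}
```

Treating `failed` as consumable is the design point: a broken upload degrades to a text-only message instead of blocking the send indefinitely.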
…-thumbnail status

Move uploads from post-submit to preview time, bypassing dataUrl round-trips:

- Main process assigns captureId at preview time and ships the PNG bytes as ArrayBuffer to the main renderer via `overlayUploadRequest`.
- Main renderer uploads through a dedicated pool (uploadWithProgress, no chatUploadFileList pollution); reports status back to the overlay through `overlayCaptureUploadStatus`.
- Overlay thumbnails render a spinner / error badge based on status; the send button stays grey until every capture resolves to `ready`.
- Submit now carries only captureIds; MessageFromUrl awaits the pool promises before sendMessage, removing the second upload pass.
- Carry overlay-selected modelId/provider into the agent config so the first message actually uses the user-chosen model (fixes the bug where switching the model on the overlay had no effect).
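The per-capture status tracking on the overlay side might look roughly like the sketch below. `CaptureStatusTracker` and its methods are hypothetical names; only `captureId` and the three status values are taken from the commit message. Whether an empty capture list should allow sending is also an assumption here.

```typescript
type CaptureStatus = "uploading" | "ready" | "failed";

class CaptureStatusTracker {
  private statuses = new Map<string, CaptureStatus>();

  // A capture starts uploading as soon as it is previewed.
  add(captureId: string): void {
    this.statuses.set(captureId, "uploading");
  }

  // Apply a status report from the main renderer; unknown ids are ignored,
  // for example when the user removed the thumbnail before the upload finished.
  update(captureId: string, status: CaptureStatus): void {
    if (this.statuses.has(captureId)) this.statuses.set(captureId, status);
  }

  remove(captureId: string): void {
    this.statuses.delete(captureId);
  }

  // The send button is enabled only when every tracked capture is ready.
  // With no captures this returns true (prompt-only send), which is an assumption.
  canSend(): boolean {
    let allReady = true;
    this.statuses.forEach((status) => {
      if (status !== "ready") allReady = false;
    });
    return allReady;
  }
}
```

In this sketch a failed capture keeps the send button disabled until the user removes it, which matches the "every capture resolves to `ready`" wording.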
Tray menu now exposes a "Quick Chat" action that opens (or focuses) a single-instance popup window at `/popup/agent/inbox`. Each fresh open starts with no active topic; the first message creates one through the normal agent flow.

- New `PopupAgentQuickPage` resolves the inbox slug via `builtinAgentSelectors.inboxAgentId` so `activeAgentId` points at the real entity in `agentMap` (fixes the stuck-loading / skeleton state from using the literal `'inbox'` slug).
- `BrowserManager.openQuickChatPopup` wraps `createMultiInstanceWindow` with a fixed `topicPopup_quick_inbox` uniqueId so repeat clicks focus rather than spawn.
- Wire the action into macOS / Windows / Linux tray menus and add the `tray.quickChat` i18n key.
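The focus-instead-of-spawn behaviour that a fixed uniqueId provides can be sketched independently of Electron. `SingleInstanceRegistry` and `WindowLike` are hypothetical; the interface only mirrors the `focus()` and `isDestroyed()` methods that Electron's `BrowserWindow` offers, and the actual `createMultiInstanceWindow` logic may differ.

```typescript
// Minimal window surface needed by the sketch.
interface WindowLike {
  focus(): void;
  isDestroyed(): boolean;
}

class SingleInstanceRegistry<W extends WindowLike> {
  private windows = new Map<string, W>();

  // Focus the live window registered under uniqueId, or create a new one.
  // A destroyed (closed) window is replaced, so reopening starts fresh.
  openOrFocus(uniqueId: string, create: () => W): W {
    const existing = this.windows.get(uniqueId);
    if (existing && !existing.isDestroyed()) {
      existing.focus();
      return existing;
    }
    const created = create();
    this.windows.set(uniqueId, created);
    return created;
  }
}
```

Calling `openOrFocus("topicPopup_quick_inbox", ...)` from the tray handler twice would then create the popup once and focus it on the second click.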
…support

- Updated `enumerateWindows` to accept an optional `displayScaleFactor` parameter for improved window geometry normalization on high-DPI displays.
- Refactored `normalizeWindowBounds` to handle scaling based on the provided scale factor, ensuring accurate window dimensions across different platforms.
- Adjusted tests in `WindowSourceService.test.ts` to validate the new scaling behavior for both Windows and macOS environments.
- Minor adjustments in `ScreenCaptureManager` to accommodate the updated window enumeration logic.
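As a rough illustration of scale-factor normalization, the sketch below converts window bounds reported in physical pixels into logical (DIP) coordinates. It is not the PR's `normalizeWindowBounds`: the function name `toLogicalBounds` and the `rawIsPhysicalPixels` flag are hypothetical, and the commit does not state which platforms report physical pixels, so that decision is left to the caller.

```typescript
interface Bounds {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Convert raw window bounds to logical coordinates.
// When the source already reports logical units, or the scale factor is
// 1 or invalid, the bounds are returned unchanged (as a copy).
function toLogicalBounds(
  raw: Bounds,
  displayScaleFactor: number = 1,
  rawIsPhysicalPixels: boolean = true,
): Bounds {
  if (!rawIsPhysicalPixels || displayScaleFactor <= 0 || displayScaleFactor === 1) {
    return { ...raw };
  }
  const scale = displayScaleFactor;
  return {
    x: Math.round(raw.x / scale),
    y: Math.round(raw.y / scale),
    width: Math.round(raw.width / scale),
    height: Math.round(raw.height / scale),
  };
}
```

For example, a 1600 x 900 physical-pixel window at position (200, 100) on a display with scale factor 2 maps to an 800 x 450 logical rectangle at (100, 50).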
# Conflicts:
#	apps/desktop/src/main/core/App.ts
#	apps/desktop/src/main/utils/permissions.ts
# 🚀 LobeHub v2.1.53 (20260427)

**Release Date:** April 27, 2026
**Since v2.1.52:** 194 merged PRs · 17 contributors

> Introduce Heterogeneous Agent - Claude Code and Codex run as first-class desktop runtimes, paired with a new Agent Signal package, sharper desktop UX, and a wave of flagship model additions.

---

## ✨ Highlights

- **Introduce Heterogeneous Agent** - Claude Code and Codex run as first-class desktop agents: subagent rendering, partial-message streaming, multi-turn resume, terminal error surfacing, rich tool inspectors, and runtime polish. (#14162, #13754, #14067, #14001, #13970, #13942)
- **Screen capture & Quick Chat tray** - New desktop screen capture overlay (macOS permission-gated) with Quick Chat tray and upload pipeline improvements; chat input auto-focuses on overlay mount. (#13818, #14097, #14105)
- **Desktop topic & tab UX** - Dedicated topic popup window with cross-window sync, Cmd+W/Cmd+T tab shortcuts, TabBar polish, recent working directories expanded to 20, and human approval notifications. (#13957, #13983, #13972, #14036, #14092)
- **Git workflow built-in** - One-click pull/push from the branch chip, ahead/behind badge, and submodule/worktree repo detection. (#14041, #13980, #13978)
- **Agent Signal package** - New `@lobechat/agent-signal` runtime for dynamic memory feedback signals, with OTel metrics and self-iteration in Lab. (#14157, #14170, #14159, #14169, #14187)
- **New models** - Claude Opus 4.7 with `xhigh` effort tier, GPT-5.5, DeepSeek V4 Flash/Pro with reasoning slider, Kimi K2.6, MiMo-V2.5/Pro, gpt-image-2, Qwen3.6 Flash/Plus, and Pixverse-c1. (#13903, #14147, #14114, #14004, #14089, #14039, #13923)
- **New providers** - OpenCode Zen, OpenCode Go, and Azure OpenAI Router runtime. (#13943, #14064, #13823)
- **Mobile settings overhaul** - Full settings menu and responsive profile layout for mobile. (#14019)

---

## Heterogeneous Agent

- Claude Code runtime, working-directory awareness, and sidebar polish. (#13970)
- CC subagent rendering with persistent streamed text; parallel-tool orphan fix. (#14001, #13968, #14024)
- Per-step usage persisted to each step assistant message. (#13964)
- Per-phase workflow expand defaults; full-expand toggle with three-level expansion. (#14171, #13906)
- Hetero-mode actions bar; tool inspector polish. (#13963, #14034, #14030)
- Codex desktop integration with rich tool rendering and devtools preview. (#14067, #14100)
- Codex terminal error surfacing and CLI output tracing. (#14166)
- Tighten `isCanUseVision` default and add aggregator fallback. (#14172)
- Persist `ccSessionId` in topic metadata for CC multi-turn resume. (#13902)
- CC account card, topic filter, and integration polish. (#13955, #13942, #13950)
- Token-level deltas streamed via `--include-partial-messages`. (#13929)

---

## Agent Signal & Self-Iteration

- New `@lobechat/agent-signal` package with dynamic feedback signals. (#14157)
- AgentSignalRuntime wired through agent-tracing and observability-otel metrics. (#14170, #14159)
- Self-iteration feature flag added to Lab; front-side flag check. (#14169, #14186)
- Signal policy for receiving memory feedback dynamically. (#14187)

---

## 💬 Conversation

- Queue follow-up sends during running CC turns. (#14179)
- Persist per-topic chat scroll position; pin user message + fold long messages. (#14191, #14056)
- Inline resend when editing last user message. (#14080)
- Disable first-block markdown streaming to prevent flicker. (#14193, #13904)
- Prevent Markdown stream replay when vlist remounts streaming items. (#14086)
- Stop repinning after manual scroll; unify scroll-to-user + spacer hooks. (#14099, #14132)

---

## 📱 Platforms & Integrations

### Desktop / Electron

- Screen capture overlay, Quick Chat tray, and upload pipeline improvements. (#13818)
- macOS permission gate for screen capture; auto-focus chat panel input. (#14097, #14105)
- Dedicated topic popup window with cross-window sync. (#13957)
- TabBar polish: `+` button for new topic, dark theme blend, close icon by default. (#13972, #14203, #13973)
- Recent working directories expanded from 5 to 20; submodule/worktree repo detection. (#14036, #13978)
- Cmd+W / Cmd+T tab shortcuts and global shortcut consolidation. (#13983, #13880)
- Linux icon configuration; human approval desktop notifications. (#14042, #14092)

### Git Workflow

- One-click pull/push from branch chip; ahead/behind badge with refactored GitCtr. (#14041, #13980)

### Mobile

- Full settings menu and responsive profile layout. (#14019)
- Agent route added to mobile router; mobile agent topic route registered. (#14103, #14158)
- Session list skeleton row layout corrected. (#14040)

### Bot / Messaging

- DM strategy support; bot emoji and markdown render optimization. (#14201, #14091, #14140)
- Slack webhook fix; bot platform setup guide reference. (#14052, #14121)

---

## 🤖 Models & Providers

### New models

- **Claude Opus 4.7** with `xhigh` effort tier; strip temperature/top_p. (#13903, #13909)
- **GPT-5.5**. (#14147)
- **DeepSeek V4** Flash/Pro cards with reasoning slider; cache-hit and Pro discount pricing. (#14114, #14209, #14196, #14131)
- **Kimi K2.6** model with LobeHub-hosted card. (#14004, #14006)
- **MiMo-V2.5 / V2.5-Pro**. (#14089)
- **gpt-image-2**, **Qwen3.6 Flash/Plus**, **Pixverse-c1**. (#14039, #13923)

### New providers

- **OpenCode Zen** and **OpenCode Go** with env-var support. (#13943, #14064)
- **Azure OpenAI Router** runtime support. (#13823)
- Model alias mapping for image and video runtimes. (#13896)
- Seedance video models migrated to Dreamina. (#14144)

### Runtime reliability

- Sanitize invalid tool_call arguments to unbreak strict providers. (#14033)
- Tolerate null `function.name` in streaming tool_call deltas. (#14139)
- Preserve Gemini 3 `thoughtSignature` in `call_tools_batch` normalization. (#14032)
- Downgrade `image_url` parts when target model lacks vision. (#14029)
- Preserve Cloudflare provider error context. (#14136)
- Use `safety_identifier` for OpenAI Responses API. (#14148)
- Unwrap underlying PG error in `formatErrorEventData`. (#14038)

---

## 🖥️ User Experience

- **Onboarding** - Preset agent naming suggestions, structured hunk ops for `updateDocument`, persona analytics snapshot, footer promotion pipeline, wrap-up button. (#13931, #13989, #13930, #13853, #13934)
- **Document workflow** - Agent documents promoted as primary workspace panel; history management and compare workflow; web-crawl docs associated with agent documents. (#13924, #13725, #13893)
- **cmdk** - Agent identity surfaced on topic search results; topic/message search scoped to current agent. (#14204, #13960)
- **Floating chat panel** and workspace improvements. (#13887)
- **Topic completion status** with dropdown action and filter. (#14005)

---

## 🔧 Tooling

- Redis-backed feature flag provider for runtime config. (#14098)
- Vite upgraded to 8.0.0 with Rolldown strict execution order. (#12720, #14058)
- `@lobechat/model-bank` automated npm release with provenance. (#14015, #14017, #14018)
- Skill activation fallback when `activateTools` cannot find identifier. (#14010)
- Cron tool: timezone and existing jobs injected into system prompt; clarified `lobe-gtd` and `lobe-cron` descriptions. (#14012, #14013)

---

## 🔒 Security & Reliability

- **Security:** uuid bumped to v14 (advisory). (#14083)
- **Security:** validate avatar URL and scope old-avatar deletion to owner. (#13982)
- **Security:** clear OIDC sessions on better-auth signout; return 401 (not 500) for expired OIDC JWT. (#13916, #14014)
- **Reliability:** scope pending-approval check to current assistant turn. (#14182)
- **Reliability:** sanitize heterogeneous-agent attachment cache filenames. (#13937)
- **Reliability:** reduce subagent task status error noise. (#14026)

---

## 👥 Contributors

Huge thanks to **17 contributors** who shipped **194 merged PRs** this week.

@hardy · @shaun0927 · @hezhijie0327 · @sxjeru · @arvinxx · @Innei · @tjx666 · @lijian · @neko · @rdmclin2 · @AmAzing129 · @sudongyuer · @CanisMinor · @rivertwilight

Plus @lobehubbot and renovate[bot] for maintenance.

---

**Full Changelog**: v2.1.52...v2.1.53
💻 Change Type
🔗 Related Issue
Fixes LOBE-5521
Fixes LOBE-7000
🔀 Description of Change
Ships the desktop Screen Capture Overlay feature along with a set of follow-up refinements. The overlay is a lightweight Electron window that lets the user drag-select a region or a native window, preview the capture, pick an agent/model, and dispatch the shot into a chat conversation.
Overlay & capture (core feature)
- … (`apps/desktop/src/overlay/**`) with drag selection, window highlight, dock-aware placement, chat panel with prompt input, multi-capture tray, and avatar/window tagging
- … (`screenCapture/CaptureService`, `WindowSourceService`, `ScreenCaptureManager`) backed by `openWindowsSync` and native mime/permission helpers
- … (`@lobechat/electron-client-ipc/screenCapture`) wired through `ScreenCaptureCtr`
- … (`src/features/Electron/ScreenCapture/**`) bridge overlay selections into the main chat

Overlay upload performance
- … `ArrayBuffer` to the main renderer via an `overlayUploadRequest` broadcast; the renderer uploads through a dedicated pool (`overlayCaptureUploadPool`) bypassing `chatUploadFileList`, and reports status (`uploading` / `ready` / `failed`) back to the overlay
- … `ready`
- … `MessageFromUrl` awaits the pool promises and calls `sendMessage` once, removing the second upload pass
- … `modelId` / `provider` are threaded into `updateAgentConfigById` before send, so the first message actually uses the user-chosen model

Measured overlay startup path during local profiling:
- `overlay.open.windows-enumerated`: 216 ms → 159 ms
- `overlay.open.did-finish-load`: 389 ms → 327 ms
- `overlay.session.first-raf`: 416 ms → 357 ms

Quick Composer hotkey
- … `Option` opens the overlay (`ShortcutManager` + double-option monitor with hardware option key state); separate binding from legacy quick composer
- … `quickComposer` i18n entries and the stale accelerator

Tray icon refresh (macOS template)
- … `tray-dark.png` / `tray-light.png` with a macOS template image (`trayTemplate.png`, `@2x`) so the system recolors automatically for light / dark menu bars
- … `apps/desktop/scripts/generate-tray-template.mjs` to derive the template from source art
- `Tray` / `TrayManager` flag template images via `setTemplateImage(true)`

Quick Chat popup (tray entry)
- … `/popup/agent/inbox` via `BrowserManager.openQuickChatPopup` (fixed `uniqueId` → repeat clicks focus, closing + reopening resets to a fresh conversation)
- `PopupAgentQuickPage` resolves the inbox slug through `builtinAgentSelectors.inboxAgentId` so `activeAgentId` points at the real entity in `agentMap`, which avoids the stuck-loading skeleton that would appear when storing the literal `'inbox'` slug
- … `tray.quickChat` i18n key

Misc
🧪 How to Test
Executed:
- `bunx vitest run --silent='passed-only' 'apps/desktop/src/main/modules/screenCapture/ScreenCaptureManager.test.ts'`
- `bunx vitest run --silent='passed-only' 'apps/desktop/src/main/modules/screenCapture/WindowSourceService.test.ts'`
- `bunx vitest run --silent='passed-only' 'apps/desktop/src/overlay/ChatPanel.test.tsx'`
- `bunx vitest run --silent='passed-only' 'src/features/Electron/ScreenCapture/overlayDispatch.test.ts'`
- `bun run build:main`

Manual smoke:
- … `Option` opens the overlay

📸 Screenshots / Videos
📝 Additional Information
- … (`apps/desktop/overlay.html`) so it loads with its own bundle and does not pay for the main SPA startup cost
- … `useFileStore.getState().uploadWithProgress` directly to keep screenshots out of the main input's preview tray (`chatUploadFileList`)