✨ feat: add visual understanding tool #14376

Closed
tjx666 wants to merge 2 commits into canary from refactor/lobe-agent-tool

@tjx666 tjx666 commented May 1, 2026

💻 Change Type

  • ✨ feat
  • 🐛 fix
  • ♻️ refactor
  • 💄 style
  • 👷 build
  • ⚡️ perf
  • ✅ test
  • 📝 docs
  • 🔨 chore

🔗 Related Issue

Related to LOBE-8387

🔀 Description of Change

  • Add the @lobechat/builtin-tool-lobe-agent package with the lobe-agent identifier.
  • Move visual media fallback execution into the Lobe Agent tool as analyzeVisualMedia.
  • Wire the client/server tool runtimes so non-visual models can use a configured visual model for image and video attachments.
  • Keep VISUAL_UNDERSTANDING_* env and server config names scoped to the visual capability.
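The routing described in the bullets above can be sketched roughly as follows. This is an illustrative sketch, not the code from this PR: only the `lobe-agent` identifier, the `analyzeVisualMedia` API name, and the `model`/`provider` config fields appear in the PR; the types and the `routeAttachment` function are hypothetical names introduced here.

```typescript
// Hypothetical shapes for illustration; only `model` and `provider` are taken from the PR.
interface VisualUnderstandingConfig {
  model: string;
  provider: string;
}

// Abilities of the model the user is currently chatting with.
interface ActiveModelAbilities {
  vision: boolean;
  video: boolean;
}

type AttachmentKind = 'image' | 'video';

type Route =
  | { kind: 'direct' }
  | {
      kind: 'fallback';
      tool: 'lobe-agent';
      api: 'analyzeVisualMedia';
      model: string;
      provider: string;
    }
  | { kind: 'unsupported' };

// Decide how an attachment reaches a model:
// - send it directly when the active model supports the media type,
// - otherwise route it through the configured visual model via the tool,
// - otherwise report that nothing can handle it.
function routeAttachment(
  kind: AttachmentKind,
  abilities: ActiveModelAbilities,
  fallback?: VisualUnderstandingConfig,
): Route {
  const supported = kind === 'image' ? abilities.vision : abilities.video;
  if (supported) return { kind: 'direct' };

  if (fallback && fallback.model && fallback.provider) {
    return {
      kind: 'fallback',
      tool: 'lobe-agent',
      api: 'analyzeVisualMedia',
      model: fallback.model,
      provider: fallback.provider,
    };
  }

  return { kind: 'unsupported' };
}
```

Under these assumptions, a text-only model with a configured fallback would route an image through the tool, while the same model without a fallback would reject it.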

🧪 How to Test

  • Tested locally
  • Added/updated tests
  • No tests needed

Commands:

bunx vitest run --silent='passed-only' 'src/store/tool/slices/builtin/executors/index.test.ts' 'src/server/services/toolExecution/serverRuntimes/__tests__/lobeAgent.test.ts' 'src/helpers/toolEngineering/index.test.ts' 'src/server/modules/Mecha/AgentToolsEngine/__tests__/index.test.ts' 'src/services/agentRuntime/__tests__/index.test.ts' 'src/components/DragUpload/useDragUpload.test.tsx' 'src/hooks/useVisualMediaUploadAbility.test.ts'
cd packages/prompts && bunx vitest run --silent='passed-only' 'src/prompts/files/index.test.ts'
bunx vitest run --silent='passed-only' 'src/services/chat/chat.test.ts' 'src/services/chat/mecha/contextEngineering.test.ts'
bunx eslint packages/builtin-tool-lobe-agent/src packages/builtin-tools/src/index.ts packages/builtin-tools/src/identifiers.ts packages/prompts/src/prompts/files/index.test.ts src/store/tool/slices/builtin/executors/index.ts src/store/tool/slices/builtin/executors/index.test.ts src/server/services/toolExecution/serverRuntimes/index.ts src/server/services/toolExecution/serverRuntimes/lobeAgent.ts src/server/services/toolExecution/serverRuntimes/__tests__/lobeAgent.test.ts src/helpers/toolEngineering/index.ts src/helpers/toolEngineering/index.test.ts src/server/modules/Mecha/AgentToolsEngine/index.ts src/server/modules/Mecha/AgentToolsEngine/__tests__/index.test.ts src/server/services/aiAgent/index.ts src/services/chat/chat.test.ts src/services/chat/mecha/contextEngineering.test.ts --max-warnings=0

📸 Screenshots / Videos

N/A

📝 Additional Information

Security review: checked the submodule diff for cloud/business/billing-specific implementation details; no new cloud-specific logic is exposed.

vercel Bot commented May 1, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project | Deployment | Actions | Updated (UTC)
lobehub | Ready | Preview, Comment | May 1, 2026 10:48am

@dosubot dosubot Bot added the size:XL This PR changes 500-999 lines, ignoring generated files. label May 1, 2026

@sourcery-ai sourcery-ai Bot left a comment

Sorry @tjx666, you have reached your weekly rate limit of 500000 diff characters.

Please try again later or upgrade to continue using Sourcery

@dosubot dosubot Bot added feature:agent-builder Agent builder feature:tool Tool calling and function execution feature:vision labels May 1, 2026

codecov Bot commented May 1, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 84.89%. Comparing base (626d274) to head (f6f8f9a).
⚠️ Report is 1 commits behind head on canary.

Additional details and impacted files
@@             Coverage Diff             @@
##           canary   #14376       +/-   ##
===========================================
+ Coverage   68.96%   84.89%   +15.92%     
===========================================
  Files        2403      589     -1814     
  Lines      209462    42088   -167374     
  Branches    26268     6441    -19827     
===========================================
- Hits       144465    35732   -108733     
+ Misses      64854     6213    -58641     
  Partials      143      143               
Flag Coverage Δ
app ?
database ?
packages/agent-runtime 79.93% <ø> (ø)
packages/context-engine 83.87% <ø> (ø)
packages/conversation-flow 92.40% <ø> (ø)
packages/file-loaders 87.60% <ø> (ø)
packages/memory-user-memory 74.74% <ø> (ø)
packages/model-bank 99.94% <ø> (ø)
packages/model-runtime 83.79% <ø> (ø)
packages/prompts 69.01% <100.00%> (ø)
packages/python-interpreter 92.90% <ø> (ø)
packages/ssrf-safe-fetch 0.00% <ø> (ø)
packages/utils 88.02% <ø> (ø)
packages/web-crawler 88.41% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
Store ∅ <ø> (∅)
Services ∅ <ø> (∅)
Server ∅ <ø> (∅)
Libs ∅ <ø> (∅)
Utils 93.47% <ø> (+13.52%) ⬆️

@tjx666 tjx666 changed the title ♻️ refactor: introduce lobe agent builtin tool ✨ feat: add visual understanding tool May 1, 2026

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f6f8f9a449

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +100 to +101
const [sourceMessage] = await messageModel.queryByIds([this.messageId], {
postProcessUrl: (path) => fileService.getFullFileUrl(path),

[P1] Resolve visual source message from user turn

This runtime assumes context.messageId is the user turn with attachments, but in resume/continue flows the parent message can be an assistant message, so querying only this.messageId can return no imageList/videoList and produce NO_VISUAL_FILES even when the preceding user turn has visuals. That breaks visual-fallback tool calls during regeneration/continuation for non-vision models; the runtime should walk to the related user message (as the client executor does) instead of reading only one message id.
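One way to picture the suggested walk is the sketch below. It is not the code from this PR or from the client executor: the `ChatMessage` shape, the `parentId` link, and the in-memory `Map` lookup are assumptions made for illustration (the real runtime reads messages through `messageModel`), and `findSourceUserMessage` is a hypothetical name.

```typescript
// Hypothetical message shape; `imageList` and `videoList` are the field names
// mentioned in the review, everything else is assumed for illustration.
interface ChatMessage {
  id: string;
  role: 'user' | 'assistant' | 'tool';
  parentId?: string | null;
  imageList?: { url: string }[];
  videoList?: { url: string }[];
}

// Walk up the parent chain from the context message until a user turn is found.
// The `seen` set guards against accidental cycles in the parent links.
function findSourceUserMessage(
  startId: string,
  byId: Map<string, ChatMessage>,
): ChatMessage | undefined {
  const seen = new Set<string>();
  let current = byId.get(startId);

  while (current && !seen.has(current.id)) {
    if (current.role === 'user') return current;
    seen.add(current.id);
    current = current.parentId ? byId.get(current.parentId) : undefined;
  }

  return undefined;
}

// True when the resolved message carries at least one image or video.
function hasVisualFiles(message: ChatMessage | undefined): boolean {
  if (!message) return false;
  return (message.imageList?.length ?? 0) > 0 || (message.videoList?.length ?? 0) > 0;
}
```

With a helper along these lines, a continue flow that starts from an assistant message would still reach the user turn that holds the attachments, and `NO_VISUAL_FILES` would be reported only when that user turn has none.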

Comment on lines +14 to +20
const fallbackSupportVision = useModelSupportVision(
visualUnderstanding?.model ?? '',
visualUnderstanding?.provider ?? '',
);
const fallbackSupportVideo = useModelSupportVideo(
visualUnderstanding?.model ?? '',
visualUnderstanding?.provider ?? '',

[P2] Do not block fallback uploads on unknown visual model cards

The fallback gate is derived from useModelSupportVision/useModelSupportVideo for the configured visualUnderstanding model, but those selectors return false when the model is not in enabledAiModels. In that case, uploads are rejected client-side even though server-side visual understanding is enabled by VISUAL_UNDERSTANDING_* envs and runtime execution can still proceed, so custom/hidden fallback model IDs become unusable from the UI.
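A possible shape for a more permissive gate is sketched below. It is only one option and not the PR's code: it assumes the UI can distinguish "model card says unsupported" from "no model card found", which the current selectors do not do (they return `false` in both cases). The `UploadGateInput` type and `canUploadVisualMedia` function are hypothetical.

```typescript
// Hypothetical input for the upload gate. `fallbackCardAbility` is tri-state:
// true/false when a model card exists, undefined when the model is not listed.
interface UploadGateInput {
  activeModelSupports: boolean;
  fallbackConfigured: boolean;
  fallbackCardAbility: boolean | undefined;
}

// Allow the upload when the active model can handle it. Otherwise fall back to
// the configured visual model, trusting the server when no model card is known.
function canUploadVisualMedia(input: UploadGateInput): boolean {
  if (input.activeModelSupports) return true;
  if (!input.fallbackConfigured) return false;

  // Unknown card: defer to server-side visual understanding instead of blocking.
  return input.fallbackCardAbility ?? true;
}
```

The trade-off in this sketch is that a misconfigured fallback model is rejected at execution time on the server rather than at upload time in the UI.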

@tjx666 tjx666 closed this May 1, 2026
@tjx666 tjx666 deleted the refactor/lobe-agent-tool branch May 1, 2026 10:30
Labels

feature:agent-builder Agent builder feature:tool Tool calling and function execution feature:vision size:XL This PR changes 500-999 lines, ignoring generated files.
