feat(bot):Add eval function(support locomo, skillsbench), open add-resource tool, add feishu progress notification capability by yeshion23333 · Pull Request #506 · volcengine/OpenViking

yeshion23333 · 2026-03-10T04:14:50Z

Description

✅ Added

Complete SkillsBench automated evaluation tool, supporting benchmark data preparation, batch task execution, automatic result verification, and pass-rate statistics
Locomo evaluation result statistics script that automatically generates a report on accuracy, time cost, and token consumption
Progress notification for long-running tasks: the Feishu channel dynamically shows a "processing" emoji reaction
VikingAddResourceTool, supporting both URL resources (images, Git repositories) and local files
Channel-level memory sharing mode configuration, supporting switching between memory isolation and sharing
--config/-c parameter for the CLI chat command, supporting a custom configuration file path
Memory search performance logging, to help troubleshoot performance problems
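
As a rough illustration of the kind of report the Locomo statistics script is described as producing (accuracy, time cost, token consumption), here is a minimal sketch. The record fields `correct`, `seconds`, and `tokens` are assumptions made for this example and are not taken from the PR's actual result format.

```python
from statistics import mean
from typing import Iterable


def summarize(results: Iterable[dict]) -> dict:
    """Aggregate per-question records into a small report.

    Each record is assumed (hypothetically) to carry:
    correct (bool), seconds (float), tokens (int).
    """
    rows = list(results)
    if not rows:
        # Avoid division by zero on an empty result set.
        return {"total": 0, "accuracy": 0.0, "avg_seconds": 0.0, "total_tokens": 0}
    return {
        "total": len(rows),
        "accuracy": sum(1 for r in rows if r["correct"]) / len(rows),
        "avg_seconds": mean([r["seconds"] for r in rows]),
        "total_tokens": sum(r["tokens"] for r in rows),
    }
```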

🔄 Changed

Locomo evaluation script: supports resuming an interrupted evaluation, writes results to a separate output file, makes --token a required parameter, and updates the default judge model
Single-turn request timeout increased from 300s to 3000s to accommodate long-running evaluation tasks
Resource search/Grep tool logic updated to match the latest OV server API
Feishu channel message handling refactored to process generic metadata actions first
Configuration loading accepts a custom configuration path
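
Resuming an evaluation while writing to a separate output file can be sketched as follows. This is not the PR's implementation: the JSON Lines output format, the `id` field, and the function names `load_done_ids` and `run_resumable` are all assumptions chosen for illustration.

```python
import json
from pathlib import Path
from typing import Callable, Iterable


def load_done_ids(output_path: Path) -> set:
    """Collect the ids of samples already present in the output file."""
    if not output_path.exists():
        return set()
    done = set()
    with output_path.open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                done.add(json.loads(line)["id"])
    return done


def run_resumable(samples: Iterable[dict], output_path: Path,
                  evaluate: Callable[[dict], dict]) -> int:
    """Evaluate only the samples missing from output_path; return how many ran."""
    done = load_done_ids(output_path)
    ran = 0
    # Append mode: the input is never touched and earlier results survive a restart.
    with output_path.open("a", encoding="utf-8") as out:
        for sample in samples:
            if sample["id"] in done:
                continue  # finished in a previous run
            result = {"id": sample["id"], **evaluate(sample)}
            out.write(json.dumps(result, ensure_ascii=False) + "\n")
            out.flush()  # persist each result so an interruption loses at most one
            ran += 1
    return ran
```

Writing one line per finished sample means a rerun after a crash only repeats the sample that was in flight.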

🐛 Fixed

Fixed an error raised when fieldnames is None during CSV reading
Fixed Feishu thread reply logic: the thread ID is now used only in thread mode
Fixed parsing failures on non-standard JSON by using strict=False
Fixed AddResource timeout handling: a task-submitted message is returned even when the request times out
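
Two of the fixes above rely on standard-library behavior that is easy to miss: `csv.DictReader.fieldnames` is `None` when the input has no header row, and `json.loads(..., strict=False)` accepts control characters (such as a raw newline) inside strings. The sketch below shows both; the helper names `read_rows` and `parse_lenient` are hypothetical and do not come from the PR.

```python
import csv
import io
import json


def read_rows(text: str):
    """Return (fieldnames, rows), tolerating input without a header row."""
    reader = csv.DictReader(io.StringIO(text))
    # fieldnames is None for empty input, so fall back to an empty list
    # before anything tries to iterate over it.
    fieldnames = list(reader.fieldnames or [])
    return fieldnames, list(reader)


def parse_lenient(raw: str):
    """Parse JSON while allowing control characters inside string values."""
    return json.loads(raw, strict=False)
```

Note that `strict=False` only relaxes the control-character rule; other deviations from JSON (trailing commas, single quotes) still fail to parse.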

❌ Removed

Removed redundant custom JSON preprocessing logic in run_eval.py
Removed the built-in @mention parsing logic in the Feishu channel
Removed the deprecated target_path and wait parameters from the AddResource tool
Removed the logic in judge.py that modified input files in place; results are now written to a separate output file

Related Issue

Type of Change

Bug fix (non-breaking change that fixes an issue)
New feature (non-breaking change that adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Documentation update
Refactoring (no functional changes)
Performance improvement
Test update

Changes Made

Testing

I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes
I have tested this on the following platforms:
- Linux
- macOS
- Windows

Checklist

My code follows the project's coding style
I have performed a self-review of my code
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
My changes generate no new warnings
Any dependent changes have been merged and published

Screenshots (if applicable)

Additional Notes


qin-ctx · 2026-03-10T11:14:31Z

bot/vikingbot/config/loader.py

 from loguru import logger
 from vikingbot.config.schema import Config

+CONFIG_PATH = None


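[Suggestion] Using the module-level mutable global CONFIG_PATH to pass the configuration path between ensure_config() and load_config() is a fragile pattern. Several call sites (hooks, for example) call load_config() directly without going through ensure_config(); CONFIG_PATH is then None, so they fall back to the default path instead of the custom path passed on the CLI.

Consider a more explicit approach, such as caching the config object as a singleton or keeping the config_path parameter on load_config().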
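
One way to read this suggestion is sketched below: load_config() keeps an explicit config_path parameter and caches the loaded object, so callers that pass no path (such as hooks) get whatever configuration was loaded first instead of silently reverting to the default. This is an illustrative sketch only. A plain dict stands in for the project's Config schema, and DEFAULT_CONFIG_PATH and the JSON file format are assumptions.

```python
import json
from pathlib import Path
from typing import Optional, Union

# Hypothetical default location; the real project default may differ.
DEFAULT_CONFIG_PATH = Path.home() / ".vikingbot" / "config.json"

_config: Optional[dict] = None  # cached config object, not a cached path


def _read(path: Path) -> dict:
    return json.loads(path.read_text(encoding="utf-8"))


def load_config(config_path: Union[str, Path, None] = None) -> dict:
    """Return the process-wide config, loading it on first use.

    An explicit config_path always (re)loads from that file. A call with
    no argument reuses the cached object and only falls back to the
    default path when nothing has been loaded yet.
    """
    global _config
    if config_path is not None:
        _config = _read(Path(config_path).expanduser())
    elif _config is None:
        _config = _read(DEFAULT_CONFIG_PATH)
    return _config
```

With this shape, the CLI entry point would call `load_config(args.config)` once, and later argument-free calls would see the same object.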


yeshion23333 added 30 commits March 5, 2026 20:01

1. add skill eval;

07bbe61

2. opt single turn;

add skills bench eval

45e5bb5

add skills bench eval

654b64b

add skills bench eval

c9146cc

add skills bench eval

5bafe04

add skills bench eval

df587df

add skills bench eval

9c0ce58

add tool

bcbc11d

add tool

a798efb

add tool

515e674

add tool

fd325eb

feishu @

546ba4f

add resource

ad07fd3

add resource

6e9439d

add resource

cda8e25

add resource

a2bddac

add resource

ca16452

add resource

ea03f7f

add feishu

aa1c9b2

feishu emoji

6295e4c

feishu emoji

58376d6

feishu emoji

cd69ae5

feishu emoji

b8915c3

memory

3734032

eval

edcce0b

eval

49c150e

config path

7ae57f5

Merge remote-tracking branch 'origin/main' into feature/eval

c988d0b

# Conflicts: # bot/vikingbot/agent/loop.py # bot/vikingbot/agent/memory.py # bot/vikingbot/hooks/builtins/openviking_hooks.py

config path

033dbf9

msg add opt

3b7a929

github-project-automation bot added this to OpenViking project Mar 10, 2026

github-project-automation bot moved this to Backlog in OpenViking project Mar 10, 2026

yeshion23333 requested review from MaojiaSheng and qin-ctx March 10, 2026 04:15

yeshion23333 added 3 commits March 10, 2026 17:20

Merge remote-tracking branch 'origin/main' into feature/eval

78ec62e

# Conflicts: # bot/vikingbot/agent/loop.py # bot/vikingbot/agent/tools/ov_file.py # bot/vikingbot/hooks/builtins/openviking_hooks.py

fix message send

326f6fe

type

fd1b0c8

qin-ctx reviewed Mar 10, 2026


fix comment

c101054

qin-ctx reviewed Mar 10, 2026

bot/vikingbot/agent/tools/ov_file.py (outdated, resolved)

fix comment

16c7141

MaojiaSheng approved these changes Mar 10, 2026


MaojiaSheng merged commit a75f1ac into main Mar 10, 2026
5 of 6 checks passed

MaojiaSheng deleted the feature/eval branch March 10, 2026 15:02

github-project-automation bot moved this from Backlog to Done in OpenViking project Mar 10, 2026

MaojiaSheng merged 35 commits into main from feature/eval
