🔨 feat(db): add llm_generation_tracing and agent eval experiment tables #15126

Merged: arvinxx merged 1 commit into canary from chore/db-migration-tracing-and-eval on May 22, 2026
@arvinxx arvinxx commented May 22, 2026

Member

💻 Change Type

  • 🔨 chore

🔀 Description of Change

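Combines two conflicting streams of schema work into a single 0103 migration so that their migration numbers do not collide:

  • `llm_generation_tracing`: a new table that records trace data for `generateObject` / structured-generation calls (scenario, prompt version, input hash, usage, feedback signals, and so on).
  • `agent_eval_experiments` + `agent_eval_experiment_benchmarks`: taken from #14990 (🔨 chore: add agent eval experiment schema); introduces the experiment concept and its many-to-many relationship with benchmarks.
  • Correspondingly adds `source_experiment_id` to `agent_eval_datasets` and `experiment_id` / `parent_run_id` to `agent_eval_runs`, along with the matching FKs and indexes.
  • Additionally adds `user_id` (FK → `users`, onDelete cascade) plus an index to `agent_eval_experiment_benchmarks`, so the junction table is isolated per user, the same pattern as `agent_eval_run_topics`.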

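Foreign key strategy (carried over from the design in #14990)

| Foreign key | Setting | Rationale |
| --- | --- | --- |
| `agent_eval_experiment_benchmarks.experiment_id` → `agent_eval_experiments.id` | ON DELETE CASCADE | When an experiment is deleted, its associations should disappear with it |
| `agent_eval_experiment_benchmarks.benchmark_id` → `agent_eval_benchmarks.id` | ON DELETE CASCADE | Once a benchmark is deleted, the association row has no independent value |
| `agent_eval_experiment_benchmarks.user_id` → `users.id` | ON DELETE CASCADE | Cleans up the associations a user owns when the account is deleted |
| `agent_eval_datasets.source_experiment_id` → `agent_eval_experiments.id` | ON DELETE SET NULL | The "source" field does not imply ownership; the dataset should survive when the experiment is gone |
| `agent_eval_runs.experiment_id` → `agent_eval_experiments.id` | ON DELETE NO ACTION | A run is a historical snapshot and should not be wiped along with its experiment |
| `agent_eval_runs.parent_run_id` → `agent_eval_runs.id` | ON DELETE NO ACTION | Parent and child runs form an evolution chain that must not be broken automatically |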
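
To make the three delete behaviors concrete, here is a minimal in-memory sketch of what deleting an experiment would do under this policy. It is an illustration only, not code from this PR: `Db`, `Row`, and `deleteExperiment` are hypothetical names, and the model assumes PostgreSQL semantics, where NO ACTION rejects the delete while referencing rows still exist.

```typescript
type OnDelete = 'cascade' | 'set null' | 'no action';

interface ForeignKey {
  table: string;
  column: string;
  onDelete: OnDelete;
}

// Foreign keys that point at agent_eval_experiments.id, per the strategy table.
const experimentRefs: ForeignKey[] = [
  { table: 'agent_eval_experiment_benchmarks', column: 'experiment_id', onDelete: 'cascade' },
  { table: 'agent_eval_datasets', column: 'source_experiment_id', onDelete: 'set null' },
  { table: 'agent_eval_runs', column: 'experiment_id', onDelete: 'no action' },
];

// Hypothetical in-memory stand-ins for tables and rows.
type Row = Record<string, string | null>;
type Db = Record<string, Row[]>;

function deleteExperiment(db: Db, id: string): void {
  // NO ACTION: if any run still references the experiment, reject the delete
  // before anything is mutated (the whole statement fails).
  for (const fk of experimentRefs) {
    if (fk.onDelete !== 'no action') continue;
    if ((db[fk.table] ?? []).some((row) => row[fk.column] === id)) {
      throw new Error(`${fk.table}.${fk.column} still references experiment ${id}`);
    }
  }

  for (const fk of experimentRefs) {
    const rows = db[fk.table] ?? [];
    if (fk.onDelete === 'cascade') {
      // CASCADE: junction rows vanish together with the experiment.
      db[fk.table] = rows.filter((row) => row[fk.column] !== id);
    } else if (fk.onDelete === 'set null') {
      // SET NULL: the dataset survives but loses its "source" pointer.
      for (const row of rows) {
        if (row[fk.column] === id) row[fk.column] = null;
      }
    }
  }

  db['agent_eval_experiments'] = (db['agent_eval_experiments'] ?? []).filter(
    (row) => row['id'] !== id,
  );
}
```

One consequence worth noting: under NO ACTION, an experiment that still has runs cannot be deleted until those runs are removed or re-pointed, which is how the "historical snapshot" guarantee is enforced.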

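Index notes

  • `experiment_id` no longer gets its own index: as the leftmost column of the `(experiment_id, benchmark_id)` composite primary key, it is already covered.
  • `benchmark_id` and `user_id` are not a leftmost prefix of the PK, so each gets its own btree index.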
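
The leftmost-prefix reasoning behind these two bullets can be written as a small check. This is a sketch for illustration; `isLeftmostPrefix` is a hypothetical helper, not something in the codebase.

```typescript
// True when `cols` is a leading prefix of an index's (or primary key's) column list,
// i.e. a lookup on `cols` can already be served by that btree.
function isLeftmostPrefix(indexCols: readonly string[], cols: readonly string[]): boolean {
  return (
    cols.length > 0 &&
    cols.length <= indexCols.length &&
    cols.every((col, i) => indexCols[i] === col)
  );
}

// Composite primary key of agent_eval_experiment_benchmarks, as described above.
const primaryKey = ['experiment_id', 'benchmark_id'] as const;

// Columns that are filtered on individually and therefore need index coverage.
const lookupColumns = ['experiment_id', 'benchmark_id', 'user_id'];

// Only columns that are not covered by the PK prefix need their own index.
const needsOwnIndex = lookupColumns.filter((col) => !isLeftmostPrefix(primaryKey, [col]));
// needsOwnIndex is ['benchmark_id', 'user_id']
```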

Migration idempotency

All SQL is wrapped in IF NOT EXISTS, or in DROP CONSTRAINT IF EXISTS + ADD CONSTRAINT, so the migration can be run repeatedly.
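
As a rough sketch of the statement shapes this implies, the helper below renders idempotent PostgreSQL DDL for one foreign key and one index. The constraint and index names are assumptions made up for the example; the actual names are whatever the 0103 SQL file uses.

```typescript
interface FkSpec {
  table: string;
  column: string;
  refTable: string;
  refColumn: string;
  onDelete: 'cascade' | 'set null' | 'no action';
}

// Hypothetical naming scheme, used only for this illustration.
const fkName = (fk: FkSpec): string =>
  `${fk.table}_${fk.column}_${fk.refTable}_${fk.refColumn}_fk`;

// ADD CONSTRAINT has no IF NOT EXISTS form in PostgreSQL, so the constraint is
// dropped first (if present) and then re-added, which makes a re-run safe.
function idempotentForeignKey(fk: FkSpec): string[] {
  const name = fkName(fk);
  return [
    `ALTER TABLE "${fk.table}" DROP CONSTRAINT IF EXISTS "${name}";`,
    `ALTER TABLE "${fk.table}" ADD CONSTRAINT "${name}" FOREIGN KEY ("${fk.column}") ` +
      `REFERENCES "${fk.refTable}"("${fk.refColumn}") ON DELETE ${fk.onDelete};`,
  ];
}

// Indexes can use IF NOT EXISTS directly.
function idempotentIndex(table: string, column: string): string {
  return `CREATE INDEX IF NOT EXISTS "${table}_${column}_idx" ON "${table}" USING btree ("${column}");`;
}

const statements: string[] = [
  ...idempotentForeignKey({
    table: 'agent_eval_experiment_benchmarks',
    column: 'user_id',
    refTable: 'users',
    refColumn: 'id',
    onDelete: 'cascade',
  }),
  idempotentIndex('agent_eval_experiment_benchmarks', 'user_id'),
];
```

Running the generated statements twice should leave the schema in the same state as running them once, which is the property the rollback and re-run check in the test plan is meant to confirm.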

⚠️ Conflict Note

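This PR shares the 0103 slot with #14990: once either one is merged, the other has to be renumbered or selectively merged. The current expectation is that #14990 will be replaced by, or absorbed into, this PR.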

🧪 How to Test

  • `bun run db:generate` produces consistent output with no drift
  • The 0103 migration runs successfully on a clean database
  • Roll back → re-run the migration with no errors (idempotency)

📝 Additional Information

See the related schemas: `packages/database/src/schemas/llmGenerationTracing.ts` and `packages/database/src/schemas/agentEvals.ts`

🤖 Generated with Claude Code

… tables into 0103

Merges the schema work from #14990 with the new llm_generation_tracing
table into a single idempotent 0103 migration so the two streams can
land together without a migration-number conflict.

Also adds user_id (FK + index) to agent_eval_experiment_benchmarks so
the junction table is scoped per user, matching agent_eval_run_topics.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@arvinxx arvinxx requested review from nekomeowww and tjx666 as code owners May 22, 2026 15:42
@vercel

vercel Bot commented May 22, 2026


The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| lobehub | Ready | Preview, Comment | May 22, 2026 3:56pm |

@dosubot dosubot Bot added the size:L This PR changes 100-499 lines, ignoring generated files. label May 22, 2026

@sourcery-ai sourcery-ai Bot left a comment

Contributor

Sorry @arvinxx, you have reached your weekly rate limit of 500000 diff characters.

Please try again later or upgrade to continue using Sourcery

@arvinxx arvinxx changed the title 🔨 chore(db): combine llm_generation_tracing + agent eval experiment tables into 0103 🔨 feat(db): combine llm_generation_tracing + agent eval experiment tables into 0103 May 22, 2026
@arvinxx arvinxx changed the title 🔨 feat(db): combine llm_generation_tracing + agent eval experiment tables into 0103 🔨 feat(db): add llm_generation_tracing and agent eval experiment tables May 22, 2026
@codecov

codecov Bot commented May 22, 2026


Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 70.67%. Comparing base (55452cd) to head (a45d064).
⚠️ Report is 2 commits behind head on canary.

Additional details and impacted files
@@            Coverage Diff            @@
##           canary   #15126     +/-   ##
=========================================
  Coverage   70.67%   70.67%             
=========================================
  Files        3127     3127             
  Lines      311124   311124             
  Branches    32784    27185   -5599     
=========================================
  Hits       219901   219901             
  Misses      91058    91058             
  Partials      165      165             
| Flag | Coverage Δ |
| --- | --- |
| app | 61.36% <ø> (ø) |
| database | 92.17% <ø> (ø) |
| packages/agent-runtime | 80.48% <ø> (ø) |
| packages/builtin-tool-lobe-agent | 19.87% <ø> (ø) |
| packages/context-engine | 84.13% <ø> (ø) |
| packages/conversation-flow | 91.28% <ø> (ø) |
| packages/file-loaders | 87.89% <ø> (ø) |
| packages/memory-user-memory | 75.01% <ø> (ø) |
| packages/model-bank | 99.99% <ø> (ø) |
| packages/model-runtime | 83.79% <ø> (ø) |
| packages/prompts | 71.60% <ø> (ø) |
| packages/python-interpreter | 92.90% <ø> (ø) |
| packages/ssrf-safe-fetch | 0.00% <ø> (ø) |
| packages/types | 35.06% <ø> (ø) |
| packages/utils | 88.02% <ø> (ø) |
| packages/web-crawler | 88.08% <ø> (ø) |

Flags with carried forward coverage won't be shown. Click here to find out more.

| Components | Coverage Δ |
| --- | --- |
| Store | 67.93% <ø> (ø) |
| Services | 54.49% <ø> (ø) |
| Server | 72.09% <ø> (ø) |
| Libs | 56.44% <ø> (ø) |
| Utils | 85.42% <ø> (ø) |

@arvinxx arvinxx merged commit b01e4dc into canary May 22, 2026
39 of 40 checks passed
@arvinxx arvinxx deleted the chore/db-migration-tracing-and-eval branch May 22, 2026 16:05
@arvinxx arvinxx mentioned this pull request May 28, 2026
arvinxx added a commit that referenced this pull request May 29, 2026
# 🚀 LobeHub Release (20260528)

**Release Date:** May 28, 2026  
**Since v2.2.0:** 220 merged PRs · 15 contributors

> This cycle brings heterogeneous "platform agents" you can dispatch to
local or remote devices, a rebuilt onboarding flow, document-centric
chat, and a unified model-runtime error model — with new DeepSeek V4 and
Gemini 3.5 Flash support along the way.

---

## ✨ Highlights

- **More Hetero Agents (OpenClaw / Hermes)** — Create heterogeneous
agents and dispatch them to local or remote devices through the device
gateway, with an execution-target switcher in the composer and
persistent CLI sessions. (#15065, #15179, #15022)
- **iMessage on Desktop** — New iMessage setup and bridge on desktop,
plus bot attachments across every platform. (#15228, #15227, #15029)
- **Skills in the Composer** — Drag skill chips into chat, trigger
installed skills from the slash menu mid-line, and surface project-level
skills in the homogeneous agent runtime. (#15095, #15061, #15110)
- **New Models** — DeepSeek V4 Flash/Pro and Gemini 3.5 Flash across
providers, with thinking params for structured output and chat cost
estimates. (#15031, #15001, #15051, #14876)
- **Agent Runtime Observability** — OpenTelemetry GenAI semantic
conventions plus per-call generation tracing. (#15123, #15124)

---

## 🤖 Agents & Heterogeneous Runtime

- **Platform agent creation** — OpenClaw/Hermes creation UI, device
guard, and remote dispatch backend. (#15065)
- **Execution-target switcher** — Pick local vs remote execution
directly in the composer; device-selection UX with actionable guidance.
(#15179, #15111)
- **CLI hetero dispatch** — OpenClaw/Hermes dispatch with persistent
sessions and a notify protocol. (#15022)
- **Gateway snapshot as source of truth** — Consume the gateway
`uiMessages` snapshot at step boundaries to keep chat state consistent.
(#15153, #15152)
- **Client sub-agent as a normal tool call** — Simplifies the sub-agent
execution path. (#15281)
- **Hermes agent chain** — Implements the Hermes agent chain logic.
(#15189)
- **Device registry** — TRPC endpoints to register, list, update, and
remove devices. (#15299)
- **Desktop device routing** — Route gateway agent runs through `lh
hetero exec`; restore `userId` in gateway dispatch and gate local-system
by execution target. (#15132, #15232)
- **Agent signals** — Anchor agent-signal receipts to messages and
isolate memory-agent messages into a child thread. (#14969, #14921)

---

## 🚀 Onboarding

- **Simplified first screen** — Defer topic creation to first send.
(#15090)
- **Market Agent Picker** — Added as a classic onboarding step, with
template prefetch. (#14980, #15041)
- **Welcome guidance** — Show agent welcome guidance on first run.
(#15098)
- **Mobile** — Adapt agent onboarding UI and restore Classic-step
padding on mobile. (#15019, #15032)
- **Discovery** — Streamline discovery to a single profession question.
(#14987)
- **Analytics** — Track onboarding step events and create-agent modal
source. (#15133, #15028)

---

## 📄 Documents, Pages & Knowledge

- **Thread chat in preview** — Embed thread chat in the document preview
portal. (#15216)
- **Non-markdown rendering** — Render non-markdown docs as a read-only
highlight. (#15272)
- **Multi-select** — Multi-select delete in the document tree. (#15125)
- **Page-agent streaming** — Preview `initPage` streaming arguments.
(#15039)
- **Per-agent topics** — Per-agent topic management page. (#15207)
- **Server-side category** — Derive document category server-side and
drop frontend predicates. (#15076)

---

## 🧩 Skills & Tools

- **Drag skill chips** — Drag skills into chat input and register
agent-document skills. (#15095)
- **Slash menu** — Installed skills appear in the slash menu with a
mid-line trigger. (#15061)
- **Project skills** — Recognize project-level skills in the homogeneous
agent runtime and surface them regardless of active device. (#15110,
#15177)
- **VFS archiving** — Archive oversized tool results to VFS instead of
truncating. (#15074)
- **@localFile mentions** — Drag folders into chat input as `@localFile`
mentions on desktop. (#15071)

---

## 🧠 Model Runtime & Providers

- **Error spec registry** — Unify error codes into a spec + pattern
registry, split `ProviderBizError` into finer codes, classify Cloud-only
codes via a tier digit, and add `DatabasePersistError`. (#15262, #15286,
#15278, #15279)
- **New models** — DeepSeek V4 Flash/Pro (opencode-go) and Gemini 3.5
Flash; DeepSeek V4 Pro on SiliconCloud. (#15031, #15001, #15017, #15267)
- **Structured output** — Thinking params for structured output, Bedrock
structured generation, and DeepSeek `generateObject` tool choice.
(#15051, #15174, #15054)
- **Cost** — Chat cost estimate support; preserve usage cost in custom
streams. (#14876, #15218)

---

## 💬 Chat & User Experience

- **Follow-up chips** — Extend follow-up chip suggestions to general
chat with scene-specific model config. (#15101, #14797)
- **Input drafts** — Persist unsent input drafts across tab switches and
prevent repeated draft restore. (#14992, #15024)
- **Command menu** — Order topic/message search by recency and promote
inline type filters. (#15094, #14986)
- **Zoom HUD** — Show a zoom-level HUD on Cmd +/− and Cmd 0. (#15294)
- **Copy** — Unescape markdown escapes when copying user messages.
(#15253)

---

## 🖥️ Desktop

- **App Nap fix** — Prevent App Nap from dropping the gateway WebSocket
during display sleep. (#14994)
- **File preview** — Preview `.cjs`/`.mjs`/no-extension files instead of
binary fallback and expand `~` when opening local files. (#15168,
#15284)
- **Cross-platform settings** — Open settings via main-window navigation
on Windows/Linux and restore the route after an update restart. (#15036,
#14922)
- **Token refresh** — Prevent frequent logout from token-refresh
retries. (#14928)

---

## 📊 Observability

- **OTel GenAI** — Instrument Agent Runtime with OpenTelemetry GenAI
semantic conventions. (#15123)
- **Generation tracing** — Per-call `llm_generation_tracing` with a
pre-allocated tracingId and recordFeedback router. (#15124, #15146)
- **Error classification** — Persist `ERROR_CODE_SPECS` classification
on operation errors. (#15273)

---

## 🗃️ Database Migrations

- **Batch migrations** — Topic usage stats, push tokens,
`tasks.editor_data`, and document shares. (#15280)
- **Tracing & eval tables** — Add `llm_generation_tracing` and agent
eval experiment tables. (#15126)

> Self-hosted operators should run the database migration (`pnpm
db:migrate`, or restart with auto-migrate enabled) after upgrading. The
changes are additive and backwards-compatible.

---

## 🔒 Security & Reliability

- **Security:** Remove the `getPlaintextCred` tool to prevent plaintext
credential exposure. (#14998)
- **Security:** Prompt account selection for Google OAuth and add
`prompt=consent` to the OIDC authorization URL to fix missing refresh
tokens. (#15234, #15010)
- **Reliability:** Preserve streamed content across a mid-stream cancel.
(#15173)
- **Reliability:** Bound the Redis command timeout and configure the
Anthropic client timeout. (#15091, #15042)
- **Reliability:** Prevent infinite recursion in the assistant chain.
(#15288)

---

## 👥 Contributors

Huge thanks to **15 contributors** who shipped **220 merged PRs** this
cycle.

@AnotiaWang · @sxjeru · @algojogacor · @hardy-one · @arvinxx · @Innei ·
@tjx666 · @lijian · @AmAzing129 · @rdmclin2 · @neko · @cy948 ·
@CanisMinor · @sudongyuer · @rivertwilight

Plus @lobehubbot and renovate[bot] for maintenance.

---

**Full Changelog**: v2.2.0...release/weekly-20260528