🔨 feat(db): add llm_generation_tracing and agent eval experiment tables#15126
Merged
Conversation
… tables into 0103 Merges the schema work from #14990 with the new llm_generation_tracing table into a single idempotent 0103 migration so the two streams can land together without a migration-number conflict. Also adds user_id (FK + index) to agent_eval_experiment_benchmarks so the junction table is scoped per user, matching agent_eval_run_topics. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
llm_generation_tracing and agent eval experiment tables
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## canary #15126 +/- ##
=========================================
Coverage 70.67% 70.67%
=========================================
Files 3127 3127
Lines 311124 311124
Branches 32784 27185 -5599
=========================================
Hits 219901 219901
Misses 91058 91058
Partials 165 165
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
Merged
arvinxx
added a commit
that referenced
this pull request
May 29, 2026
# 🚀 LobeHub Release (20260528) **Release Date:** May 28, 2026 **Since v2.2.0:** 220 merged PRs · 15 contributors > This cycle brings heterogeneous "platform agents" you can dispatch to local or remote devices, a rebuilt onboarding flow, document-centric chat, and a unified model-runtime error model — with new DeepSeek V4 and Gemini 3.5 Flash support along the way. --- ## ✨ Highlights - **More Hetero Agents (OpenClaw / Hermes)** — Create heterogeneous agents and dispatch them to local or remote devices through the device gateway, with an execution-target switcher in the composer and persistent CLI sessions. (#15065, #15179, #15022) - **iMessage on Desktop** — New iMessage setup and bridge on desktop, plus bot attachments across every platform. (#15228, #15227, #15029) - **Skills in the Composer** — Drag skill chips into chat, trigger installed skills from the slash menu mid-line, and surface project-level skills in the homogeneous agent runtime. (#15095, #15061, #15110) - **New Models** — DeepSeek V4 Flash/Pro and Gemini 3.5 Flash across providers, with thinking params for structured output and chat cost estimates. (#15031, #15001, #15051, #14876) - **Agent Runtime Observability** — OpenTelemetry GenAI semantic conventions plus per-call generation tracing. (#15123, #15124) --- ## 🤖 Agents & Heterogeneous Runtime - **Platform agent creation** — OpenClaw/Hermes creation UI, device guard, and remote dispatch backend. (#15065) - **Execution-target switcher** — Pick local vs remote execution directly in the composer; device-selection UX with actionable guidance. (#15179, #15111) - **CLI hetero dispatch** — OpenClaw/Hermes dispatch with persistent sessions and a notify protocol. (#15022) - **Gateway snapshot as source of truth** — Consume the gateway `uiMessages` snapshot at step boundaries to keep chat state consistent. (#15153, #15152) - **Client sub-agent as a normal tool call** — Simplifies the sub-agent execution path. (#15281) - **Hermes agent chain** — Implements the Hermes agent chain logic. (#15189) - **Device registry** — TRPC endpoints to register, list, update, and remove devices. (#15299) - **Desktop device routing** — Route gateway agent runs through `lh hetero exec`; restore `userId` in gateway dispatch and gate local-system by execution target. (#15132, #15232) - **Agent signals** — Anchor agent-signal receipts to messages and isolate memory-agent messages into a child thread. (#14969, #14921) --- ## 🚀 Onboarding - **Simplified first screen** — Defer topic creation to first send. (#15090) - **Market Agent Picker** — Added as a classic onboarding step, with template prefetch. (#14980, #15041) - **Welcome guidance** — Show agent welcome guidance on first run. (#15098) - **Mobile** — Adapt agent onboarding UI and restore Classic-step padding on mobile. (#15019, #15032) - **Discovery** — Streamline discovery to a single profession question. (#14987) - **Analytics** — Track onboarding step events and create-agent modal source. (#15133, #15028) --- ## 📄 Documents, Pages & Knowledge - **Thread chat in preview** — Embed thread chat in the document preview portal. (#15216) - **Non-markdown rendering** — Render non-markdown docs as a read-only highlight. (#15272) - **Multi-select** — Multi-select delete in the document tree. (#15125) - **Page-agent streaming** — Preview `initPage` streaming arguments. (#15039) - **Per-agent topics** — Per-agent topic management page. (#15207) - **Server-side category** — Derive document category server-side and drop frontend predicates. (#15076) --- ## 🧩 Skills & Tools - **Drag skill chips** — Drag skills into chat input and register agent-document skills. (#15095) - **Slash menu** — Installed skills appear in the slash menu with a mid-line trigger. (#15061) - **Project skills** — Recognize project-level skills in the homogeneous agent runtime and surface them regardless of active device. (#15110, #15177) - **VFS archiving** — Archive oversized tool results to VFS instead of truncating. (#15074) - **@localfile mentions** — Drag folders into chat input as `@localFile` mentions on desktop. (#15071) --- ## 🧠 Model Runtime & Providers - **Error spec registry** — Unify error codes into a spec + pattern registry, split `ProviderBizError` into finer codes, classify Cloud-only codes via a tier digit, and add `DatabasePersistError`. (#15262, #15286, #15278, #15279) - **New models** — DeepSeek V4 Flash/Pro (opencode-go) and Gemini 3.5 Flash; DeepSeek V4 Pro on SiliconCloud. (#15031, #15001, #15017, #15267) - **Structured output** — Thinking params for structured output, Bedrock structured generation, and DeepSeek `generateObject` tool choice. (#15051, #15174, #15054) - **Cost** — Chat cost estimate support; preserve usage cost in custom streams. (#14876, #15218) --- ## 💬 Chat & User Experience - **Follow-up chips** — Extend follow-up chip suggestions to general chat with scene-specific model config. (#15101, #14797) - **Input drafts** — Persist unsent input drafts across tab switches and prevent repeated draft restore. (#14992, #15024) - **Command menu** — Order topic/message search by recency and promote inline type filters. (#15094, #14986) - **Zoom HUD** — Show a zoom-level HUD on Cmd +/− and Cmd 0. (#15294) - **Copy** — Unescape markdown escapes when copying user messages. (#15253) --- ## 🖥️ Desktop - **App Nap fix** — Prevent App Nap from dropping the gateway WebSocket during display sleep. (#14994) - **File preview** — Preview `.cjs`/`.mjs`/no-extension files instead of binary fallback and expand `~` when opening local files. (#15168, #15284) - **Cross-platform settings** — Open settings via main-window navigation on Windows/Linux and restore the route after an update restart. (#15036, #14922) - **Token refresh** — Prevent frequent logout from token-refresh retries. (#14928) --- ## 📊 Observability - **OTel GenAI** — Instrument Agent Runtime with OpenTelemetry GenAI semantic conventions. (#15123) - **Generation tracing** — Per-call `llm_generation_tracing` with a pre-allocated tracingId and recordFeedback router. (#15124, #15146) - **Error classification** — Persist `ERROR_CODE_SPECS` classification on operation errors. (#15273) --- ## 🗃️ Database Migrations - **Batch migrations** — Topic usage stats, push tokens, `tasks.editor_data`, and document shares. (#15280) - **Tracing & eval tables** — Add `llm_generation_tracing` and agent eval experiment tables. (#15126) > Self-hosted operators should run the database migration (`pnpm db:migrate`, or restart with auto-migrate enabled) after upgrading. The changes are additive and backwards-compatible. --- ## 🔒 Security & Reliability - **Security:** Remove the `getPlaintextCred` tool to prevent plaintext credential exposure. (#14998) - **Security:** Prompt account selection for Google OAuth and add `prompt=consent` to the OIDC authorization URL to fix missing refresh tokens. (#15234, #15010) - **Reliability:** Preserve streamed content across a mid-stream cancel. (#15173) - **Reliability:** Bound the Redis command timeout and configure the Anthropic client timeout. (#15091, #15042) - **Reliability:** Prevent infinite recursion in the assistant chain. (#15288) --- ## 👥 Contributors Huge thanks to **15 contributors** who shipped **220 merged PRs** this cycle. @AnotiaWang · @sxjeru · @algojogacor · @hardy-one · @arvinxx · @Innei · @tjx666 · @lijian · @AmAzing129 · @rdmclin2 · @neko · @cy948 · @CanisMinor · @sudongyuer · @rivertwilight Plus @lobehubbot and renovate[bot] for maintenance. --- **Full Changelog**: v2.2.0...release/weekly-20260528
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
💻 Change Type
🔀 Description of Change
把两条互相冲突的 schema 工作合并到一个
0103migration 里,避免迁移号撞车:llm_generation_tracing— 新表,记录generateObject/ 结构化生成调用的 trace 数据(scenario、prompt 版本、输入 hash、用量、反馈信号等)。agent_eval_experiments+agent_eval_experiment_benchmarks— 来自 🔨 chore: add agent eval experiment schema #14990,引入 experiment 概念以及它与 benchmark 的多对多关系。agent_eval_datasets加source_experiment_id、给agent_eval_runs加experiment_id/parent_run_id,并补齐 FK + 索引。agent_eval_experiment_benchmarks上额外加了user_id(FK → users,onDelete cascade)+ 索引,让关联表按用户隔离,与agent_eval_run_topics同款 pattern。外键策略(沿用 #14990 的设计)
agent_eval_experiment_benchmarks.experiment_id→agent_eval_experiments.idON DELETE CASCADEagent_eval_experiment_benchmarks.benchmark_id→agent_eval_benchmarks.idON DELETE CASCADEagent_eval_experiment_benchmarks.user_id→users.idON DELETE CASCADEagent_eval_datasets.source_experiment_id→agent_eval_experiments.idON DELETE SET NULLagent_eval_runs.experiment_id→agent_eval_experiments.idON DELETE NO ACTIONagent_eval_runs.parent_run_id→agent_eval_runs.idON DELETE NO ACTION索引说明
experiment_id不再单独建索引:作为(experiment_id, benchmark_id)复合主键的最左列已被覆盖。benchmark_id/user_id因为不在 PK 最左前缀,单独建 btree 索引。Migration 幂等性
所有 SQL 已包装
IF NOT EXISTS/DROP CONSTRAINT IF EXISTS+ADD CONSTRAINT,可重复执行。这个 PR 与 #14990 共用
0103槽位 —— 任一合并后,另一个需要重新规划迁移号 / 选择性合并。当前预期是把 #14990 替换/吸收进本 PR。🧪 How to Test
bun run db:generate输出一致,无 drift0103migration 通过📝 Additional Information
参见相关 schema:
packages/database/src/schemas/llmGenerationTracing.ts、packages/database/src/schemas/agentEvals.ts。🤖 Generated with Claude Code