✨ feat: per-call llm_generation_tracing observability by arvinxx · Pull Request #15124 · lobehub/lobehub

arvinxx · 2026-05-22T15:15:39Z

Summary

Implements PR-1 (foundation) + PR-2 (interception + service) of LOBE-9462 — per-call observability for every generateObject call. Cloud zero-change; one wiring point in initModelRuntimeFromDB covers all OSS callers.

DB: new llm_generation_tracing table (uuid PK, full single-column index coverage, idempotent migration 0103), LlmGenerationTracingModel with record / updateFeedback / findById / listRecent (userId-scoped).
Package @lobechat/llm-generation-tracing: ITracingStore / FileTracingStore (scenario subfolders + latest.json symlink) / computePromptHash (6-char sha256) / TRACING_SCENARIO_REGISTRY + resolveScenario with explicit scenario override.
S3 store + Service: S3TracingStore (zstd-3, key llm-generation-tracing/{scenario}/{v}-{hash}/{date}/{id}.json.zst). LLMGenerationTracingService inserts DB row first → store blob → patches storage_key; store failures preserve the row with metadata.store_error.
Interception: new onGenerateObjectComplete hook on ModelRuntimeHooks (always fires, success or failure). createLLMGenerationTracingHook + mergeModelRuntimeHooks merged with business hooks in initModelRuntimeFromDB. Defers work via next/server.after() when available, microtask fallback otherwise. Unknown metadata keys (e.g. parent_memory_trace_key) pass through.
Memory backlink: extractor accepts parentMemoryTraceKey option → forwarded to runtime metadata so per-call rows can backlink to the job-level memory trace blob.
Caller fix: followUpAction now passes metadata: { scenario: 'follow_up' } (the only OSS caller previously missing trigger metadata).

Tests: 165/165 pass (5 DB model + 15 package + 5 S3 store + 12 service/hook + 119 ModelRuntime + 9 followUpAction).

Gating

The service is no-op unless explicitly enabled — OSS / self-hosted setups pay nothing for it.

ENABLE_LLM_GENERATION_TRACING_S3=1 → S3 store (requires S3 env)
ENABLE_LLM_GENERATION_TRACING_LOCAL=1 or NODE_ENV=development → FileTracingStore (writes to .llm-generation-tracing/)
Otherwise → no-op

Follow-ups (intentionally NOT in this PR)

agent_signal sub-scenario refinement — skillIntent / feedbackSatisfaction / feedbackDomainAgent / skillManagement currently fold into scenario='agent_signal'. Each caller can opt in to metadata.scenario = 'signal_skill_intent' etc. independently.
Memory job-level trace wiring (orchestrator side) — runtime accepts parentMemoryTraceKey but extract.ts still computes the trace key after the LLM call. Needs upfront key generation to pass through the option.
Feedback tRPC procedure + UI (PR-3) — out of scope for this PR per the design doc's plan.
Cloud 3 callers — confirmed not in OSS repo; cloud-side trigger audit is a separate PR if needed.

Test plan

CI tests pass
Set ENABLE_LLM_GENERATION_TRACING_LOCAL=1 locally, trigger a topic_title / follow_up / memory extraction, confirm .llm-generation-tracing/<scenario>/<v>-<hash>/<file>.json appears and a corresponding DB row exists
Verify no perf regression on generateObject callers (hook is fire-and-forget via after())
Set ENABLE_LLM_GENERATION_TRACING_S3=1 in staging, confirm S3 key layout matches llm-generation-tracing/{scenario}/{v}-{hash}/{date}/{id}.json.zst
Confirm storage_key is null and metadata.store_error populated when S3 is misconfigured (DB row should still land)

Closes LOBE-9462 (PR-1 + PR-2 scope; PR-3 feedback UI is a follow-up).

🤖 Generated with Claude Code

vercel · 2026-05-22T15:15:46Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
lobehub	Ready	Preview, Comment	May 23, 2026 10:07am

sourcery-ai

Sorry @arvinxx, you have reached your weekly rate limit of 500000 diff characters.

Please try again later or upgrade to continue using Sourcery

codecov · 2026-05-22T15:28:16Z

Codecov Report

❌ Patch coverage is 90.41812% with 55 lines in your changes missing coverage. Please review.
✅ Project coverage is 70.76%. Comparing base (8cd03c8) to head (2d1d4c6).
⚠️ Report is 8 commits behind head on canary.

Additional details and impacted files

@@             Coverage Diff             @@
##           canary   #15124       +/-   ##
===========================================
- Coverage   82.45%   70.76%   -11.70%     
===========================================
  Files         310     3142     +2832     
  Lines       23140   312527   +289387     
  Branches     4500    28302    +23802     
===========================================
+ Hits        19080   221145   +202065     
- Misses       3955    91216    +87261     
- Partials      105      166       +61

Flag	Coverage Δ
app	`61.50% <87.22%> (?)`
database	`92.20% <100.00%> (?)`
packages/agent-runtime	`80.48% <ø> (?)`
packages/builtin-tool-lobe-agent	`19.87% <ø> (?)`
packages/context-engine	`84.13% <ø> (?)`
packages/conversation-flow	`91.28% <ø> (?)`
packages/file-loaders	`87.89% <ø> (ø)`
packages/memory-user-memory	`74.99% <66.66%> (?)`
packages/model-bank	`99.99% <ø> (?)`
packages/model-runtime	`83.79% <98.18%> (+0.03%)`	⬆️
packages/prompts	`72.54% <100.00%> (+0.94%)`	⬆️
packages/python-interpreter	`92.90% <ø> (?)`
packages/ssrf-safe-fetch	`0.00% <ø> (?)`
packages/types	`35.09% <100.00%> (?)`
packages/utils	`88.20% <ø> (?)`
packages/web-crawler	`88.08% <ø> (?)`

Flags with carried forward coverage won't be shown. Click here to find out more.

Components	Coverage Δ
Store	`67.94% <ø> (∅)`
Services	`54.49% <ø> (∅)`
Server	`72.17% <92.34%> (∅)`
Libs	`56.42% <ø> (∅)`
Utils	`85.96% <ø> (-7.51%)`	⬇️

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

… (LOBE-9462) Foundation layer for per-call observability of `generateObject` calls. - New Drizzle table `llm_generation_tracing` with identity / context / model / result / usage / storage / feedback / audit columns and full single-column index coverage (Postgres bitmap-scan friendly). Migration 0103 is idempotent (CREATE TABLE/INDEX IF NOT EXISTS) for safe re-runs. - `LlmGenerationTracingModel` with `record` / `updateFeedback` / `findById` / `listRecent`, all userId-scoped to prevent cross-user leaks. - New package `@lobechat/llm-generation-tracing` mirroring agent-tracing's shape: `ITracingStore` interface, `FileTracingStore` (local/dev, scenario subfolders + latest.json symlink), `computePromptHash` (6-char sha256 of systemPrompt + schema), and `TRACING_SCENARIO_REGISTRY` + `resolveScenario` with explicit scenario override. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…generateObject (LOBE-9462) Per-call interception layer — one hook covers all generateObject callers. - New `onGenerateObjectComplete` hook on `ModelRuntimeHooks`: always fires (success or failure) with latency, usage, output/error. Fixes the gap where `onGenerateObjectFinal` only fires when the runtime invokes `onUsage`. - `S3TracingStore` (zstd level 3, key `llm-generation-tracing/{scenario}/{v}-{hash}/{date}/{id}.json.zst`) and `LLMGenerationTracingService` that does DB insert → store.save → patch storage_key. Store failures preserve the row with `metadata.store_error`. - `createLLMGenerationTracingHook` + `mergeModelRuntimeHooks` wired into `initModelRuntimeFromDB`; tracing runs alongside business (billing) hooks via `next/server.after()` when available, microtask fallback otherwise. Unknown metadata keys (e.g. `parent_memory_trace_key`) pass through. - Memory extractor accepts `parentMemoryTraceKey` option for the job-level backlink. Follow-up-action caller given an explicit `scenario: 'follow_up'` metadata override — it was the only OSS caller missing trigger metadata. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

….calls indexing The hook + service tests destructured `mock.calls[0][0]` and accessed nested fields, which tsgo flagged as TS2493 / TS18046 because `vi.fn()` defaults to a zero-arg signature. Add explicit type parameters to the mocks so tsgo can infer the call tuple, and cast `call.payload` at the access point. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

It's a generic utility for composing `ModelRuntimeHooks` instances — same import surface as `ModelRuntime` and the hooks interface — so it belongs alongside them rather than tucked under a server-side consumer. - New `packages/model-runtime/src/core/mergeHooks.ts` exports `mergeModelRuntimeHooks` and is re-exported from the package index. - Move the unit tests to `packages/model-runtime/src/core/mergeHooks.test.ts`, including a new case covering the "a throws → b is skipped" load-bearing semantics. - `src/server/services/llmGenerationTracing/hook.ts` drops the local copy and the consumer (`src/server/modules/ModelRuntime/index.ts`) imports from `@lobechat/model-runtime`. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…ot in a central table `promptVersion` was baked into `TRACING_SCENARIO_REGISTRY`, far from any prompt definition — editing a prompt + forgetting to bump the entry in a completely different file was an obvious foot-gun. - Registry is now `Record<string, string>` mapping trigger → scenario only; it's the stable concern that rarely changes. - `resolveScenario` always passes `promptVersion` through from the caller, defaulting to `UNKNOWN_PROMPT_VERSION` ('v0') when absent. - Each call site declares its own `*_PROMPT_VERSION` constant next to the prompt it describes. `followUpAction` ships the first one: `FOLLOW_UP_PROMPT_VERSION` in `prompts/index.ts`, threaded through `metadata.promptVersion` at the `generateObject` call. Other callers can add the same constant when they next touch their prompts. The 6-char prompt hash on the row still catches forgotten bumps. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…mplete call site Aligns input auto-complete with the FOLLOW_UP_PROMPT_VERSION convention so each prompt iteration is recordable as the chat-side tracing lands. - `INPUT_COMPLETION_PROMPT_VERSION = 'v1.0'` declared next to `chainInputCompletion` — bump together with the prompt body. - `fetchPresetTaskResult` accepts optional `metadata` and forwards it to `getChatCompletion`; the existing chat path already plumbs metadata to `ModelRuntime.chat` options. - `InputEditor` call site passes `{ scenario: 'input_completion', promptVersion }`. Note: `llm_generation_tracing` currently only fires from `onGenerateObjectComplete`. Input completion is a `chat` call, so this metadata is forward-looking until a chat-side tracing hook lands. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…ilence turbopack glob warning Turbopack's static analyzer treats `path.join(root, dyn1, dyn2)` as a multi-segment glob pattern and warned that it could match ~12k files in the project. Compose the relative subdir as a single string first, so `path.join` only sees one dynamic segment. Behavior unchanged — the resulting path is identical. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…for tracing Auto-complete is the first preset-task caller migrated to the structured- output path so it lands in `llm_generation_tracing` via the existing `onGenerateObjectComplete` hook. No new server hook, no global chat-side tracing. - `chainInputCompletion` now returns `{ messages, schema }` with a minimal `{ completion: string }` schema and a stable `INPUT_COMPLETION_SCHEMA_NAME` constant. JSON wrapping costs ~15-30 tokens against a 100-token completion budget — negligible for the observability win. - `StructureOutputSchema` / `StructureOutputParams` accept optional `metadata`; `aiChatRouter.outputJSON` merges caller metadata over the default trigger so `{ scenario, promptVersion, schemaName }` reach `ModelRuntime.generateObject` options unchanged. - `IStructureSchema.description` is now optional to match the zod schema — previously the TS type was stricter than runtime validation accepted. - `InputEditor` switches from `chatService.fetchPresetTaskResult` to `aiChatService.generateJSON`, reading `response.completion`. Streaming is dropped because auto-complete already buffers the full result before inserting; no UX change. - Reverts the unused `metadata` field that was added to `fetchPresetTaskResult` in the previous commit — no current caller needs it now that input completion uses the generateObject path. Bumps `INPUT_COMPLETION_PROMPT_VERSION` to v2.0 because the system prompt gained an "output the completion field" instruction. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…dance into a service Every server-side caller that produces structured output was repeating the same two-step ritual: `initModelRuntimeFromDB(...)` → `runtime.generateObject(payload, { metadata })`. `AiGenerationService` collapses it into one call so future cross-cutting concerns (default metadata, retry, observability hooks) have one place to land. - New `src/server/services/aiGeneration/index.ts` exposes `generateObject<T>(input, options)` and is unit-tested for provider resolution + payload/metadata pass-through. - `aiChatRouter.outputJSON` and `FollowUpActionService.extract` migrated to the service (other callers move organically when next touched). - Drops the unused `keyVaultsPayload` field from `StructureOutputParams` and the placeholder at the InputEditor call site — key vaults are server-resolved from DB, the client never supplies them. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…enerationService via trpc ctx - New `packages/const/src/llmGenerationTracing.ts` exports `TRACING_SCENARIOS` + `TracingScenario` type — the single directory where every known scenario name lives. Adds `@lobechat/const` as a workspace dep on llm-generation- tracing so `TRACING_SCENARIO_REGISTRY` can reference the same literals. - Callers (FollowUpActionService, InputEditor) replace `'follow_up'` / `'input_completion'` string literals with `TRACING_SCENARIOS.FollowUp` / `.InputCompletion`, so a typo or a rename fails the type-check instead of silently drifting on the row. - `AiGenerationService` is now injected into the `aiChatProcedure` ctx middleware alongside `aiChatService`; `outputJSON` consumes it via `ctx.aiGenerationService` instead of new-ing it inside the handler. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…nly storage_key - Add `lt` / `llm-tracing` CLI under @lobechat/llm-generation-tracing with `list` (recent records, --scenario filter, --json) and `inspect` (by tracing_id prefix or latest, --full, --json). - `FileTracingStore.save` now returns `{ key: null }` so dev DB rows leave `storage_key` empty instead of recording a non-resolvable local path; S3 store remains the source of truth for the real key. Add helpers `findByTracingId` / `getLatest` used by the CLI. - Wire `agentId` and `topicId` into `input_completion` tracing metadata from the chat input auto-complete call site. - Default `FileTracingStore` whenever NODE_ENV=development (drop the ENABLE_LLM_GENERATION_TRACING_LOCAL opt-in env var). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Mirror the @lobechat/agent-tracing viewer style: - Inline ANSI color helpers (dim/bold/cyan/magenta/green/yellow/red). - Compact single-line header with id, scenario, version, model, status, time — replaces the multi-line bullet list. - Tree structure with `├─`/`└─` connectors instead of `── section ──` banners. - input arrays render per-message (role + char count + preview) rather than dumping raw JSON. - Small single-key outputs (e.g. `{ completion: "怎么样" }`) collapse to inline `key: "value"`. - `lt list` switches to a colored, properly padded table. Default view stays compact; --full expands system_prompt / input / schema bodies. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…etadata` `options.metadata` was overloaded — half tracing-specific structured fields (scenario / promptVersion / schemaName / agentId / topicId / ...), half free-form jsonb passthrough. Callers couldn't tell which was which, and the inputHint was always auto-extracted (useless when the prompt wraps the user's text in a template). This commit introduces a dedicated `tracing` option: - Add `TracingOptions` to @lobechat/llm-generation-tracing — the typed shape callers import (agentId / topicId / inputHint / scenario / promptVersion / schemaName / systemPrompt / parentTracingId / metadata). - Add loose `tracing?: Record<string, unknown>` to GenerateObjectOptions and StructureOutputParams / StructureOutputSchema so the field flows through the runtime + TRPC. - Tracing hook now reads `context.options.tracing` for structured fields; it still falls back to `metadata.trigger` for the cross-cutting trigger string (ModelRuntime itself uses metadata.trigger for timing logs, so trigger stays on metadata). - Service `record()` accepts an explicit `inputHint`; otherwise falls back to auto-extraction from the first user message. Always truncated. - Free-form jsonb fields move to `tracing.metadata` (was unknown-key passthrough on `metadata`). - Call sites updated: - FollowUpAction now passes `tracing: { scenario, promptVersion, schemaName, topicId }` (previously `metadata`). - InputCompletion now passes `tracing: { agentId, topicId, inputHint: input, scenario, promptVersion, schemaName }` — `inputHint` is the user's actual typed text, not the wrapper prompt's first user message. - `aiChat.outputJSON` router forwards both metadata and tracing. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…'s metadata jsonb `provider` is already a first-class column on the `llm_generation_tracing` row, so auto-stamping it into the `metadata` jsonb column on every call was pure noise. The hook now writes the caller-supplied `tracing.metadata` verbatim — empty/undefined when the caller had nothing to add. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* ✨ feat(database): add llm_generation_tracing schema + tracing package (LOBE-9462) Foundation layer for per-call observability of `generateObject` calls. - New Drizzle table `llm_generation_tracing` with identity / context / model / result / usage / storage / feedback / audit columns and full single-column index coverage (Postgres bitmap-scan friendly). Migration 0103 is idempotent (CREATE TABLE/INDEX IF NOT EXISTS) for safe re-runs. - `LlmGenerationTracingModel` with `record` / `updateFeedback` / `findById` / `listRecent`, all userId-scoped to prevent cross-user leaks. - New package `@lobechat/llm-generation-tracing` mirroring agent-tracing's shape: `ITracingStore` interface, `FileTracingStore` (local/dev, scenario subfolders + latest.json symlink), `computePromptHash` (6-char sha256 of systemPrompt + schema), and `TRACING_SCENARIO_REGISTRY` + `resolveScenario` with explicit scenario override. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * ✨ feat(model-runtime): wire llm_generation_tracing into ModelRuntime.generateObject (LOBE-9462) Per-call interception layer — one hook covers all generateObject callers. - New `onGenerateObjectComplete` hook on `ModelRuntimeHooks`: always fires (success or failure) with latency, usage, output/error. Fixes the gap where `onGenerateObjectFinal` only fires when the runtime invokes `onUsage`. - `S3TracingStore` (zstd level 3, key `llm-generation-tracing/{scenario}/{v}-{hash}/{date}/{id}.json.zst`) and `LLMGenerationTracingService` that does DB insert → store.save → patch storage_key. Store failures preserve the row with `metadata.store_error`. - `createLLMGenerationTracingHook` + `mergeModelRuntimeHooks` wired into `initModelRuntimeFromDB`; tracing runs alongside business (billing) hooks via `next/server.after()` when available, microtask fallback otherwise. Unknown metadata keys (e.g. `parent_memory_trace_key`) pass through. - Memory extractor accepts `parentMemoryTraceKey` option for the job-level backlink. Follow-up-action caller given an explicit `scenario: 'follow_up'` metadata override — it was the only OSS caller missing trigger metadata. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * ✅ test(llm-generation-tracing): type vi.fn mocks so tsgo accepts mock.calls indexing The hook + service tests destructured `mock.calls[0][0]` and accessed nested fields, which tsgo flagged as TS2493 / TS18046 because `vi.fn()` defaults to a zero-arg signature. Add explicit type parameters to the mocks so tsgo can infer the call tuple, and cast `call.payload` at the access point. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * ♻️ refactor(model-runtime): move mergeModelRuntimeHooks into the package It's a generic utility for composing `ModelRuntimeHooks` instances — same import surface as `ModelRuntime` and the hooks interface — so it belongs alongside them rather than tucked under a server-side consumer. - New `packages/model-runtime/src/core/mergeHooks.ts` exports `mergeModelRuntimeHooks` and is re-exported from the package index. - Move the unit tests to `packages/model-runtime/src/core/mergeHooks.test.ts`, including a new case covering the "a throws → b is skipped" load-bearing semantics. - `src/server/services/llmGenerationTracing/hook.ts` drops the local copy and the consumer (`src/server/modules/ModelRuntime/index.ts`) imports from `@lobechat/model-runtime`. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * ♻️ refactor(llm-generation-tracing): version lives with the prompt, not in a central table `promptVersion` was baked into `TRACING_SCENARIO_REGISTRY`, far from any prompt definition — editing a prompt + forgetting to bump the entry in a completely different file was an obvious foot-gun. - Registry is now `Record<string, string>` mapping trigger → scenario only; it's the stable concern that rarely changes. - `resolveScenario` always passes `promptVersion` through from the caller, defaulting to `UNKNOWN_PROMPT_VERSION` ('v0') when absent. - Each call site declares its own `*_PROMPT_VERSION` constant next to the prompt it describes. `followUpAction` ships the first one: `FOLLOW_UP_PROMPT_VERSION` in `prompts/index.ts`, threaded through `metadata.promptVersion` at the `generateObject` call. Other callers can add the same constant when they next touch their prompts. The 6-char prompt hash on the row still catches forgotten bumps. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * ✨ feat(input-completion): wire prompt-version metadata at the auto-complete call site Aligns input auto-complete with the FOLLOW_UP_PROMPT_VERSION convention so each prompt iteration is recordable as the chat-side tracing lands. - `INPUT_COMPLETION_PROMPT_VERSION = 'v1.0'` declared next to `chainInputCompletion` — bump together with the prompt body. - `fetchPresetTaskResult` accepts optional `metadata` and forwards it to `getChatCompletion`; the existing chat path already plumbs metadata to `ModelRuntime.chat` options. - `InputEditor` call site passes `{ scenario: 'input_completion', promptVersion }`. Note: `llm_generation_tracing` currently only fires from `onGenerateObjectComplete`. Input completion is a `chat` call, so this metadata is forward-looking until a chat-side tracing hook lands. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * 🐛 fix(llm-generation-tracing): collapse bucketDir path.join args to silence turbopack glob warning Turbopack's static analyzer treats `path.join(root, dyn1, dyn2)` as a multi-segment glob pattern and warned that it could match ~12k files in the project. Compose the relative subdir as a single string first, so `path.join` only sees one dynamic segment. Behavior unchanged — the resulting path is identical. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * ✨ feat(input-completion): route auto-complete through generateObject for tracing Auto-complete is the first preset-task caller migrated to the structured- output path so it lands in `llm_generation_tracing` via the existing `onGenerateObjectComplete` hook. No new server hook, no global chat-side tracing. - `chainInputCompletion` now returns `{ messages, schema }` with a minimal `{ completion: string }` schema and a stable `INPUT_COMPLETION_SCHEMA_NAME` constant. JSON wrapping costs ~15-30 tokens against a 100-token completion budget — negligible for the observability win. - `StructureOutputSchema` / `StructureOutputParams` accept optional `metadata`; `aiChatRouter.outputJSON` merges caller metadata over the default trigger so `{ scenario, promptVersion, schemaName }` reach `ModelRuntime.generateObject` options unchanged. - `IStructureSchema.description` is now optional to match the zod schema — previously the TS type was stricter than runtime validation accepted. - `InputEditor` switches from `chatService.fetchPresetTaskResult` to `aiChatService.generateJSON`, reading `response.completion`. Streaming is dropped because auto-complete already buffers the full result before inserting; no UX change. - Reverts the unused `metadata` field that was added to `fetchPresetTaskResult` in the previous commit — no current caller needs it now that input completion uses the generateObject path. Bumps `INPUT_COMPLETION_PROMPT_VERSION` to v2.0 because the system prompt gained an "output the completion field" instruction. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * ♻️ refactor(aiGeneration): extract the runtime-init + generateObject dance into a service Every server-side caller that produces structured output was repeating the same two-step ritual: `initModelRuntimeFromDB(...)` → `runtime.generateObject(payload, { metadata })`. `AiGenerationService` collapses it into one call so future cross-cutting concerns (default metadata, retry, observability hooks) have one place to land. - New `src/server/services/aiGeneration/index.ts` exposes `generateObject<T>(input, options)` and is unit-tested for provider resolution + payload/metadata pass-through. - `aiChatRouter.outputJSON` and `FollowUpActionService.extract` migrated to the service (other callers move organically when next touched). - Drops the unused `keyVaultsPayload` field from `StructureOutputParams` and the placeholder at the InputEditor call site — key vaults are server-resolved from DB, the client never supplies them. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * ♻️ refactor(tracing): centralize TRACING_SCENARIOS const + inject AiGenerationService via trpc ctx - New `packages/const/src/llmGenerationTracing.ts` exports `TRACING_SCENARIOS` + `TracingScenario` type — the single directory where every known scenario name lives. Adds `@lobechat/const` as a workspace dep on llm-generation- tracing so `TRACING_SCENARIO_REGISTRY` can reference the same literals. - Callers (FollowUpActionService, InputEditor) replace `'follow_up'` / `'input_completion'` string literals with `TRACING_SCENARIOS.FollowUp` / `.InputCompletion`, so a typo or a rename fails the type-check instead of silently drifting on the row. - `AiGenerationService` is now injected into the `aiChatProcedure` ctx middleware alongside `aiChatService`; `outputJSON` consumes it via `ctx.aiGenerationService` instead of new-ing it inside the handler. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * ✨ feat(llm-generation-tracing): add lt/llm-tracing CLI + drop local-only storage_key - Add `lt` / `llm-tracing` CLI under @lobechat/llm-generation-tracing with `list` (recent records, --scenario filter, --json) and `inspect` (by tracing_id prefix or latest, --full, --json). - `FileTracingStore.save` now returns `{ key: null }` so dev DB rows leave `storage_key` empty instead of recording a non-resolvable local path; S3 store remains the source of truth for the real key. Add helpers `findByTracingId` / `getLatest` used by the CLI. - Wire `agentId` and `topicId` into `input_completion` tracing metadata from the chat input auto-complete call site. - Default `FileTracingStore` whenever NODE_ENV=development (drop the ENABLE_LLM_GENERATION_TRACING_LOCAL opt-in env var). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * 💄 style(llm-generation-tracing): prettier CLI output (tree + colors) Mirror the @lobechat/agent-tracing viewer style: - Inline ANSI color helpers (dim/bold/cyan/magenta/green/yellow/red). - Compact single-line header with id, scenario, version, model, status, time — replaces the multi-line bullet list. - Tree structure with `├─`/`└─` connectors instead of `── section ──` banners. - input arrays render per-message (role + char count + preview) rather than dumping raw JSON. - Small single-key outputs (e.g. `{ completion: "怎么样" }`) collapse to inline `key: "value"`. - `lt list` switches to a colored, properly padded table. Default view stays compact; --full expands system_prompt / input / schema bodies. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * ♻️ refactor(llm-generation-tracing): split `tracing` config out of `metadata` `options.metadata` was overloaded — half tracing-specific structured fields (scenario / promptVersion / schemaName / agentId / topicId / ...), half free-form jsonb passthrough. Callers couldn't tell which was which, and the inputHint was always auto-extracted (useless when the prompt wraps the user's text in a template). This commit introduces a dedicated `tracing` option: - Add `TracingOptions` to @lobechat/llm-generation-tracing — the typed shape callers import (agentId / topicId / inputHint / scenario / promptVersion / schemaName / systemPrompt / parentTracingId / metadata). - Add loose `tracing?: Record<string, unknown>` to GenerateObjectOptions and StructureOutputParams / StructureOutputSchema so the field flows through the runtime + TRPC. - Tracing hook now reads `context.options.tracing` for structured fields; it still falls back to `metadata.trigger` for the cross-cutting trigger string (ModelRuntime itself uses metadata.trigger for timing logs, so trigger stays on metadata). - Service `record()` accepts an explicit `inputHint`; otherwise falls back to auto-extraction from the first user message. Always truncated. - Free-form jsonb fields move to `tracing.metadata` (was unknown-key passthrough on `metadata`). - Call sites updated: - FollowUpAction now passes `tracing: { scenario, promptVersion, schemaName, topicId }` (previously `metadata`). - InputCompletion now passes `tracing: { agentId, topicId, inputHint: input, scenario, promptVersion, schemaName }` — `inputHint` is the user's actual typed text, not the wrapper prompt's first user message. - `aiChat.outputJSON` router forwards both metadata and tracing. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * Update inputCompletion.ts * 🐛 fix(llm-generation-tracing): stop duplicating provider into the row's metadata jsonb `provider` is already a first-class column on the `llm_generation_tracing` row, so auto-stamping it into the `metadata` jsonb column on every call was pure noise. The hook now writes the caller-supplied `tracing.metadata` verbatim — empty/undefined when the caller had nothing to add. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

@localfile

# 🚀 LobeHub Release (20260528) **Release Date:** May 28, 2026 **Since v2.2.0:** 220 merged PRs · 15 contributors > This cycle brings heterogeneous "platform agents" you can dispatch to local or remote devices, a rebuilt onboarding flow, document-centric chat, and a unified model-runtime error model — with new DeepSeek V4 and Gemini 3.5 Flash support along the way. --- ## ✨ Highlights - **More Hetero Agents (OpenClaw / Hermes)** — Create heterogeneous agents and dispatch them to local or remote devices through the device gateway, with an execution-target switcher in the composer and persistent CLI sessions. (#15065, #15179, #15022) - **iMessage on Desktop** — New iMessage setup and bridge on desktop, plus bot attachments across every platform. (#15228, #15227, #15029) - **Skills in the Composer** — Drag skill chips into chat, trigger installed skills from the slash menu mid-line, and surface project-level skills in the homogeneous agent runtime. (#15095, #15061, #15110) - **New Models** — DeepSeek V4 Flash/Pro and Gemini 3.5 Flash across providers, with thinking params for structured output and chat cost estimates. (#15031, #15001, #15051, #14876) - **Agent Runtime Observability** — OpenTelemetry GenAI semantic conventions plus per-call generation tracing. (#15123, #15124) --- ## 🤖 Agents & Heterogeneous Runtime - **Platform agent creation** — OpenClaw/Hermes creation UI, device guard, and remote dispatch backend. (#15065) - **Execution-target switcher** — Pick local vs remote execution directly in the composer; device-selection UX with actionable guidance. (#15179, #15111) - **CLI hetero dispatch** — OpenClaw/Hermes dispatch with persistent sessions and a notify protocol. (#15022) - **Gateway snapshot as source of truth** — Consume the gateway `uiMessages` snapshot at step boundaries to keep chat state consistent. (#15153, #15152) - **Client sub-agent as a normal tool call** — Simplifies the sub-agent execution path. (#15281) - **Hermes agent chain** — Implements the Hermes agent chain logic. (#15189) - **Device registry** — TRPC endpoints to register, list, update, and remove devices. (#15299) - **Desktop device routing** — Route gateway agent runs through `lh hetero exec`; restore `userId` in gateway dispatch and gate local-system by execution target. (#15132, #15232) - **Agent signals** — Anchor agent-signal receipts to messages and isolate memory-agent messages into a child thread. (#14969, #14921) --- ## 🚀 Onboarding - **Simplified first screen** — Defer topic creation to first send. (#15090) - **Market Agent Picker** — Added as a classic onboarding step, with template prefetch. (#14980, #15041) - **Welcome guidance** — Show agent welcome guidance on first run. (#15098) - **Mobile** — Adapt agent onboarding UI and restore Classic-step padding on mobile. (#15019, #15032) - **Discovery** — Streamline discovery to a single profession question. (#14987) - **Analytics** — Track onboarding step events and create-agent modal source. (#15133, #15028) --- ## 📄 Documents, Pages & Knowledge - **Thread chat in preview** — Embed thread chat in the document preview portal. (#15216) - **Non-markdown rendering** — Render non-markdown docs as a read-only highlight. (#15272) - **Multi-select** — Multi-select delete in the document tree. (#15125) - **Page-agent streaming** — Preview `initPage` streaming arguments. (#15039) - **Per-agent topics** — Per-agent topic management page. (#15207) - **Server-side category** — Derive document category server-side and drop frontend predicates. (#15076) --- ## 🧩 Skills & Tools - **Drag skill chips** — Drag skills into chat input and register agent-document skills. (#15095) - **Slash menu** — Installed skills appear in the slash menu with a mid-line trigger. (#15061) - **Project skills** — Recognize project-level skills in the homogeneous agent runtime and surface them regardless of active device. (#15110, #15177) - **VFS archiving** — Archive oversized tool results to VFS instead of truncating. (#15074) - **@localfile mentions** — Drag folders into chat input as `@localFile` mentions on desktop. (#15071) --- ## 🧠 Model Runtime & Providers - **Error spec registry** — Unify error codes into a spec + pattern registry, split `ProviderBizError` into finer codes, classify Cloud-only codes via a tier digit, and add `DatabasePersistError`. (#15262, #15286, #15278, #15279) - **New models** — DeepSeek V4 Flash/Pro (opencode-go) and Gemini 3.5 Flash; DeepSeek V4 Pro on SiliconCloud. (#15031, #15001, #15017, #15267) - **Structured output** — Thinking params for structured output, Bedrock structured generation, and DeepSeek `generateObject` tool choice. (#15051, #15174, #15054) - **Cost** — Chat cost estimate support; preserve usage cost in custom streams. (#14876, #15218) --- ## 💬 Chat & User Experience - **Follow-up chips** — Extend follow-up chip suggestions to general chat with scene-specific model config. (#15101, #14797) - **Input drafts** — Persist unsent input drafts across tab switches and prevent repeated draft restore. (#14992, #15024) - **Command menu** — Order topic/message search by recency and promote inline type filters. (#15094, #14986) - **Zoom HUD** — Show a zoom-level HUD on Cmd +/− and Cmd 0. (#15294) - **Copy** — Unescape markdown escapes when copying user messages. (#15253) --- ## 🖥️ Desktop - **App Nap fix** — Prevent App Nap from dropping the gateway WebSocket during display sleep. (#14994) - **File preview** — Preview `.cjs`/`.mjs`/no-extension files instead of binary fallback and expand `~` when opening local files. (#15168, #15284) - **Cross-platform settings** — Open settings via main-window navigation on Windows/Linux and restore the route after an update restart. (#15036, #14922) - **Token refresh** — Prevent frequent logout from token-refresh retries. (#14928) --- ## 📊 Observability - **OTel GenAI** — Instrument Agent Runtime with OpenTelemetry GenAI semantic conventions. (#15123) - **Generation tracing** — Per-call `llm_generation_tracing` with a pre-allocated tracingId and recordFeedback router. (#15124, #15146) - **Error classification** — Persist `ERROR_CODE_SPECS` classification on operation errors. (#15273) --- ## 🗃️ Database Migrations - **Batch migrations** — Topic usage stats, push tokens, `tasks.editor_data`, and document shares. (#15280) - **Tracing & eval tables** — Add `llm_generation_tracing` and agent eval experiment tables. (#15126) > Self-hosted operators should run the database migration (`pnpm db:migrate`, or restart with auto-migrate enabled) after upgrading. The changes are additive and backwards-compatible. --- ## 🔒 Security & Reliability - **Security:** Remove the `getPlaintextCred` tool to prevent plaintext credential exposure. (#14998) - **Security:** Prompt account selection for Google OAuth and add `prompt=consent` to the OIDC authorization URL to fix missing refresh tokens. (#15234, #15010) - **Reliability:** Preserve streamed content across a mid-stream cancel. (#15173) - **Reliability:** Bound the Redis command timeout and configure the Anthropic client timeout. (#15091, #15042) - **Reliability:** Prevent infinite recursion in the assistant chain. (#15288) --- ## 👥 Contributors Huge thanks to **15 contributors** who shipped **220 merged PRs** this cycle. @AnotiaWang · @sxjeru · @algojogacor · @hardy-one · @arvinxx · @Innei · @tjx666 · @lijian · @AmAzing129 · @rdmclin2 · @neko · @cy948 · @CanisMinor · @sudongyuer · @rivertwilight Plus @lobehubbot and renovate[bot] for maintenance. --- **Full Changelog**: v2.2.0...release/weekly-20260528

arvinxx requested review from nekomeowww and tjx666 as code owners May 22, 2026 15:15

dosubot Bot added the size:XL This PR changes 500-999 lines, ignoring generated files. label May 22, 2026

sourcery-ai Bot reviewed May 22, 2026

View reviewed changes

vercel Bot deployed to Preview May 22, 2026 15:39 View deployment

arvinxx force-pushed the feat/lobe-9462-llm-generation-tracing branch from 9e50f12 to 21549cf Compare May 22, 2026 16:09

dosubot Bot added size:XXL This PR changes 1000+ lines, ignoring generated files. and removed size:XL This PR changes 500-999 lines, ignoring generated files. labels May 22, 2026

vercel Bot deployed to Preview May 22, 2026 16:30 View deployment

arvinxx changed the title ~~✨ feat: per-call llm_generation_tracing observability (LOBE-9462)~~ ✨ feat: per-call llm_generation_tracing observability May 22, 2026

dosubot Bot added size:XL This PR changes 500-999 lines, ignoring generated files. and removed size:XXL This PR changes 1000+ lines, ignoring generated files. labels May 22, 2026

arvinxx force-pushed the feat/lobe-9462-llm-generation-tracing branch from d09fbbe to ff51f08 Compare May 22, 2026 16:55

vercel Bot deployed to Preview May 22, 2026 17:01 View deployment

arvinxx and others added 6 commits May 23, 2026 01:06

arvinxx force-pushed the feat/lobe-9462-llm-generation-tracing branch from ff51f08 to d9bf1a4 Compare May 22, 2026 17:06

vercel Bot deployed to Preview May 22, 2026 17:37 View deployment

arvinxx and others added 3 commits May 23, 2026 15:45

vercel Bot deployed to Preview May 23, 2026 08:10 View deployment

dosubot Bot added size:XXL This PR changes 1000+ lines, ignoring generated files. and removed size:XL This PR changes 500-999 lines, ignoring generated files. labels May 23, 2026

vercel Bot deployed to Preview May 23, 2026 09:32 View deployment

dosubot Bot added size:XL This PR changes 500-999 lines, ignoring generated files. and removed size:XXL This PR changes 1000+ lines, ignoring generated files. labels May 23, 2026

arvinxx and others added 2 commits May 23, 2026 17:50

Update inputCompletion.ts

449dc56

vercel Bot deployed to Preview May 23, 2026 10:07 View deployment

arvinxx merged commit cce1491 into canary May 23, 2026
35 checks passed

arvinxx deleted the feat/lobe-9462-llm-generation-tracing branch May 23, 2026 10:14

arvinxx mentioned this pull request May 23, 2026

✨ feat(llm-generation-tracing): pre-allocate tracingId + recordFeedback router #15146

Merged

3 tasks

arvinxx mentioned this pull request May 28, 2026

🚀 release: 20260528 #15302

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

✨ feat: per-call llm_generation_tracing observability#15124

✨ feat: per-call llm_generation_tracing observability#15124
arvinxx merged 15 commits into
canaryfrom
feat/lobe-9462-llm-generation-tracing

arvinxx commented May 22, 2026

Uh oh!

vercel Bot commented May 22, 2026 •

edited

Loading

Uh oh!

sourcery-ai Bot left a comment

Uh oh!

codecov Bot commented May 22, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

arvinxx commented May 22, 2026

Summary

Gating

Follow-ups (intentionally NOT in this PR)

Test plan

Uh oh!

vercel Bot commented May 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

sourcery-ai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

codecov Bot commented May 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

vercel Bot commented May 22, 2026 •

edited

Loading

codecov Bot commented May 22, 2026 •

edited

Loading