feat: add LLM trace diagnostics to session exports #457

Merged
Astro-Han merged 4 commits into dev from codex/i454-llm-trace-export on May 5, 2026

Astro-Han (Owner) commented May 5, 2026

Summary

Adds lightweight LLM trace summaries for assistant model runs and includes them in local session exports.

Why

Session exports currently preserve final messages, parts, and token metadata, but they do not show where an output anomaly occurred: at the request boundary, in the AI SDK normalized stream, in the PawWork processor, or in the persisted message. This PR adds count-only diagnostics so that issue #454 can be diagnosed from the export alone, without raw prompts, raw model output, headers, API keys, or provider chunks.
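To illustrate how count-only diagnostics can localize an anomaly, here is a minimal sketch. The `TraceSketch` shape, its field names, and `firstDivergence` are hypothetical and are not taken from the PR's actual schema; the sketch only assumes that each boundary reports comparable counts.

```typescript
// Hypothetical count-only trace; field names are illustrative, not the PR's schema.
type Counts = { textChars: number; toolCalls: number }

type TraceSketch = {
  request: { messageCount: number; toolCount: number } // request boundary
  stream: Counts // AI SDK normalized stream
  processor: Counts // what the processor assembled
  stored: Counts // persisted message parts
}

type Divergence = "stream->processor" | "processor->stored" | null

function sameCounts(a: Counts, b: Counts): boolean {
  return a.textChars === b.textChars && a.toolCalls === b.toolCalls
}

// Returns the first hop at which the counts stop matching, or null if they agree.
function firstDivergence(trace: TraceSketch): Divergence {
  if (!sameCounts(trace.stream, trace.processor)) return "stream->processor"
  if (!sameCounts(trace.processor, trace.stored)) return "processor->stored"
  return null
}
```

With a trace like this, an export in which the stream and processor agree but the stored counts are zero would point at persistence rather than at the provider.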

Related Issue

Closes #454
Closes #214

Human Review Status

Pending. A human should make the final merge decision after reviewing the final diff and verification evidence.

Review Focus

Please focus on the trace contract and privacy boundary: request summary allowlist, normalized stream event counts, final stored part counts, and whether assistant-message retention is the right persistence layer.
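As a reference point for reviewing the allowlist, a request summary of this kind might look roughly like the sketch below. The allowed keys, the `RequestSummary` shape, and `summarizeRequest` are all assumptions made for illustration; the real allowlist is whatever the diff defines.

```typescript
// Hypothetical allowlist; the real set of permitted fields is defined by the PR, not here.
const ALLOWED_SCALARS = ["model", "temperature", "maxOutputTokens"] as const

type RequestSummary = {
  scalars: Record<string, string | number | boolean>
  messageCount: number
  toolCount: number
}

function summarizeRequest(request: Record<string, unknown>): RequestSummary {
  const scalars: Record<string, string | number | boolean> = {}
  // Copy only allowlisted keys, and only when the value is a plain scalar.
  for (const key of ALLOWED_SCALARS) {
    const value = request[key]
    if (typeof value === "string" || typeof value === "number" || typeof value === "boolean") {
      scalars[key] = value
    }
  }
  const messages = request["messages"]
  const tools = request["tools"]
  return {
    scalars,
    // Messages and tools contribute counts only, never their bodies.
    messageCount: Array.isArray(messages) ? messages.length : 0,
    toolCount: typeof tools === "object" && tools !== null ? Object.keys(tools).length : 0,
  }
}
```

The point of an allowlist over a denylist is that a new request field is excluded by default until someone deliberately permits it.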

Risk Notes

Additive assistant message diagnostics widen the v2 SDK schema and persist one count-only trace per assistant model run. The trace intentionally omits prompts, messages, headers, API keys, raw provider chunks, raw output text, and tool bodies. Generated SDK files changed after bun --cwd packages/sdk/js build. No desktop, packaging, updater, signing, path, shell, or permission behavior is changed.
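A count-only recorder of the kind described here could be sketched as follows. The `StreamEvent` union and the `CountingRecorder` class are hypothetical stand-ins, not the PR's recorder; the sketch assumes only that stream events carry a `type` and that text deltas carry a string.

```typescript
// Hypothetical event union; a real normalized stream has more variants.
type StreamEvent =
  | { type: "text-delta"; text: string }
  | { type: "tool-call"; toolName: string; input: unknown }
  | { type: "finish" }

class CountingRecorder {
  private events: Record<string, number> = {}
  private textChars = 0

  record(event: StreamEvent): void {
    this.events[event.type] = (this.events[event.type] ?? 0) + 1
    // Only the length is kept; the text and tool inputs are never stored.
    if (event.type === "text-delta") this.textChars += event.text.length
  }

  // Returns a copy so callers cannot mutate the recorder's internal tallies.
  snapshot(): { events: Record<string, number>; textChars: number } {
    return { events: { ...this.events }, textChars: this.textChars }
  }
}
```

Because the recorder never holds a reference to the text or tool input, serializing its snapshot cannot leak raw output.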

How To Verify

LLM trace/export tests: 35 passed, 0 failed
Processor integration tests: 15 passed, 0 failed
opencode typecheck: passed
SDK build: passed, v2 SDK generated files updated
SDK typecheck: passed
Diff check: no whitespace errors

Commands run:

bun --cwd packages/opencode test test/session/llm-trace.test.ts test/session/export.test.ts --timeout 30000
bun --cwd packages/opencode test test/session/processor-effect.test.ts --timeout 30000
bun --cwd packages/opencode typecheck
bun --cwd packages/sdk/js build
bun --cwd packages/sdk/js typecheck
git diff --check

Screenshots or Recordings

Not applicable. No visible UI changes.

Checklist

  • Human review status is stated above as pending, approved, or not required
  • I linked the related issue, or stated why there is no issue
  • This PR has type, scope, and priority labels, or I requested maintainer labeling
  • I described the review focus and any meaningful risks
  • I listed the relevant verification steps and the key result for each
  • I did not introduce unrelated refactors, dependencies, generated files, or file changes beyond the stated scope
  • I manually checked visible UI or copy changes when needed, with screenshots or recordings
  • I considered macOS and Windows impact for desktop, packaging, updater, signing, paths, shell, or permissions changes
  • I called out docs, release notes, dependencies, permissions, credentials, deletion behavior, generated content, or local file changes when relevant
  • I reviewed the final diff for unrelated changes and suspicious dependency changes
  • I am targeting dev, and my PR title and commit messages use Conventional Commits in English

Requested maintainer labeling for type, scope, and priority.

coderabbitai Bot (Contributor) commented May 5, 2026

Warning

Rate limit exceeded

@Astro-Han has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 29 minutes and 5 seconds before requesting another review.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 05ae57ee-b773-4a06-8bba-10abf008c6dd

📥 Commits

Reviewing files that changed from the base of the PR and between b1656c3 and 8d2bbb2.

⛔ Files ignored due to path filters (2)
  • packages/sdk/js/src/v2/gen/sdk.gen.ts is excluded by !**/gen/**
  • packages/sdk/js/src/v2/gen/types.gen.ts is excluded by !**/gen/**
📒 Files selected for processing (11)
  • packages/opencode/src/session/export.ts
  • packages/opencode/src/session/llm-trace/index.ts
  • packages/opencode/src/session/llm-trace/recorder.ts
  • packages/opencode/src/session/llm-trace/types.ts
  • packages/opencode/src/session/llm.ts
  • packages/opencode/src/session/message-v2.ts
  • packages/opencode/src/session/processor.ts
  • packages/opencode/test/session/export.test.ts
  • packages/opencode/test/session/llm-trace.test.ts
  • packages/opencode/test/session/llm.test.ts
  • packages/opencode/test/session/processor-effect.test.ts

Astro-Han added the labels enhancement (New feature or request), harness (Model harness, prompts, tool descriptions, and session mechanics), and P2 (Medium priority) on May 5, 2026

gemini-code-assist Bot left a comment

Code Review

This pull request implements a comprehensive LLM tracing system that records request metadata, stream events, and token usage, integrating these traces into session processing and exports. The SDK is also updated with new turn-change endpoints and enhanced question-handling capabilities. Feedback was provided to refine the isEmptyCompletion check in the recorder to include all stored part types for better accuracy.
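The suggested refinement can be pictured with the sketch below. It is not the recorder's code: `StoredPartCounts` and the part-type names used in the usage note are assumptions, and the function shown is only one plausible reading of "include all stored part types".

```typescript
// Hypothetical map from stored part type to count; the real part types come from the message schema.
type StoredPartCounts = Record<string, number>

// A completion counts as empty only if no stored part type has a nonzero count,
// so a newly added part type is covered without changing this check.
function isEmptyCompletion(stored: StoredPartCounts): boolean {
  return Object.values(stored).every((count) => count === 0)
}
```

Under this reading, a run that stored no text but did store a tool part, for example `{ text: 0, reasoning: 0, tool: 1 }`, would not be flagged as empty.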

Comment thread packages/opencode/src/session/llm-trace/recorder.ts
@Astro-Han Astro-Han merged commit 36e2f24 into dev May 5, 2026
20 checks passed
@Astro-Han Astro-Han deleted the codex/i454-llm-trace-export branch May 5, 2026 15:29

Labels

  • enhancement: New feature or request
  • harness: Model harness, prompts, tool descriptions, and session mechanics
  • P2: Medium priority

Projects

None yet

Development

Successfully merging this pull request may close these issues.

  • [Task] Add LLM trace diagnostics to session exports
  • [Feature] Add LLM stream diagnostics to local session export

1 participant