Skip to content

chore: add CodeRabbit config#2

Merged
mabry1985 merged 9 commits into
mainfrom
feat/langfuse-tracing
Apr 2, 2026
Merged

chore: add CodeRabbit config#2
mabry1985 merged 9 commits into
mainfrom
feat/langfuse-tracing

Conversation

@mabry1985

@mabry1985 mabry1985 commented Apr 2, 2026

Copy link
Copy Markdown

Summary

  • .coderabbit.yaml: chill review profile, auto-review enabled on PRs targeting main
  • Branch protection on main already applied via API (requires Lint + CodeQL before merge)

Install the CodeRabbit GitHub app at https://github.com/apps/coderabbit-ai if not already done.

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Documentation

    • Updated environment configuration guidance for local development setups.
  • Improvements

    • Enhanced telemetry logging with comprehensive token usage tracking and operation timing metrics for AI interactions.
    • Improved API provider compatibility by filtering unnecessary tool-related messages.
  • Chores

    • Added code review automation configuration for workflow efficiency.
    • Updated build artifact exclusions and cleaned up linting directives.

GitHub CI and others added 3 commits April 1, 2026 22:57
Add full OTel span instrumentation so setting LANGFUSE_PUBLIC_KEY +
LANGFUSE_SECRET_KEY gives end-to-end trace visibility in Langfuse.

- LLM spans: all 3 providers (OpenAI-compat, Anthropic, Gemini) emit
  gen_ai spans with token usage attrs (input/output/total) for cost tracking
- Turn hierarchy: per-turn root span in client.ts with context propagation
  so LLM/tool/agent spans nest correctly in Langfuse trace view
- Tool spans: coreToolScheduler wraps every tool execution in a child span
  with name, type, decision, duration, and error attributes
- Agent spans: agent.ts wraps subagent execution (foreground + background)
  with full lifecycle coverage
- Content logging: prompt/response span events gated by telemetryLogPrompts
- Tests: 26 new tests covering Langfuse activation and turn span lifecycle
- Docs: README Observability section + AGENTS.md config note

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- .coderabbit.yaml: chill profile, auto-review on PRs targeting main
- main branch now requires Lint + CodeQL to pass before merge

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Apr 2, 2026

Copy link
Copy Markdown

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Configuration and documentation updates are added: CodeRabbit YAML for review automation, .gitignore rule for benchmark outputs, and Langfuse environment file guidance in README. The OpenAI pipeline is enhanced with streaming telemetry collection, improved span lifecycle management, and request validation. ESLint suppressions are removed.

Changes

Cohort / File(s) Summary
Configuration & Build Files
.coderabbit.yaml, .gitignore
New CodeRabbit configuration with review automation and profile settings; added jobs/ directory to gitignore for benchmark outputs.
Documentation
README.md
Added guidance on using ~/.proto/.env for Langfuse configuration instead of project-level .env files.
Telemetry & Streaming Pipeline
packages/core/src/core/openaiContentGenerator/pipeline.ts
Enhanced streaming pipeline with completion text accumulation, comprehensive OpenTelemetry span instrumentation (duration, token usage, model, completion events), improved span lifecycle management (deferred ending for streaming, immediate for non-streaming), request validation to filter tool messages when tools unavailable, and updated error handling with explicit span status and termination.
Code Cleanup
packages/core/src/utils/schemaValidator.ts, packages/webui/src/index.ts
Removed ESLint eslint-disable-next-line import/no-internal-modules suppression comments.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Description check ⚠️ Warning The description is incomplete relative to the template; it lacks TLDR details about all changes, Screenshots/Demo section, Dive Deeper discussion, Reviewer Test Plan, and Testing Matrix. Only Summary and app installation note are provided. Expand the description to include a comprehensive TLDR covering all file changes (README, pipeline.ts, schemaValidator.ts, index.ts, .gitignore), provide 'N/A — no user-facing change' for Screenshots, add Dive Deeper context, Reviewer Test Plan, and complete the Testing Matrix.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title 'chore: add CodeRabbit config' clearly and concisely describes the main change—adding a CodeRabbit configuration file. It accurately reflects the primary modification in the changeset.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/langfuse-tracing

Comment @coderabbitai help to get the list of available commands and usage tips.

GitHub CI and others added 2 commits April 1, 2026 23:07
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 8

🧹 Nitpick comments (2)
.gitignore (1)

2-2: Anchor the ignore rule to repo root to avoid accidental overmatching.

At Line 2, jobs/ ignores any directory named jobs anywhere in the tree. If this is meant only for root benchmark output, prefer a root-anchored rule.

Proposed tweak
- jobs/
+ /jobs/
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.gitignore at line 2, The .gitignore currently contains the pattern "jobs/"
which matches any directory named jobs anywhere in the repo; change this to a
root-anchored rule by replacing the pattern with "/jobs/" so it only ignores the
jobs directory at the repository root (update the existing "jobs/" entry to
"/jobs/").
packages/core/src/core/openaiContentGenerator/pipeline.ts (1)

131-132: Cap the streamed completion buffer at the truncation limit.

Line 291 only emits 10k characters, but completionParts keeps the full streamed response in memory until the end. For long outputs that creates an unnecessary second copy of the completion just for telemetry logging. Stop appending once the preview buffer reaches the event limit.

Also applies to: 166-169, 289-297

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/core/openaiContentGenerator/pipeline.ts` around lines 131 -
132, The streamed completion buffer completionParts currently accumulates the
entire response causing a second full copy in memory; limit its growth by
capping appended text to the telemetry preview/truncation limit (e.g., 10_000
chars): before pushing/concatenating each incoming chunk in the pipeline handler
(and the similar places referenced where completionParts is appended), check the
current accumulated length and only append up to (limit - currentLength)
characters from the chunk (or skip appending if already at/above the limit);
apply the same guard to all occurrences that build completionParts so the
preview buffer never exceeds the event truncation limit.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/core/src/core/client.ts`:
- Around line 964-972: The recursion in sendMessageStream forwards the original
options and causes a continuation to start a new root turn span (startTurnSpan)
even when an active turn for the same prompt_id already exists (ownsTurnSpan ===
true); update the continuation path so it reuses the existing active turn span
instead of starting a new one by either (A) guarding startTurnSpan with the
active-turn helper (check for an active turn/span for prompt_id before calling
startTurnSpan and only set ownsTurnSpan when you actually created the span), or
(B) mark the recursive continuation as a non-owning internal continuation (pass
a flag in options to indicate non-owning continuation so startTurnSpan is
skipped), and ensure ownsTurnSpan/endTurnSpan logic only runs when the span was
created by this invocation.
- Around line 477-485: Start the turn span with
startTurnSpan(this.config.getSessionId(), prompt_id) immediately before entering
the generator logic and ensure endTurnSpan() is called inside a finally block so
the turn span is always closed regardless of errors or early exits; update the
code around startTurnSpan/endTurnSpan in the generator to wrap the main body in
try { ... } finally { endTurnSpan() }. Also modify the recursive continuation
call that currently reuses the original options (the call near the recursion
around the symbol options/prompt continuation) to pass an explicit non-root type
such as { type: SendMessageType.Hook } (or another non-root SendMessageType)
instead of the original options so recursive continuations reuse the active span
instead of attempting to start a second root span for the same prompt_id.

In `@packages/core/src/core/coreToolScheduler.ts`:
- Around line 1477-1481: Record the tool duration on all non-success exit paths
by setting the same 'tool.duration_ms' attribute on the toolSpan before calling
setStatus or setAttribute for errors/cancellations; locate the span handling
code around toolSpan (where you currently call toolSpan.setStatus and
toolSpan.setAttribute('error.message', ...)) and compute the elapsed ms the same
way as the success branch, then call toolSpan.setAttribute('tool.duration_ms',
durationMs) for the cancelled, hook-blocked, and error branches (also apply
identical additions at the other occurrences around the file noted in the
review: the blocks near lines handling similar toolSpan exits at the other five
locations).

In `@packages/core/src/core/geminiContentGenerator/geminiContentGenerator.ts`:
- Around line 179-188: The code currently emits raw prompt/completion content
when logPrompts is truthy (see logPrompts, span.addEvent and
finalRequest.contents); update the implementation to require an explicit opt-in
flag (e.g., a clearly named allowTelemetryRawPrompts or
explicitTelemetryConsent) before serializing and adding span events, and ensure
the default for getTelemetryLogPromptsEnabled() is false so enabling Langfuse
does not automatically export sensitive payloads; apply the same change pattern
to the other locations referenced (lines ~196-208, ~248-257, ~303-312) so all
span.addEvent calls that include raw prompt/completion content are gated behind
the explicit opt-in flag.

In `@packages/core/src/core/openaiContentGenerator/pipeline.ts`:
- Around line 266-300: The OTel span attached to context.span in
processStreamWithLogging (used with lastResponse and completionParts) can remain
open if the consumer stops iterating early; move all span cleanup (setting final
attributes, adding events from completionParts, setting status) into a finally
block so context.span.end() is always executed regardless of normal completion,
error, or early generator termination, and ensure any existing
span.setAttributes / addEvent calls run before ending the span inside that
finally.

In `@packages/core/src/telemetry/turnSpanContext.ts`:
- Around line 21-23: The module-global mutable state activeTurnSpan and
activeTurnContext in turnSpanContext.ts causes cross-request races when
startTurnSpan() / endTurnSpan() are called concurrently; replace this
process-global singleton with per-execution storage (e.g. Node's
AsyncLocalStorage or explicit context objects threaded through calls) so each
turn gets its own Span and OTel Context; update startTurnSpan(), endTurnSpan(),
and any getters to read/write from the per-execution store (or accept/return a
context object) instead of using activeTurnSpan/activeTurnContext to prevent one
turn from stealing or ending another turn's span.

In `@packages/core/src/tools/agent.ts`:
- Around line 675-680: Background agents are currently passed the request-scoped
signal (the call to subagent.execute(contextState, signal)), which causes
background work to be cancelled when the originating request aborts; instead,
create a new, independent AbortController for background jobs (e.g., const
bgController = new AbortController()) and pass bgController.signal into
subagent.execute(contextState, bgController.signal), ensuring you do not forward
the request-scoped signal. Also wire bgController.abort() to whatever
agent-level shutdown lifecycle you already have (so background tasks can be
cancelled on explicit shutdown) and ensure you clean up the controller when the
subagent finishes; update any places constructing the background job (the
run_in_background / subagent.execute call) to use this independent controller.

In `@README.md`:
- Around line 164-174: Clarify that only LANGFUSE_PUBLIC_KEY and
LANGFUSE_SECRET_KEY are required (LANGFUSE_BASE_URL is optional) and update the
code fence to specify the language; change the prose from "Set three environment
variables" to "Set the following environment variables (BASE_URL is optional)"
and add the bash language tag to the fenced block so it becomes ```bash with the
three env vars LANGFUSE_PUBLIC_KEY, LANGFUSE_SECRET_KEY, and LANGFUSE_BASE_URL
shown (noting BASE_URL is optional).

---

Nitpick comments:
In @.gitignore:
- Line 2: The .gitignore currently contains the pattern "jobs/" which matches
any directory named jobs anywhere in the repo; change this to a root-anchored
rule by replacing the pattern with "/jobs/" so it only ignores the jobs
directory at the repository root (update the existing "jobs/" entry to
"/jobs/").

In `@packages/core/src/core/openaiContentGenerator/pipeline.ts`:
- Around line 131-132: The streamed completion buffer completionParts currently
accumulates the entire response causing a second full copy in memory; limit its
growth by capping appended text to the telemetry preview/truncation limit (e.g.,
10_000 chars): before pushing/concatenating each incoming chunk in the pipeline
handler (and the similar places referenced where completionParts is appended),
check the current accumulated length and only append up to (limit -
currentLength) characters from the chunk (or skip appending if already at/above
the limit); apply the same guard to all occurrences that build completionParts
so the preview buffer never exceeds the event truncation limit.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 72e77ed5-7c3a-4f4d-9bc5-a414bbfe5607

📥 Commits

Reviewing files that changed from the base of the PR and between 6a61a19 and c35c157.

📒 Files selected for processing (16)
  • .coderabbit.yaml
  • .gitignore
  • AGENTS.md
  • README.md
  • packages/core/src/core/anthropicContentGenerator/anthropicContentGenerator.ts
  • packages/core/src/core/client.ts
  • packages/core/src/core/coreToolScheduler.ts
  • packages/core/src/core/geminiContentGenerator/geminiContentGenerator.ts
  • packages/core/src/core/geminiContentGenerator/index.ts
  • packages/core/src/core/openaiContentGenerator/errorHandler.ts
  • packages/core/src/core/openaiContentGenerator/pipeline.ts
  • packages/core/src/telemetry/index.ts
  • packages/core/src/telemetry/sdk.test.ts
  • packages/core/src/telemetry/turnSpanContext.test.ts
  • packages/core/src/telemetry/turnSpanContext.ts
  • packages/core/src/tools/agent.ts

Comment on lines +477 to +485
// Start a turn root span for new user prompts. Recursive continuations
// (Hook, ToolResult, Retry) reuse the already-active span.
const ownsTurnSpan =
messageType === SendMessageType.UserQuery ||
messageType === SendMessageType.Cron;
if (ownsTurnSpan) {
startTurnSpan(this.config.getSessionId(), prompt_id);
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's check the file size and then read the relevant sections
wc -l packages/core/src/core/client.ts

Repository: protoLabsAI/protoCLI

Length of output: 101


🏁 Script executed:

# Read around the span opening (lines 470-495)
sed -n '470,495p' packages/core/src/core/client.ts | cat -n

Repository: protoLabsAI/protoCLI

Length of output: 1227


🏁 Script executed:

# Read the method context - need to find the method boundaries
sed -n '450,500p' packages/core/src/core/client.ts | cat -n

Repository: protoLabsAI/protoCLI

Length of output: 2055


🏁 Script executed:

# Search for endTurnSpan calls to see the cleanup pattern
rg "endTurnSpan" packages/core/src/core/client.ts -B 2 -A 1

Repository: protoLabsAI/protoCLI

Length of output: 2284


🏁 Script executed:

# Read the full method to understand the structure and exception handling
sed -n '437,600p' packages/core/src/core/client.ts | cat -n

Repository: protoLabsAI/protoCLI

Length of output: 6722


🏁 Script executed:

# Check for try/catch blocks around the span opening
sed -n '477,700p' packages/core/src/core/client.ts | cat -n

Repository: protoLabsAI/protoCLI

Length of output: 9728


🏁 Script executed:

# Look at lines 964-972 mentioned in the scratchpad for recursive continuation logic
sed -n '960,980p' packages/core/src/core/client.ts | cat -n

Repository: protoLabsAI/protoCLI

Length of output: 990


🏁 Script executed:

# Check for exception handlers in the entire method
rg "catch|finally|throw" packages/core/src/core/client.ts -n | grep -A 2 -B 2 "sendMessageStream"

Repository: protoLabsAI/protoCLI

Length of output: 46


🏁 Script executed:

# Read more of the method to see the complete structure
sed -n '437,1000p' packages/core/src/core/client.ts | tail -400 | cat -n

Repository: protoLabsAI/protoCLI

Length of output: 16972


🏁 Script executed:

# Check if there's a try/catch/finally wrapping the method body or span cleanup
rg "try\s*\{|catch\s*\(|finally\s*\{" packages/core/src/core/client.ts -n

Repository: protoLabsAI/protoCLI

Length of output: 247


🏁 Script executed:

# Look for the method ending to understand the full scope
sed -n '800,900p' packages/core/src/core/client.ts | cat -n

Repository: protoLabsAI/protoCLI

Length of output: 4681


🏁 Script executed:

# Let me verify the exact line numbers for the critical sections
sed -n '910,920p' packages/core/src/core/client.ts | cat -n

Repository: protoLabsAI/protoCLI

Length of output: 601


🏁 Script executed:

# Verify the recursive next speaker call to see if options are being reused
sed -n '1000,1010p' packages/core/src/core/client.ts | cat -n

Repository: protoLabsAI/protoCLI

Length of output: 561


🏁 Script executed:

# Double-check that the entire method lacks a top-level try/catch/finally
awk '/async \*sendMessageStream/,/^  }$/' packages/core/src/core/client.ts | head -20

Repository: protoLabsAI/protoCLI

Length of output: 804


🏁 Script executed:

# Count the structure - verify that no try/catch wraps the whole body
sed -n '437,1002p' packages/core/src/core/client.ts | grep -E "^\s{4}(try|catch|finally)\s*(\{|$)" | head -10

Repository: protoLabsAI/protoCLI

Length of output: 73


Wrap the turn span lifecycle in try/finally to guarantee cleanup regardless of how the generator exits.

The span opened at line 484 is only closed on explicit return branches. Any unhandled rejection from hooks, compression, arena control, checkNextSpeaker(), or callers that stop consuming the generator early will leave the active turn span open and leak telemetry context. Move startTurnSpan() before a try block and place endTurnSpan() in a finally block to ensure cleanup on all exit paths.

Additionally, the recursive continuation at line 364 passes the original options unchanged. If the initial call was UserQuery or Cron, the recursive call will incorrectly attempt to start a second root span for the same prompt_id, violating the design intent that "Recursive continuations reuse the already-active span." Change the recursive call to explicitly pass { type: SendMessageType.Hook } or another non-root-span type to match the pattern used elsewhere (e.g., lines 196, 241, 315).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/core/client.ts` around lines 477 - 485, Start the turn span
with startTurnSpan(this.config.getSessionId(), prompt_id) immediately before
entering the generator logic and ensure endTurnSpan() is called inside a finally
block so the turn span is always closed regardless of errors or early exits;
update the code around startTurnSpan/endTurnSpan in the generator to wrap the
main body in try { ... } finally { endTurnSpan() }. Also modify the recursive
continuation call that currently reuses the original options (the call near the
recursion around the symbol options/prompt continuation) to pass an explicit
non-root type such as { type: SendMessageType.Hook } (or another non-root
SendMessageType) instead of the original options so recursive continuations
reuse the active span instead of attempting to start a second root span for the
same prompt_id.

Comment on lines +964 to +972
const nextResult = yield* this.sendMessageStream(
nextRequest,
signal,
prompt_id,
options,
boundedTurns - 1,
);
if (ownsTurnSpan) endTurnSpan('ok');
return nextResult;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Reuse the active turn span on model continuations.

Line 964 forwards the original options, so a continuation from a UserQuery/Cron re-enters this method with ownsTurnSpan === true and starts another root turn span for the same prompt_id. That splits one user turn across multiple roots. Guard startTurnSpan() with the active-turn helper, or switch this recursive path to an internal non-owning continuation type.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/core/client.ts` around lines 964 - 972, The recursion in
sendMessageStream forwards the original options and causes a continuation to
start a new root turn span (startTurnSpan) even when an active turn for the same
prompt_id already exists (ownsTurnSpan === true); update the continuation path
so it reuses the existing active turn span instead of starting a new one by
either (A) guarding startTurnSpan with the active-turn helper (check for an
active turn/span for prompt_id before calling startTurnSpan and only set
ownsTurnSpan when you actually created the span), or (B) mark the recursive
continuation as a non-owning internal continuation (pass a flag in options to
indicate non-owning continuation so startTurnSpan is skipped), and ensure
ownsTurnSpan/endTurnSpan logic only runs when the span was created by this
invocation.

Comment on lines +1477 to +1481
toolSpan.setStatus({ code: SpanStatusCode.ERROR });
toolSpan.setAttribute(
'error.message',
'User cancelled tool execution.',
);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Record tool.duration_ms on non-success paths too.

Only the success branch sets tool.duration_ms today. Cancelled, hook-blocked, and error spans end without any latency attribute, which makes the slowest failures invisible in traces.

Example fix
-        // OTel: record success attributes
-        const durationMs = Date.now() - execStartTime;
-        toolSpan.setAttribute('tool.duration_ms', durationMs);
+        // OTel: record success attributes
         if (contentLength !== undefined) {
           toolSpan.setAttribute('tool.content_length', contentLength);
         }
         toolSpan.setStatus({ code: SpanStatusCode.OK });
@@
     } finally {
+      toolSpan.setAttribute('tool.duration_ms', Date.now() - execStartTime);
       toolSpan.end();
     }

Also applies to: 1524-1525, 1541-1547, 1577-1579, 1613-1617, 1654-1655

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/core/coreToolScheduler.ts` around lines 1477 - 1481, Record
the tool duration on all non-success exit paths by setting the same
'tool.duration_ms' attribute on the toolSpan before calling setStatus or
setAttribute for errors/cancellations; locate the span handling code around
toolSpan (where you currently call toolSpan.setStatus and
toolSpan.setAttribute('error.message', ...)) and compute the elapsed ms the same
way as the success branch, then call toolSpan.setAttribute('tool.duration_ms',
durationMs) for the cancelled, hook-blocked, and error branches (also apply
identical additions at the other occurrences around the file noted in the
review: the blocks near lines handling similar toolSpan exits at the other five
locations).

Comment on lines +179 to +188
// Log prompt content as a span event when telemetry prompt logging is enabled
if (logPrompts) {
const promptJson = JSON.stringify(finalRequest.contents);
span.addEvent('gen_ai.content.prompt', {
'gen_ai.prompt':
promptJson.length > 10_000
? promptJson.slice(0, 10_000) + '...[truncated]'
: promptJson,
});
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Make prompt/completion telemetry explicit opt-in.

packages/core/src/config/config.ts, Lines 1772-1774 default getTelemetryLogPromptsEnabled() to true, so enabling Langfuse starts exporting raw prompts and completions by default. In this repo those payloads can contain source files, secrets, and user data, so this is a risky telemetry default. Please flip the default or require an explicit enable before emitting these span events. The Anthropic generator added the same path.

Also applies to: 196-208, 248-257, 303-312

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/core/geminiContentGenerator/geminiContentGenerator.ts`
around lines 179 - 188, The code currently emits raw prompt/completion content
when logPrompts is truthy (see logPrompts, span.addEvent and
finalRequest.contents); update the implementation to require an explicit opt-in
flag (e.g., a clearly named allowTelemetryRawPrompts or
explicitTelemetryConsent) before serializing and adding span events, and ensure
the default for getTelemetryLogPromptsEnabled() is false so enabling Langfuse
does not automatically export sensitive payloads; apply the same change pattern
to the other locations referenced (lines ~196-208, ~248-257, ~303-312) so all
span.addEvent calls that include raw prompt/completion content are gated behind
the explicit opt-in flag.

Comment on lines +266 to +300
// Set token usage attributes on the OTel span now that the stream
// has been fully consumed and usage data is available.
if (context.span) {
context.span.setAttributes({
'llm.duration_ms': context.duration,
});
const usage = lastResponse?.usageMetadata;
if (usage) {
const inputTokens = usage.promptTokenCount ?? 0;
const outputTokens = usage.candidatesTokenCount ?? 0;
context.span.setAttributes({
'gen_ai.usage.input_tokens': inputTokens,
'gen_ai.usage.output_tokens': outputTokens,
'gen_ai.usage.total_tokens':
usage.totalTokenCount ?? inputTokens + outputTokens,
});
}
if (lastResponse?.modelVersion) {
context.span.setAttribute(
'gen_ai.response.model',
lastResponse.modelVersion,
);
}
// Log streamed completion content as a span event
if (context.logPrompts && completionParts.length > 0) {
const responseText = completionParts.join('');
context.span.addEvent('gen_ai.content.completion', {
'gen_ai.completion':
responseText.length > 10_000
? responseText.slice(0, 10_000) + '...[truncated]'
: responseText,
});
}
context.span.end();
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

wc -l packages/core/src/core/openaiContentGenerator/pipeline.ts

Repository: protoLabsAI/protoCLI

Length of output: 125


🏁 Script executed:

# First, look at the specific lines mentioned (266-300)
sed -n '266,300p' packages/core/src/core/openaiContentGenerator/pipeline.ts | cat -n

Repository: protoLabsAI/protoCLI

Length of output: 1697


🏁 Script executed:

# Check lines 305-309 mentioned as "Also applies to"
sed -n '305,309p' packages/core/src/core/openaiContentGenerator/pipeline.ts | cat -n

Repository: protoLabsAI/protoCLI

Length of output: 293


🏁 Script executed:

# Check lines 633-644 mentioned as "Also applies to"
sed -n '633,644p' packages/core/src/core/openaiContentGenerator/pipeline.ts | cat -n

Repository: protoLabsAI/protoCLI

Length of output: 629


🏁 Script executed:

# Find the executeWithErrorHandling function
rg -n "executeWithErrorHandling" packages/core/src/core/openaiContentGenerator/pipeline.ts -A 20 | head -50

Repository: protoLabsAI/protoCLI

Length of output: 1520


🏁 Script executed:

# Find the processStreamWithLogging function
rg -n "processStreamWithLogging" packages/core/src/core/openaiContentGenerator/pipeline.ts -A 30 | head -60

Repository: protoLabsAI/protoCLI

Length of output: 2790


🏁 Script executed:

# Get the complete processStreamWithLogging function
sed -n '125,300p' packages/core/src/core/openaiContentGenerator/pipeline.ts | cat -n

Repository: protoLabsAI/protoCLI

Length of output: 8606


🏁 Script executed:

# Also check what comes after line 300 to see error handling and finally blocks
sed -n '290,320p' packages/core/src/core/openaiContentGenerator/pipeline.ts | cat -n

Repository: protoLabsAI/protoCLI

Length of output: 1376


🏁 Script executed:

# Get the full executeWithErrorHandling to see the complete flow
sed -n '530,680p' packages/core/src/core/openaiContentGenerator/pipeline.ts | cat -n

Repository: protoLabsAI/protoCLI

Length of output: 5970


Move span cleanup to a finally block to ensure closure on early generator termination.

The span is marked with status OK at line 633 before the async generator is consumed, then passed to processStreamWithLogging via context.span. While normal completion (line 266) and error paths (line 305) both call span.end(), early termination—when the caller stops iterating without exhausting the stream—bypasses both paths, leaving the span open with a stale success status. Add a finally block in processStreamWithLogging to guarantee the span closes regardless of how iteration ends.

Also applies to: 305–309, 633–644

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/core/openaiContentGenerator/pipeline.ts` around lines 266 -
300, The OTel span attached to context.span in processStreamWithLogging (used
with lastResponse and completionParts) can remain open if the consumer stops
iterating early; move all span cleanup (setting final attributes, adding events
from completionParts, setting status) into a finally block so context.span.end()
is always executed regardless of normal completion, error, or early generator
termination, and ensure any existing span.setAttributes / addEvent calls run
before ending the span inside that finally.

Comment on lines +21 to +23
// Module-level state: the active turn span and its OTel context
let activeTurnSpan: Span | undefined;
let activeTurnContext: Context | undefined;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Don't store the active turn span in a module-global singleton.

packages/core/src/core/client.ts, Lines 468-485 call startTurnSpan() for every user/cron turn. With activeTurnSpan and activeTurnContext as process-global state, two overlapping turns race: the later call steals the active context, and endTurnSpan() from either turn can close the other span. That cross-wires LLM, tool, and agent traces between requests. This needs per-execution storage (OTel context / AsyncLocalStorage) or explicit context threading instead of global mutable state.

Also applies to: 30-45

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/telemetry/turnSpanContext.ts` around lines 21 - 23, The
module-global mutable state activeTurnSpan and activeTurnContext in
turnSpanContext.ts causes cross-request races when startTurnSpan() /
endTurnSpan() are called concurrently; replace this process-global singleton
with per-execution storage (e.g. Node's AsyncLocalStorage or explicit context
objects threaded through calls) so each turn gets its own Span and OTel Context;
update startTurnSpan(), endTurnSpan(), and any getters to read/write from the
per-execution store (or accept/return a context object) instead of using
activeTurnSpan/activeTurnContext to prevent one turn from stealing or ending
another turn's span.

Comment on lines +675 to 680
// Fire and forget — execution runs independently.
// The span covers the full async lifetime: it ends when the
// background agent completes (or fails), not when execute() returns.
subagent
.execute(contextState, signal)
.then(() => {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Background agents still inherit the foreground abort signal.

This path is described as fire-and-forget work, but it still passes the request-scoped signal into subagent.execute(). If the originating turn is aborted after launch, the background agent is cancelled too, which breaks the independence contract for run_in_background. Give background jobs their own controller/shutdown signal instead.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/tools/agent.ts` around lines 675 - 680, Background agents
are currently passed the request-scoped signal (the call to
subagent.execute(contextState, signal)), which causes background work to be
cancelled when the originating request aborts; instead, create a new,
independent AbortController for background jobs (e.g., const bgController = new
AbortController()) and pass bgController.signal into
subagent.execute(contextState, bgController.signal), ensuring you do not forward
the request-scoped signal. Also wire bgController.abort() to whatever
agent-level shutdown lifecycle you already have (so background tasks can be
cancelled on explicit shutdown) and ensure you clean up the controller when the
subagent finishes; update any places constructing the background job (the
run_in_background / subagent.execute call) to use this independent controller.

Comment thread README.md Outdated
Comment on lines +164 to +174
proto supports [Langfuse](https://langfuse.com) tracing out of the box. Set three environment variables and every session is fully traced — LLM calls (all providers), tool executions, subagent lifecycles, and turn hierarchy.

### Setup

Add to `~/.proto/.env` or export in your shell:

```
LANGFUSE_PUBLIC_KEY=pk-lf-...
LANGFUSE_SECRET_KEY=sk-lf-...
LANGFUSE_BASE_URL=https://cloud.langfuse.com # optional, this is the default
```

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Clarify the required Langfuse vars and add a language to the fence.

This says "Set three environment variables" even though LANGFUSE_BASE_URL is optional in the example below. Line 170 also trips MD040 because the fence has no language.

Suggested doc fix
-proto supports [Langfuse](https://langfuse.com) tracing out of the box. Set three environment variables and every session is fully traced — LLM calls (all providers), tool executions, subagent lifecycles, and turn hierarchy.
+proto supports [Langfuse](https://langfuse.com) tracing out of the box. Set the two required environment variables below (and optionally `LANGFUSE_BASE_URL`) and every session is fully traced — LLM calls (all providers), tool executions, subagent lifecycles, and turn hierarchy.

-```
+```bash
 LANGFUSE_PUBLIC_KEY=pk-lf-...
 LANGFUSE_SECRET_KEY=sk-lf-...
 LANGFUSE_BASE_URL=https://cloud.langfuse.com  # optional, this is the default
🧰 Tools
🪛 markdownlint-cli2 (0.22.0)

[warning] 170-170: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@README.md` around lines 164 - 174, Clarify that only LANGFUSE_PUBLIC_KEY and
LANGFUSE_SECRET_KEY are required (LANGFUSE_BASE_URL is optional) and update the
code fence to specify the language; change the prose from "Set three environment
variables" to "Set the following environment variables (BASE_URL is optional)"
and add the bash language tag to the fenced block so it becomes ```bash with the
three env vars LANGFUSE_PUBLIC_KEY, LANGFUSE_SECRET_KEY, and LANGFUSE_BASE_URL
shown (noting BASE_URL is optional).

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
README.md (1)

164-174: ⚠️ Potential issue | 🟡 Minor

The past review comment remains unaddressed: clarify required vs. optional environment variables and add language to the code fence.

Line 164 still states "Set three environment variables" even though only LANGFUSE_PUBLIC_KEY and LANGFUSE_SECRET_KEY are required; LANGFUSE_BASE_URL is optional. Additionally, the code fence at line 170 still lacks a language specification, triggering the MD040 linting warning.

📝 Suggested fix
-proto supports [Langfuse](https://langfuse.com) tracing out of the box. Set three environment variables and every session is fully traced — LLM calls (all providers), tool executions, subagent lifecycles, and turn hierarchy.
+proto supports [Langfuse](https://langfuse.com) tracing out of the box. Set the required environment variables below (and optionally `LANGFUSE_BASE_URL`) and every session is fully traced — LLM calls (all providers), tool executions, subagent lifecycles, and turn hierarchy.

 ### Setup
 
 Add to `~/.proto/.env`:
 
-```
+```bash
 LANGFUSE_PUBLIC_KEY=pk-lf-...
 LANGFUSE_SECRET_KEY=sk-lf-...
 LANGFUSE_BASE_URL=https://cloud.langfuse.com  # optional, this is the default

As per coding guidelines, Markdown files should be formatted with Prettier, and standard Markdown linting (MD040) requires language tags on fenced code blocks.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@README.md` around lines 164 - 174, Update the README section so it correctly
states that only LANGFUSE_PUBLIC_KEY and LANGFUSE_SECRET_KEY are required while
LANGFUSE_BASE_URL is optional, change the phrase "Set three environment
variables" to reflect "Set the following environment variables (BASE_URL
optional)", and add a language tag to the fenced code block (e.g., ```bash) so
MD040 is satisfied; finally reformat the Markdown with Prettier to apply
linting/style fixes.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@README.md`:
- Around line 164-174: Update the README section so it correctly states that
only LANGFUSE_PUBLIC_KEY and LANGFUSE_SECRET_KEY are required while
LANGFUSE_BASE_URL is optional, change the phrase "Set three environment
variables" to reflect "Set the following environment variables (BASE_URL
optional)", and add a language tag to the fenced code block (e.g., ```bash) so
MD040 is satisfied; finally reformat the Markdown with Prettier to apply
linting/style fixes.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 8f1f6d1d-de09-4ec8-b959-33390745e573

📥 Commits

Reviewing files that changed from the base of the PR and between c35c157 and bf17b99.

⛔ Files ignored due to path filters (1)
  • package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (1)
  • README.md

GitHub CI and others added 2 commits April 1, 2026 23:38
When a request has no tools defined (e.g. /btw using full conversation
history) but the history contains tool call/result messages, Anthropic
and LiteLLM routing to Anthropic reject the request with a 400.

Strip tool-related messages from the history in that case rather than
letting the upstream provider fail.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
packages/core/src/core/openaiContentGenerator/pipeline.ts (1)

266-300: ⚠️ Potential issue | 🟠 Major

Ensure deferred streaming spans always close, including early iterator cancellation.

context.span is still finalized only in success/error paths of the async generator, while status OK is set before the stream is fully consumed. If the caller stops iteration early, cleanup can be skipped and the span can remain open with stale state. Move span finalization into a finally in processStreamWithLogging and only mark OK after successful completion.

#!/bin/bash
set -euo pipefail

FILE="packages/core/src/core/openaiContentGenerator/pipeline.ts"

echo "== Check stream lifecycle code =="
rg -n "processStreamWithLogging|SpanStatusCode\\.OK|context\\.span\\.end\\(|finally" "$FILE" -C 3

echo
echo "== Inspect processStreamWithLogging body =="
sed -n '125,320p' "$FILE" | cat -n

echo
echo "== Inspect executeWithErrorHandling span status/end flow =="
sed -n '553,675p' "$FILE" | cat -n

Also applies to: 305-309, 656-665

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/core/openaiContentGenerator/pipeline.ts` around lines 266 -
300, The OTel span for streaming (context.span) can be left open if iteration is
cancelled early; update processStreamWithLogging so span finalization (setting
status and calling context.span.end()) is moved into a finally block that always
runs, and only set SpanStatusCode.OK after the stream was fully and successfully
consumed (i.e., set OK in the normal completion path, not before consumption).
Concretely: wrap the streaming/iteration logic inside try { ... } catch { set
error status/attributes } finally { if (context.span && !context.spanEnded) {
set any remaining attributes/usage from lastResponse, add completion event if
needed, and call context.span.end() } } — reference the functions/processors
processStreamWithLogging, executeWithErrorHandling, and the context.span usages
(setAttributes, setAttribute, addEvent, end) to locate where to move the logic
and guard against double-ending the span.
🧹 Nitpick comments (1)
packages/core/src/core/openaiContentGenerator/pipeline.ts (1)

131-132: Bound streamed completion buffering for telemetry to avoid unbounded memory growth.

completionParts collects the full response, but emitted telemetry is truncated to 10k chars. Cap accumulation during streaming to reduce memory pressure on long outputs.

♻️ Suggested change
-    const completionParts: string[] = [];
+    const completionParts: string[] = [];
+    let completionChars = 0;
+    const COMPLETION_LOG_LIMIT = 10_000;

@@
-        if (context.logPrompts && delta?.content) {
-          completionParts.push(delta.content);
+        if (context.logPrompts && delta?.content && completionChars < COMPLETION_LOG_LIMIT) {
+          const remaining = COMPLETION_LOG_LIMIT - completionChars;
+          const chunk = delta.content.slice(0, remaining);
+          completionParts.push(chunk);
+          completionChars += chunk.length;
         }

@@
-          const responseText = completionParts.join('');
+          const responseText = completionParts.join('');
           context.span.addEvent('gen_ai.content.completion', {
             'gen_ai.completion':
-              responseText.length > 10_000
-                ? responseText.slice(0, 10_000) + '...[truncated]'
+              responseText.length >= COMPLETION_LOG_LIMIT
+                ? responseText + '...[truncated]'
                 : responseText,
           });

Also applies to: 166-169, 290-297

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/core/openaiContentGenerator/pipeline.ts` around lines 131 -
132, The streaming telemetry buffer completionParts is unbounded; introduce a
MAX_TELEMETRY_CHARS (e.g. 10_000) constant and ensure all places that append to
completionParts (the collection initialized as const completionParts: string[]
and the other append sites referenced around the same streaming handlers at the
later blocks) only accumulate up to that limit — either stop appending once the
total joined length reaches MAX_TELEMETRY_CHARS or only append a truncated slice
of the incoming chunk so the buffer never grows beyond MAX_TELEMETRY_CHARS;
apply the same cap to the other streaming accumulation sites (the blocks noted
at the subsequent locations) so emitted telemetry is still truncated to 10k but
in-memory buffering is bounded.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@packages/core/src/core/openaiContentGenerator/pipeline.ts`:
- Around line 266-300: The OTel span for streaming (context.span) can be left
open if iteration is cancelled early; update processStreamWithLogging so span
finalization (setting status and calling context.span.end()) is moved into a
finally block that always runs, and only set SpanStatusCode.OK after the stream
was fully and successfully consumed (i.e., set OK in the normal completion path,
not before consumption). Concretely: wrap the streaming/iteration logic inside
try { ... } catch { set error status/attributes } finally { if (context.span &&
!context.spanEnded) { set any remaining attributes/usage from lastResponse, add
completion event if needed, and call context.span.end() } } — reference the
functions/processors processStreamWithLogging, executeWithErrorHandling, and the
context.span usages (setAttributes, setAttribute, addEvent, end) to locate where
to move the logic and guard against double-ending the span.

---

Nitpick comments:
In `@packages/core/src/core/openaiContentGenerator/pipeline.ts`:
- Around line 131-132: The streaming telemetry buffer completionParts is
unbounded; introduce a MAX_TELEMETRY_CHARS (e.g. 10_000) constant and ensure all
places that append to completionParts (the collection initialized as const
completionParts: string[] and the other append sites referenced around the same
streaming handlers at the later blocks) only accumulate up to that limit —
either stop appending once the total joined length reaches MAX_TELEMETRY_CHARS
or only append a truncated slice of the incoming chunk so the buffer never grows
beyond MAX_TELEMETRY_CHARS; apply the same cap to the other streaming
accumulation sites (the blocks noted at the subsequent locations) so emitted
telemetry is still truncated to 10k but in-memory buffering is bounded.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: d4765826-b49b-4d33-9c37-cbd4b45e1768

📥 Commits

Reviewing files that changed from the base of the PR and between bf17b99 and 69c29e7.

📒 Files selected for processing (1)
  • packages/core/src/core/openaiContentGenerator/pipeline.ts

@mabry1985 mabry1985 merged commit fb0793e into main Apr 2, 2026
1 of 3 checks passed
@mabry1985 mabry1985 deleted the feat/langfuse-tracing branch April 2, 2026 06:53
mabry1985 added a commit that referenced this pull request May 1, 2026
… example shape to description (#177)

Reported failure mode: smaller / older models (Qwen3, Minimax variants,
some open-weights routes) JSON-encode the entire `questions` array as a
string instead of emitting it as a native array literal. validateToolParams
threw with "Parameter \"questions\" must be an array" — useless feedback
since the model HAD sent the array, just stringified.

Three changes, layered:

1. Silent coercion in validateToolParams. If `questions` is a string that
   parses as JSON, parse it and continue. Logs a debugLogger.warn so the
   signal stays visible — silent coercion would mask a real upstream
   regression if model behavior shifts. Catches ~all of the user-reported
   failures without a retry round-trip.

2. Example shape added to the tool description. Models replicate concrete
   examples better than they synthesize from abstract schemas with 3
   levels of nesting. Placeholder text is clearly labeled as shape-only
   so models don't cargo-cult the example values into their actual
   questions.

3. Sharper error message for the residual case (non-JSON garbage in the
   string slot): "Pass `questions` as a real array literal, not a
   JSON-encoded string." Clear, specific, tells the model exactly which
   strategy to drop.

Considered but rejected:
- Schema relaxation (allow `options: string[]`, default multiSelect):
  API change, breaks downstream `option.description` consumers (dialog
  UI, ACP renderer), premature without data showing #1+#2 are
  insufficient.

Tests updated with two new cases: stringified-array coercion, non-JSON
string error path. 5319 core / 0 fail; lint + typecheck clean.

Co-authored-by: Automaker <automaker@localhost>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant