chore: add CodeRabbit config by mabry1985 · Pull Request #2 · protoLabsAI/protoCLI

mabry1985 · 2026-04-02T06:00:28Z

Summary

.coderabbit.yaml: chill review profile, auto-review enabled on PRs targeting main
Branch protection on main already applied via API (requires Lint + CodeQL before merge)

Install the CodeRabbit GitHub app at https://github.com/apps/coderabbit-ai if not already done.

🤖 Generated with Claude Code

Summary by CodeRabbit

Documentation
- Updated environment configuration guidance for local development setups.
Improvements
- Enhanced telemetry logging with comprehensive token usage tracking and operation timing metrics for AI interactions.
- Improved API provider compatibility by filtering unnecessary tool-related messages.
Chores
- Added code review automation configuration for workflow efficiency.
- Updated build artifact exclusions and cleaned up linting directives.

Add full OTel span instrumentation so setting LANGFUSE_PUBLIC_KEY + LANGFUSE_SECRET_KEY gives end-to-end trace visibility in Langfuse. - LLM spans: all 3 providers (OpenAI-compat, Anthropic, Gemini) emit gen_ai spans with token usage attrs (input/output/total) for cost tracking - Turn hierarchy: per-turn root span in client.ts with context propagation so LLM/tool/agent spans nest correctly in Langfuse trace view - Tool spans: coreToolScheduler wraps every tool execution in a child span with name, type, decision, duration, and error attributes - Agent spans: agent.ts wraps subagent execution (foreground + background) with full lifecycle coverage - Content logging: prompt/response span events gated by telemetryLogPrompts - Tests: 26 new tests covering Langfuse activation and turn span lifecycle - Docs: README Observability section + AGENTS.md config note Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- .coderabbit.yaml: chill profile, auto-review on PRs targeting main - main branch now requires Lint + CodeQL to pass before merge Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

coderabbitai · 2026-04-02T06:01:27Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

Walkthrough

Configuration and documentation updates are added: CodeRabbit YAML for review automation, .gitignore rule for benchmark outputs, and Langfuse environment file guidance in README. The OpenAI pipeline is enhanced with streaming telemetry collection, improved span lifecycle management, and request validation. ESLint suppressions are removed.

Changes

Cohort / File(s)	Summary
Configuration & Build Files `.coderabbit.yaml`, `.gitignore`	New CodeRabbit configuration with review automation and profile settings; added `jobs/` directory to gitignore for benchmark outputs.
Documentation `README.md`	Added guidance on using `~/.proto/.env` for Langfuse configuration instead of project-level `.env` files.
Telemetry & Streaming Pipeline `packages/core/src/core/openaiContentGenerator/pipeline.ts`	Enhanced streaming pipeline with completion text accumulation, comprehensive OpenTelemetry span instrumentation (duration, token usage, model, completion events), improved span lifecycle management (deferred ending for streaming, immediate for non-streaming), request validation to filter tool messages when tools unavailable, and updated error handling with explicit span status and termination.
Code Cleanup `packages/core/src/utils/schemaValidator.ts`, `packages/webui/src/index.ts`	Removed ESLint `eslint-disable-next-line import/no-internal-modules` suppression comments.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Description check	⚠️ Warning	The description is incomplete relative to the template; it lacks TLDR details about all changes, Screenshots/Demo section, Dive Deeper discussion, Reviewer Test Plan, and Testing Matrix. Only Summary and app installation note are provided.	Expand the description to include a comprehensive TLDR covering all file changes (README, pipeline.ts, schemaValidator.ts, index.ts, .gitignore), provide 'N/A — no user-facing change' for Screenshots, add Dive Deeper context, Reviewer Test Plan, and complete the Testing Matrix.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title 'chore: add CodeRabbit config' clearly and concisely describes the main change—adding a CodeRabbit configuration file. It accurately reflects the primary modification in the changeset.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feat/langfuse-tracing

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 8

🧹 Nitpick comments (2)

.gitignore (1)
2-2: Anchor the ignore rule to repo root to avoid accidental overmatching.

At Line 2, jobs/ ignores any directory named jobs anywhere in the tree. If this is meant only for root benchmark output, prefer a root-anchored rule.
Proposed tweak
- jobs/
+ /jobs/
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.gitignore at line 2, The .gitignore currently contains the pattern "jobs/"
which matches any directory named jobs anywhere in the repo; change this to a
root-anchored rule by replacing the pattern with "/jobs/" so it only ignores the
jobs directory at the repository root (update the existing "jobs/" entry to
"/jobs/").
packages/core/src/core/openaiContentGenerator/pipeline.ts (1)
131-132: Cap the streamed completion buffer at the truncation limit.

Line 291 only emits 10k characters, but completionParts keeps the full streamed response in memory until the end. For long outputs that creates an unnecessary second copy of the completion just for telemetry logging. Stop appending once the preview buffer reaches the event limit.

Also applies to: 166-169, 289-297
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/core/openaiContentGenerator/pipeline.ts` around lines 131 -
132, The streamed completion buffer completionParts currently accumulates the
entire response causing a second full copy in memory; limit its growth by
capping appended text to the telemetry preview/truncation limit (e.g., 10_000
chars): before pushing/concatenating each incoming chunk in the pipeline handler
(and the similar places referenced where completionParts is appended), check the
current accumulated length and only append up to (limit - currentLength)
characters from the chunk (or skip appending if already at/above the limit);
apply the same guard to all occurrences that build completionParts so the
preview buffer never exceeds the event truncation limit.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/core/src/core/client.ts`:
- Around line 964-972: The recursion in sendMessageStream forwards the original
options and causes a continuation to start a new root turn span (startTurnSpan)
even when an active turn for the same prompt_id already exists (ownsTurnSpan ===
true); update the continuation path so it reuses the existing active turn span
instead of starting a new one by either (A) guarding startTurnSpan with the
active-turn helper (check for an active turn/span for prompt_id before calling
startTurnSpan and only set ownsTurnSpan when you actually created the span), or
(B) mark the recursive continuation as a non-owning internal continuation (pass
a flag in options to indicate non-owning continuation so startTurnSpan is
skipped), and ensure ownsTurnSpan/endTurnSpan logic only runs when the span was
created by this invocation.
- Around line 477-485: Start the turn span with
startTurnSpan(this.config.getSessionId(), prompt_id) immediately before entering
the generator logic and ensure endTurnSpan() is called inside a finally block so
the turn span is always closed regardless of errors or early exits; update the
code around startTurnSpan/endTurnSpan in the generator to wrap the main body in
try { ... } finally { endTurnSpan() }. Also modify the recursive continuation
call that currently reuses the original options (the call near the recursion
around the symbol options/prompt continuation) to pass an explicit non-root type
such as { type: SendMessageType.Hook } (or another non-root SendMessageType)
instead of the original options so recursive continuations reuse the active span
instead of attempting to start a second root span for the same prompt_id.

In `@packages/core/src/core/coreToolScheduler.ts`:
- Around line 1477-1481: Record the tool duration on all non-success exit paths
by setting the same 'tool.duration_ms' attribute on the toolSpan before calling
setStatus or setAttribute for errors/cancellations; locate the span handling
code around toolSpan (where you currently call toolSpan.setStatus and
toolSpan.setAttribute('error.message', ...)) and compute the elapsed ms the same
way as the success branch, then call toolSpan.setAttribute('tool.duration_ms',
durationMs) for the cancelled, hook-blocked, and error branches (also apply
identical additions at the other occurrences around the file noted in the
review: the blocks near lines handling similar toolSpan exits at the other five
locations).

In `@packages/core/src/core/geminiContentGenerator/geminiContentGenerator.ts`:
- Around line 179-188: The code currently emits raw prompt/completion content
when logPrompts is truthy (see logPrompts, span.addEvent and
finalRequest.contents); update the implementation to require an explicit opt-in
flag (e.g., a clearly named allowTelemetryRawPrompts or
explicitTelemetryConsent) before serializing and adding span events, and ensure
the default for getTelemetryLogPromptsEnabled() is false so enabling Langfuse
does not automatically export sensitive payloads; apply the same change pattern
to the other locations referenced (lines ~196-208, ~248-257, ~303-312) so all
span.addEvent calls that include raw prompt/completion content are gated behind
the explicit opt-in flag.

In `@packages/core/src/core/openaiContentGenerator/pipeline.ts`:
- Around line 266-300: The OTel span attached to context.span in
processStreamWithLogging (used with lastResponse and completionParts) can remain
open if the consumer stops iterating early; move all span cleanup (setting final
attributes, adding events from completionParts, setting status) into a finally
block so context.span.end() is always executed regardless of normal completion,
error, or early generator termination, and ensure any existing
span.setAttributes / addEvent calls run before ending the span inside that
finally.

In `@packages/core/src/telemetry/turnSpanContext.ts`:
- Around line 21-23: The module-global mutable state activeTurnSpan and
activeTurnContext in turnSpanContext.ts causes cross-request races when
startTurnSpan() / endTurnSpan() are called concurrently; replace this
process-global singleton with per-execution storage (e.g. Node's
AsyncLocalStorage or explicit context objects threaded through calls) so each
turn gets its own Span and OTel Context; update startTurnSpan(), endTurnSpan(),
and any getters to read/write from the per-execution store (or accept/return a
context object) instead of using activeTurnSpan/activeTurnContext to prevent one
turn from stealing or ending another turn's span.

In `@packages/core/src/tools/agent.ts`:
- Around line 675-680: Background agents are currently passed the request-scoped
signal (the call to subagent.execute(contextState, signal)), which causes
background work to be cancelled when the originating request aborts; instead,
create a new, independent AbortController for background jobs (e.g., const
bgController = new AbortController()) and pass bgController.signal into
subagent.execute(contextState, bgController.signal), ensuring you do not forward
the request-scoped signal. Also wire bgController.abort() to whatever
agent-level shutdown lifecycle you already have (so background tasks can be
cancelled on explicit shutdown) and ensure you clean up the controller when the
subagent finishes; update any places constructing the background job (the
run_in_background / subagent.execute call) to use this independent controller.

In `@README.md`:
- Around line 164-174: Clarify that only LANGFUSE_PUBLIC_KEY and
LANGFUSE_SECRET_KEY are required (LANGFUSE_BASE_URL is optional) and update the
code fence to specify the language; change the prose from "Set three environment
variables" to "Set the following environment variables (BASE_URL is optional)"
and add the bash language tag to the fenced block so it becomes ```bash with the
three env vars LANGFUSE_PUBLIC_KEY, LANGFUSE_SECRET_KEY, and LANGFUSE_BASE_URL
shown (noting BASE_URL is optional).

---

Nitpick comments:
In @.gitignore:
- Line 2: The .gitignore currently contains the pattern "jobs/" which matches
any directory named jobs anywhere in the repo; change this to a root-anchored
rule by replacing the pattern with "/jobs/" so it only ignores the jobs
directory at the repository root (update the existing "jobs/" entry to
"/jobs/").

In `@packages/core/src/core/openaiContentGenerator/pipeline.ts`:
- Around line 131-132: The streamed completion buffer completionParts currently
accumulates the entire response causing a second full copy in memory; limit its
growth by capping appended text to the telemetry preview/truncation limit (e.g.,
10_000 chars): before pushing/concatenating each incoming chunk in the pipeline
handler (and the similar places referenced where completionParts is appended),
check the current accumulated length and only append up to (limit -
currentLength) characters from the chunk (or skip appending if already at/above
the limit); apply the same guard to all occurrences that build completionParts
so the preview buffer never exceeds the event truncation limit.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 72e77ed5-7c3a-4f4d-9bc5-a414bbfe5607

📥 Commits

Reviewing files that changed from the base of the PR and between 6a61a19 and c35c157.

📒 Files selected for processing (16)

.coderabbit.yaml
.gitignore
AGENTS.md
README.md
packages/core/src/core/anthropicContentGenerator/anthropicContentGenerator.ts
packages/core/src/core/client.ts
packages/core/src/core/coreToolScheduler.ts
packages/core/src/core/geminiContentGenerator/geminiContentGenerator.ts
packages/core/src/core/geminiContentGenerator/index.ts
packages/core/src/core/openaiContentGenerator/errorHandler.ts
packages/core/src/core/openaiContentGenerator/pipeline.ts
packages/core/src/telemetry/index.ts
packages/core/src/telemetry/sdk.test.ts
packages/core/src/telemetry/turnSpanContext.test.ts
packages/core/src/telemetry/turnSpanContext.ts
packages/core/src/tools/agent.ts

coderabbitai · 2026-04-02T06:14:29Z

+    // Start a turn root span for new user prompts. Recursive continuations
+    // (Hook, ToolResult, Retry) reuse the already-active span.
+    const ownsTurnSpan =
+      messageType === SendMessageType.UserQuery ||
+      messageType === SendMessageType.Cron;
+    if (ownsTurnSpan) {
+      startTurnSpan(this.config.getSessionId(), prompt_id);
+    }
+


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's check the file size and then read the relevant sections wc -l packages/core/src/core/client.ts

Repository: protoLabsAI/protoCLI

Length of output: 101

🏁 Script executed:

# Read around the span opening (lines 470-495) sed -n '470,495p' packages/core/src/core/client.ts | cat -n

Repository: protoLabsAI/protoCLI

Length of output: 1227

🏁 Script executed:

# Read the method context - need to find the method boundaries sed -n '450,500p' packages/core/src/core/client.ts | cat -n

Repository: protoLabsAI/protoCLI

Length of output: 2055

🏁 Script executed:

# Search for endTurnSpan calls to see the cleanup pattern rg "endTurnSpan" packages/core/src/core/client.ts -B 2 -A 1

Repository: protoLabsAI/protoCLI

Length of output: 2284

🏁 Script executed:

# Read the full method to understand the structure and exception handling sed -n '437,600p' packages/core/src/core/client.ts | cat -n

Repository: protoLabsAI/protoCLI

Length of output: 6722

🏁 Script executed:

# Check for try/catch blocks around the span opening sed -n '477,700p' packages/core/src/core/client.ts | cat -n

Repository: protoLabsAI/protoCLI

Length of output: 9728

🏁 Script executed:

# Look at lines 964-972 mentioned in the scratchpad for recursive continuation logic sed -n '960,980p' packages/core/src/core/client.ts | cat -n

Repository: protoLabsAI/protoCLI

Length of output: 990

🏁 Script executed:

# Check for exception handlers in the entire method rg "catch|finally|throw" packages/core/src/core/client.ts -n | grep -A 2 -B 2 "sendMessageStream"

Repository: protoLabsAI/protoCLI

Length of output: 46

🏁 Script executed:

# Read more of the method to see the complete structure sed -n '437,1000p' packages/core/src/core/client.ts | tail -400 | cat -n

Repository: protoLabsAI/protoCLI

Length of output: 16972

🏁 Script executed:

# Check if there's a try/catch/finally wrapping the method body or span cleanup rg "try\s*\{|catch\s*\(|finally\s*\{" packages/core/src/core/client.ts -n

Repository: protoLabsAI/protoCLI

Length of output: 247

🏁 Script executed:

# Look for the method ending to understand the full scope sed -n '800,900p' packages/core/src/core/client.ts | cat -n

Repository: protoLabsAI/protoCLI

Length of output: 4681

🏁 Script executed:

# Let me verify the exact line numbers for the critical sections sed -n '910,920p' packages/core/src/core/client.ts | cat -n

Repository: protoLabsAI/protoCLI

Length of output: 601

🏁 Script executed:

# Verify the recursive next speaker call to see if options are being reused sed -n '1000,1010p' packages/core/src/core/client.ts | cat -n

Repository: protoLabsAI/protoCLI

Length of output: 561

🏁 Script executed:

# Double-check that the entire method lacks a top-level try/catch/finally awk '/async \*sendMessageStream/,/^ }$/' packages/core/src/core/client.ts | head -20

Repository: protoLabsAI/protoCLI

Length of output: 804

🏁 Script executed:

# Count the structure - verify that no try/catch wraps the whole body sed -n '437,1002p' packages/core/src/core/client.ts | grep -E "^\s{4}(try|catch|finally)\s*(\{|$)" | head -10

Repository: protoLabsAI/protoCLI

Length of output: 73

Wrap the turn span lifecycle in try/finally to guarantee cleanup regardless of how the generator exits.

The span opened at line 484 is only closed on explicit return branches. Any unhandled rejection from hooks, compression, arena control, checkNextSpeaker(), or callers that stop consuming the generator early will leave the active turn span open and leak telemetry context. Move startTurnSpan() before a try block and place endTurnSpan() in a finally block to ensure cleanup on all exit paths.

Additionally, the recursive continuation at line 364 passes the original options unchanged. If the initial call was UserQuery or Cron, the recursive call will incorrectly attempt to start a second root span for the same prompt_id, violating the design intent that "Recursive continuations reuse the already-active span." Change the recursive call to explicitly pass { type: SendMessageType.Hook } or another non-root-span type to match the pattern used elsewhere (e.g., lines 196, 241, 315).

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@packages/core/src/core/client.ts` around lines 477 - 485, Start the turn span with startTurnSpan(this.config.getSessionId(), prompt_id) immediately before entering the generator logic and ensure endTurnSpan() is called inside a finally block so the turn span is always closed regardless of errors or early exits; update the code around startTurnSpan/endTurnSpan in the generator to wrap the main body in try { ... } finally { endTurnSpan() }. Also modify the recursive continuation call that currently reuses the original options (the call near the recursion around the symbol options/prompt continuation) to pass an explicit non-root type such as { type: SendMessageType.Hook } (or another non-root SendMessageType) instead of the original options so recursive continuations reuse the active span instead of attempting to start a second root span for the same prompt_id.

coderabbitai · 2026-04-02T06:14:29Z

+        const nextResult = yield* this.sendMessageStream(
          nextRequest,
          signal,
          prompt_id,
          options,
          boundedTurns - 1,
        );
+        if (ownsTurnSpan) endTurnSpan('ok');
+        return nextResult;


⚠️ Potential issue | 🟠 Major

Reuse the active turn span on model continuations.

Line 964 forwards the original options, so a continuation from a UserQuery/Cron re-enters this method with ownsTurnSpan === true and starts another root turn span for the same prompt_id. That splits one user turn across multiple roots. Guard startTurnSpan() with the active-turn helper, or switch this recursive path to an internal non-owning continuation type.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@packages/core/src/core/client.ts` around lines 964 - 972, The recursion in sendMessageStream forwards the original options and causes a continuation to start a new root turn span (startTurnSpan) even when an active turn for the same prompt_id already exists (ownsTurnSpan === true); update the continuation path so it reuses the existing active turn span instead of starting a new one by either (A) guarding startTurnSpan with the active-turn helper (check for an active turn/span for prompt_id before calling startTurnSpan and only set ownsTurnSpan when you actually created the span), or (B) mark the recursive continuation as a non-owning internal continuation (pass a flag in options to indicate non-owning continuation so startTurnSpan is skipped), and ensure ownsTurnSpan/endTurnSpan logic only runs when the span was created by this invocation.

coderabbitai · 2026-04-02T06:14:29Z

+        toolSpan.setStatus({ code: SpanStatusCode.ERROR });
+        toolSpan.setAttribute(
+          'error.message',
+          'User cancelled tool execution.',
+        );


⚠️ Potential issue | 🟡 Minor

Record tool.duration_ms on non-success paths too.

Only the success branch sets tool.duration_ms today. Cancelled, hook-blocked, and error spans end without any latency attribute, which makes the slowest failures invisible in traces.

Example fix

- // OTel: record success attributes - const durationMs = Date.now() - execStartTime; - toolSpan.setAttribute('tool.duration_ms', durationMs); + // OTel: record success attributes if (contentLength !== undefined) { toolSpan.setAttribute('tool.content_length', contentLength); } toolSpan.setStatus({ code: SpanStatusCode.OK }); @@ } finally { + toolSpan.setAttribute('tool.duration_ms', Date.now() - execStartTime); toolSpan.end(); }

Also applies to: 1524-1525, 1541-1547, 1577-1579, 1613-1617, 1654-1655

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@packages/core/src/core/coreToolScheduler.ts` around lines 1477 - 1481, Record the tool duration on all non-success exit paths by setting the same 'tool.duration_ms' attribute on the toolSpan before calling setStatus or setAttribute for errors/cancellations; locate the span handling code around toolSpan (where you currently call toolSpan.setStatus and toolSpan.setAttribute('error.message', ...)) and compute the elapsed ms the same way as the success branch, then call toolSpan.setAttribute('tool.duration_ms', durationMs) for the cancelled, hook-blocked, and error branches (also apply identical additions at the other occurrences around the file noted in the review: the blocks near lines handling similar toolSpan exits at the other five locations).

coderabbitai · 2026-04-02T06:14:29Z

+      // Log prompt content as a span event when telemetry prompt logging is enabled
+      if (logPrompts) {
+        const promptJson = JSON.stringify(finalRequest.contents);
+        span.addEvent('gen_ai.content.prompt', {
+          'gen_ai.prompt':
+            promptJson.length > 10_000
+              ? promptJson.slice(0, 10_000) + '...[truncated]'
+              : promptJson,
+        });
+      }


⚠️ Potential issue | 🟠 Major

Make prompt/completion telemetry explicit opt-in.

packages/core/src/config/config.ts, Lines 1772-1774 default getTelemetryLogPromptsEnabled() to true, so enabling Langfuse starts exporting raw prompts and completions by default. In this repo those payloads can contain source files, secrets, and user data, so this is a risky telemetry default. Please flip the default or require an explicit enable before emitting these span events. The Anthropic generator added the same path.

Also applies to: 196-208, 248-257, 303-312

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@packages/core/src/core/geminiContentGenerator/geminiContentGenerator.ts` around lines 179 - 188, The code currently emits raw prompt/completion content when logPrompts is truthy (see logPrompts, span.addEvent and finalRequest.contents); update the implementation to require an explicit opt-in flag (e.g., a clearly named allowTelemetryRawPrompts or explicitTelemetryConsent) before serializing and adding span events, and ensure the default for getTelemetryLogPromptsEnabled() is false so enabling Langfuse does not automatically export sensitive payloads; apply the same change pattern to the other locations referenced (lines ~196-208, ~248-257, ~303-312) so all span.addEvent calls that include raw prompt/completion content are gated behind the explicit opt-in flag.

coderabbitai · 2026-04-02T06:14:29Z

+      // Set token usage attributes on the OTel span now that the stream
+      // has been fully consumed and usage data is available.
+      if (context.span) {
+        context.span.setAttributes({
+          'llm.duration_ms': context.duration,
+        });
+        const usage = lastResponse?.usageMetadata;
+        if (usage) {
+          const inputTokens = usage.promptTokenCount ?? 0;
+          const outputTokens = usage.candidatesTokenCount ?? 0;
+          context.span.setAttributes({
+            'gen_ai.usage.input_tokens': inputTokens,
+            'gen_ai.usage.output_tokens': outputTokens,
+            'gen_ai.usage.total_tokens':
+              usage.totalTokenCount ?? inputTokens + outputTokens,
+          });
+        }
+        if (lastResponse?.modelVersion) {
+          context.span.setAttribute(
+            'gen_ai.response.model',
+            lastResponse.modelVersion,
+          );
+        }
+        // Log streamed completion content as a span event
+        if (context.logPrompts && completionParts.length > 0) {
+          const responseText = completionParts.join('');
+          context.span.addEvent('gen_ai.content.completion', {
+            'gen_ai.completion':
+              responseText.length > 10_000
+                ? responseText.slice(0, 10_000) + '...[truncated]'
+                : responseText,
+          });
+        }
+        context.span.end();
+      }


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

wc -l packages/core/src/core/openaiContentGenerator/pipeline.ts

Repository: protoLabsAI/protoCLI

Length of output: 125

🏁 Script executed:

# First, look at the specific lines mentioned (266-300) sed -n '266,300p' packages/core/src/core/openaiContentGenerator/pipeline.ts | cat -n

Repository: protoLabsAI/protoCLI

Length of output: 1697

🏁 Script executed:

# Check lines 305-309 mentioned as "Also applies to" sed -n '305,309p' packages/core/src/core/openaiContentGenerator/pipeline.ts | cat -n

Repository: protoLabsAI/protoCLI

Length of output: 293

🏁 Script executed:

# Check lines 633-644 mentioned as "Also applies to" sed -n '633,644p' packages/core/src/core/openaiContentGenerator/pipeline.ts | cat -n

Repository: protoLabsAI/protoCLI

Length of output: 629

🏁 Script executed:

# Find the executeWithErrorHandling function rg -n "executeWithErrorHandling" packages/core/src/core/openaiContentGenerator/pipeline.ts -A 20 | head -50

Repository: protoLabsAI/protoCLI

Length of output: 1520

🏁 Script executed:

# Find the processStreamWithLogging function rg -n "processStreamWithLogging" packages/core/src/core/openaiContentGenerator/pipeline.ts -A 30 | head -60

Repository: protoLabsAI/protoCLI

Length of output: 2790

🏁 Script executed:

# Get the complete processStreamWithLogging function sed -n '125,300p' packages/core/src/core/openaiContentGenerator/pipeline.ts | cat -n

Repository: protoLabsAI/protoCLI

Length of output: 8606

🏁 Script executed:

# Also check what comes after line 300 to see error handling and finally blocks sed -n '290,320p' packages/core/src/core/openaiContentGenerator/pipeline.ts | cat -n

Repository: protoLabsAI/protoCLI

Length of output: 1376

🏁 Script executed:

# Get the full executeWithErrorHandling to see the complete flow sed -n '530,680p' packages/core/src/core/openaiContentGenerator/pipeline.ts | cat -n

Repository: protoLabsAI/protoCLI

Length of output: 5970

Move span cleanup to a finally block to ensure closure on early generator termination.

The span is marked with status OK at line 633 before the async generator is consumed, then passed to processStreamWithLogging via context.span. While normal completion (line 266) and error paths (line 305) both call span.end(), early termination—when the caller stops iterating without exhausting the stream—bypasses both paths, leaving the span open with a stale success status. Add a finally block in processStreamWithLogging to guarantee the span closes regardless of how iteration ends.

Also applies to: 305–309, 633–644

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@packages/core/src/core/openaiContentGenerator/pipeline.ts` around lines 266 - 300, The OTel span attached to context.span in processStreamWithLogging (used with lastResponse and completionParts) can remain open if the consumer stops iterating early; move all span cleanup (setting final attributes, adding events from completionParts, setting status) into a finally block so context.span.end() is always executed regardless of normal completion, error, or early generator termination, and ensure any existing span.setAttributes / addEvent calls run before ending the span inside that finally.

coderabbitai · 2026-04-02T06:14:30Z

+// Module-level state: the active turn span and its OTel context
+let activeTurnSpan: Span | undefined;
+let activeTurnContext: Context | undefined;


⚠️ Potential issue | 🟠 Major

Don't store the active turn span in a module-global singleton.

packages/core/src/core/client.ts, Lines 468-485 call startTurnSpan() for every user/cron turn. With activeTurnSpan and activeTurnContext as process-global state, two overlapping turns race: the later call steals the active context, and endTurnSpan() from either turn can close the other span. That cross-wires LLM, tool, and agent traces between requests. This needs per-execution storage (OTel context / AsyncLocalStorage) or explicit context threading instead of global mutable state.

Also applies to: 30-45

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@packages/core/src/telemetry/turnSpanContext.ts` around lines 21 - 23, The module-global mutable state activeTurnSpan and activeTurnContext in turnSpanContext.ts causes cross-request races when startTurnSpan() / endTurnSpan() are called concurrently; replace this process-global singleton with per-execution storage (e.g. Node's AsyncLocalStorage or explicit context objects threaded through calls) so each turn gets its own Span and OTel Context; update startTurnSpan(), endTurnSpan(), and any getters to read/write from the per-execution store (or accept/return a context object) instead of using activeTurnSpan/activeTurnContext to prevent one turn from stealing or ending another turn's span.

coderabbitai · 2026-04-02T06:14:30Z

+        // Fire and forget — execution runs independently.
+        // The span covers the full async lifetime: it ends when the
+        // background agent completes (or fails), not when execute() returns.
        subagent
          .execute(contextState, signal)
          .then(() => {


⚠️ Potential issue | 🟠 Major

Background agents still inherit the foreground abort signal.

This path is described as fire-and-forget work, but it still passes the request-scoped signal into subagent.execute(). If the originating turn is aborted after launch, the background agent is cancelled too, which breaks the independence contract for run_in_background. Give background jobs their own controller/shutdown signal instead.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@packages/core/src/tools/agent.ts` around lines 675 - 680, Background agents are currently passed the request-scoped signal (the call to subagent.execute(contextState, signal)), which causes background work to be cancelled when the originating request aborts; instead, create a new, independent AbortController for background jobs (e.g., const bgController = new AbortController()) and pass bgController.signal into subagent.execute(contextState, bgController.signal), ensuring you do not forward the request-scoped signal. Also wire bgController.abort() to whatever agent-level shutdown lifecycle you already have (so background tasks can be cancelled on explicit shutdown) and ensure you clean up the controller when the subagent finishes; update any places constructing the background job (the run_in_background / subagent.execute call) to use this independent controller.

coderabbitai · 2026-04-02T06:14:30Z

+proto supports [Langfuse](https://langfuse.com) tracing out of the box. Set three environment variables and every session is fully traced — LLM calls (all providers), tool executions, subagent lifecycles, and turn hierarchy.
+
+### Setup
+
+Add to `~/.proto/.env` or export in your shell:
+
+```
+LANGFUSE_PUBLIC_KEY=pk-lf-...
+LANGFUSE_SECRET_KEY=sk-lf-...
+LANGFUSE_BASE_URL=https://cloud.langfuse.com  # optional, this is the default
+```


⚠️ Potential issue | 🟡 Minor

Clarify the required Langfuse vars and add a language to the fence.

This says "Set three environment variables" even though LANGFUSE_BASE_URL is optional in the example below. Line 170 also trips MD040 because the fence has no language.

Suggested doc fix

-proto supports [Langfuse](https://langfuse.com) tracing out of the box. Set three environment variables and every session is fully traced — LLM calls (all providers), tool executions, subagent lifecycles, and turn hierarchy. +proto supports [Langfuse](https://langfuse.com) tracing out of the box. Set the two required environment variables below (and optionally `LANGFUSE_BASE_URL`) and every session is fully traced — LLM calls (all providers), tool executions, subagent lifecycles, and turn hierarchy. -``` +```bash LANGFUSE_PUBLIC_KEY=pk-lf-... LANGFUSE_SECRET_KEY=sk-lf-... LANGFUSE_BASE_URL=https://cloud.langfuse.com # optional, this is the default

🧰 Tools

🪛 markdownlint-cli2 (0.22.0)

[warning] 170-170: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@README.md` around lines 164 - 174, Clarify that only LANGFUSE_PUBLIC_KEY and LANGFUSE_SECRET_KEY are required (LANGFUSE_BASE_URL is optional) and update the code fence to specify the language; change the prose from "Set three environment variables" to "Set the following environment variables (BASE_URL is optional)" and add the bash language tag to the fenced block so it becomes ```bash with the three env vars LANGFUSE_PUBLIC_KEY, LANGFUSE_SECRET_KEY, and LANGFUSE_BASE_URL shown (noting BASE_URL is optional).

coderabbitai

♻️ Duplicate comments (1)

README.md (1)
164-174: ⚠️ Potential issue | 🟡 Minor

The past review comment remains unaddressed: clarify required vs. optional environment variables and add language to the code fence.

Line 164 still states "Set three environment variables" even though only LANGFUSE_PUBLIC_KEY and LANGFUSE_SECRET_KEY are required; LANGFUSE_BASE_URL is optional. Additionally, the code fence at line 170 still lacks a language specification, triggering the MD040 linting warning.
📝 Suggested fix
-proto supports [Langfuse](https://langfuse.com) tracing out of the box. Set three environment variables and every session is fully traced — LLM calls (all providers), tool executions, subagent lifecycles, and turn hierarchy.
+proto supports [Langfuse](https://langfuse.com) tracing out of the box. Set the required environment variables below (and optionally `LANGFUSE_BASE_URL`) and every session is fully traced — LLM calls (all providers), tool executions, subagent lifecycles, and turn hierarchy.

 ### Setup
 
 Add to `~/.proto/.env`:
 
-```
+```bash
 LANGFUSE_PUBLIC_KEY=pk-lf-...
 LANGFUSE_SECRET_KEY=sk-lf-...
 LANGFUSE_BASE_URL=https://cloud.langfuse.com  # optional, this is the default
As per coding guidelines, Markdown files should be formatted with Prettier, and standard Markdown linting (MD040) requires language tags on fenced code blocks.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@README.md` around lines 164 - 174, Update the README section so it correctly
states that only LANGFUSE_PUBLIC_KEY and LANGFUSE_SECRET_KEY are required while
LANGFUSE_BASE_URL is optional, change the phrase "Set three environment
variables" to reflect "Set the following environment variables (BASE_URL
optional)", and add a language tag to the fenced code block (e.g., ```bash) so
MD040 is satisfied; finally reformat the Markdown with Prettier to apply
linting/style fixes.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@README.md`:
- Around line 164-174: Update the README section so it correctly states that
only LANGFUSE_PUBLIC_KEY and LANGFUSE_SECRET_KEY are required while
LANGFUSE_BASE_URL is optional, change the phrase "Set three environment
variables" to reflect "Set the following environment variables (BASE_URL
optional)", and add a language tag to the fenced code block (e.g., ```bash) so
MD040 is satisfied; finally reformat the Markdown with Prettier to apply
linting/style fixes.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 8f1f6d1d-de09-4ec8-b959-33390745e573

📥 Commits

Reviewing files that changed from the base of the PR and between c35c157 and bf17b99.

⛔ Files ignored due to path filters (1)

package-lock.json is excluded by !**/package-lock.json

📒 Files selected for processing (1)

README.md

When a request has no tools defined (e.g. /btw using full conversation history) but the history contains tool call/result messages, Anthropic and LiteLLM routing to Anthropic reject the request with a 400. Strip tool-related messages from the history in that case rather than letting the upstream provider fail. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

coderabbitai

♻️ Duplicate comments (1)

packages/core/src/core/openaiContentGenerator/pipeline.ts (1)

266-300: ⚠️ Potential issue | 🟠 Major

Ensure deferred streaming spans always close, including early iterator cancellation.

context.span is still finalized only in success/error paths of the async generator, while status OK is set before the stream is fully consumed. If the caller stops iteration early, cleanup can be skipped and the span can remain open with stale state. Move span finalization into a finally in processStreamWithLogging and only mark OK after successful completion.

#!/bin/bash
set -euo pipefail

FILE="packages/core/src/core/openaiContentGenerator/pipeline.ts"

echo "== Check stream lifecycle code =="
rg -n "processStreamWithLogging|SpanStatusCode\\.OK|context\\.span\\.end\\(|finally" "$FILE" -C 3

echo
echo "== Inspect processStreamWithLogging body =="
sed -n '125,320p' "$FILE" | cat -n

echo
echo "== Inspect executeWithErrorHandling span status/end flow =="
sed -n '553,675p' "$FILE" | cat -n

Also applies to: 305-309, 656-665

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/core/openaiContentGenerator/pipeline.ts` around lines 266 -
300, The OTel span for streaming (context.span) can be left open if iteration is
cancelled early; update processStreamWithLogging so span finalization (setting
status and calling context.span.end()) is moved into a finally block that always
runs, and only set SpanStatusCode.OK after the stream was fully and successfully
consumed (i.e., set OK in the normal completion path, not before consumption).
Concretely: wrap the streaming/iteration logic inside try { ... } catch { set
error status/attributes } finally { if (context.span && !context.spanEnded) {
set any remaining attributes/usage from lastResponse, add completion event if
needed, and call context.span.end() } } — reference the functions/processors
processStreamWithLogging, executeWithErrorHandling, and the context.span usages
(setAttributes, setAttribute, addEvent, end) to locate where to move the logic
and guard against double-ending the span.

🧹 Nitpick comments (1)

packages/core/src/core/openaiContentGenerator/pipeline.ts (1)

131-132: Bound streamed completion buffering for telemetry to avoid unbounded memory growth.

completionParts collects the full response, but emitted telemetry is truncated to 10k chars. Cap accumulation during streaming to reduce memory pressure on long outputs.

♻️ Suggested change

-    const completionParts: string[] = [];
+    const completionParts: string[] = [];
+    let completionChars = 0;
+    const COMPLETION_LOG_LIMIT = 10_000;

@@
-        if (context.logPrompts && delta?.content) {
-          completionParts.push(delta.content);
+        if (context.logPrompts && delta?.content && completionChars < COMPLETION_LOG_LIMIT) {
+          const remaining = COMPLETION_LOG_LIMIT - completionChars;
+          const chunk = delta.content.slice(0, remaining);
+          completionParts.push(chunk);
+          completionChars += chunk.length;
         }

@@
-          const responseText = completionParts.join('');
+          const responseText = completionParts.join('');
           context.span.addEvent('gen_ai.content.completion', {
             'gen_ai.completion':
-              responseText.length > 10_000
-                ? responseText.slice(0, 10_000) + '...[truncated]'
+              responseText.length >= COMPLETION_LOG_LIMIT
+                ? responseText + '...[truncated]'
                 : responseText,
           });

Also applies to: 166-169, 290-297

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/core/openaiContentGenerator/pipeline.ts` around lines 131 -
132, The streaming telemetry buffer completionParts is unbounded; introduce a
MAX_TELEMETRY_CHARS (e.g. 10_000) constant and ensure all places that append to
completionParts (the collection initialized as const completionParts: string[]
and the other append sites referenced around the same streaming handlers at the
later blocks) only accumulate up to that limit — either stop appending once the
total joined length reaches MAX_TELEMETRY_CHARS or only append a truncated slice
of the incoming chunk so the buffer never grows beyond MAX_TELEMETRY_CHARS;
apply the same cap to the other streaming accumulation sites (the blocks noted
at the subsequent locations) so emitted telemetry is still truncated to 10k but
in-memory buffering is bounded.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@packages/core/src/core/openaiContentGenerator/pipeline.ts`:
- Around line 266-300: The OTel span for streaming (context.span) can be left
open if iteration is cancelled early; update processStreamWithLogging so span
finalization (setting status and calling context.span.end()) is moved into a
finally block that always runs, and only set SpanStatusCode.OK after the stream
was fully and successfully consumed (i.e., set OK in the normal completion path,
not before consumption). Concretely: wrap the streaming/iteration logic inside
try { ... } catch { set error status/attributes } finally { if (context.span &&
!context.spanEnded) { set any remaining attributes/usage from lastResponse, add
completion event if needed, and call context.span.end() } } — reference the
functions/processors processStreamWithLogging, executeWithErrorHandling, and the
context.span usages (setAttributes, setAttribute, addEvent, end) to locate where
to move the logic and guard against double-ending the span.

---

Nitpick comments:
In `@packages/core/src/core/openaiContentGenerator/pipeline.ts`:
- Around line 131-132: The streaming telemetry buffer completionParts is
unbounded; introduce a MAX_TELEMETRY_CHARS (e.g. 10_000) constant and ensure all
places that append to completionParts (the collection initialized as const
completionParts: string[] and the other append sites referenced around the same
streaming handlers at the later blocks) only accumulate up to that limit —
either stop appending once the total joined length reaches MAX_TELEMETRY_CHARS
or only append a truncated slice of the incoming chunk so the buffer never grows
beyond MAX_TELEMETRY_CHARS; apply the same cap to the other streaming
accumulation sites (the blocks noted at the subsequent locations) so emitted
telemetry is still truncated to 10k but in-memory buffering is bounded.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: d4765826-b49b-4d33-9c37-cbd4b45e1768

📥 Commits

Reviewing files that changed from the base of the PR and between bf17b99 and 69c29e7.

📒 Files selected for processing (1)

packages/core/src/core/openaiContentGenerator/pipeline.ts

… example shape to description (#177) Reported failure mode: smaller / older models (Qwen3, Minimax variants, some open-weights routes) JSON-encode the entire `questions` array as a string instead of emitting it as a native array literal. validateToolParams threw with "Parameter \"questions\" must be an array" — useless feedback since the model HAD sent the array, just stringified. Three changes, layered: 1. Silent coercion in validateToolParams. If `questions` is a string that parses as JSON, parse it and continue. Logs a debugLogger.warn so the signal stays visible — silent coercion would mask a real upstream regression if model behavior shifts. Catches ~all of the user-reported failures without a retry round-trip. 2. Example shape added to the tool description. Models replicate concrete examples better than they synthesize from abstract schemas with 3 levels of nesting. Placeholder text is clearly labeled as shape-only so models don't cargo-cult the example values into their actual questions. 3. Sharper error message for the residual case (non-JSON garbage in the string slot): "Pass `questions` as a real array literal, not a JSON-encoded string." Clear, specific, tells the model exactly which strategy to drop. Considered but rejected: - Schema relaxation (allow `options: string[]`, default multiSelect): API change, breaks downstream `option.description` consumers (dialog UI, ACP renderer), premature without data showing #1+#2 are insufficient. Tests updated with two new cases: stringified-array coercion, non-JSON string error path. 5319 core / 0 fail; lint + typecheck clean. Co-authored-by: Automaker <automaker@localhost> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

GitHub CI and others added 3 commits April 1, 2026 22:57

chore: ignore benchmark jobs output directory

89de5df

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

chore: add CodeRabbit config + branch protection on main

c35c157

- .coderabbit.yaml: chill profile, auto-review on PRs targeting main - main branch now requires Lint + CodeQL to pass before merge Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

GitHub CI and others added 2 commits April 1, 2026 23:07

chore: sync package-lock.json

7d238b1

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

docs: warn against project-level .env for Langfuse keys

bf17b99

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

coderabbitai Bot reviewed Apr 2, 2026

View reviewed changes

GitHub CI and others added 2 commits April 1, 2026 23:38

chore: remove stale eslint-disable comments

76bcf68

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

coderabbitai Bot reviewed Apr 2, 2026

View reviewed changes

mabry1985 and others added 2 commits April 1, 2026 23:50

Merge branch 'main' into feat/langfuse-tracing

0e04b94

update

2987424

mabry1985 merged commit fb0793e into main Apr 2, 2026
1 of 3 checks passed

mabry1985 deleted the feat/langfuse-tracing branch April 2, 2026 06:53

mabry1985 mentioned this pull request May 1, 2026

fix(ask_user_question): recover from stringified questions array, add example shape #177

Merged

6 tasks

protoquinn Bot mentioned this pull request May 26, 2026

Agentic loop terminates after the planning turn — agent emits intent then stops without executing tools #307

Closed

This was referenced Jun 1, 2026

Streaming requests aborted at ~5min → "Model stream ended with empty response text" (client transport timeout) #349

Closed

E2E test suite has been failing every run for ~1 week #226

Closed

Conversation

mabry1985 commented Apr 2, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Apr 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Estimated code review effort

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 2, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 2, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 2, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 2, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 2, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 2, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 2, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 2, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

mabry1985 commented Apr 2, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Apr 2, 2026 •

edited

Loading