Follow-up to #4126 (Phase 1: unify span creation paths). PR #4126 is merged; these are non-blocking polish items found in the final review rounds.
Items
1. startLLMRequestSpan/startToolSpan fallback order (Codex P2)
Where: packages/core/src/telemetry/session-tracing.ts:204-206 (and parallel code in startToolSpan)
Current behavior: when interactionContext.getStore() is empty:
const ctx = parentCtx
? trace.setSpan(otelContext.active(), parentCtx.span)
: (getSessionContext() ?? otelContext.active());
Issue: falls back to synthetic session root BEFORE checking if otelContext.active() already has an active span. If an LLM/tool call happens inside a tool span (subagent inside a tool, or any nested LLM-inside-tool path) but outside an interaction span, the new child span re-parents to session root instead of the active tool span — flattens the trace tree.
Fix: mirror tracer.ts getParentContext(): prefer the active context if it has a span, only fall back to session root otherwise.
if (parentCtx) {
ctx = trace.setSpan(otelContext.active(), parentCtx.span);
} else {
const active = otelContext.active();
ctx = trace.getSpan(active) ? active : (getSessionContext() ?? active);
}
2. tool.execution span misreports cancellation as success when tool returns ToolResult on abort (Codex P2)
Where: packages/core/src/core/coreToolScheduler.ts:2009-2011
Current code:
const toolResult: ToolResult = await promise;
endToolExecutionSpan(execSpan, {
success: toolResult.error === undefined, // true when tool resolves cleanly on abort
});
if (signal.aborted) {
// marks parent tool span as cancelled, but execSpan is already closed as success
setToolSpanCancelled(span);
}
Issue: if a tool detects signal.aborted and resolves with { llmContent: 'cancelled' } (no error field), the execution sub-span ends with success: true while the parent tool span ends as cancelled — inconsistent in trace.
Fix: include !signal.aborted in the metadata, or move endToolExecutionSpan after the abort check.
3. test mock missing API_CALL_ABORTED_SPAN_STATUS_MESSAGE (Codex P2)
Where: packages/core/src/core/loggingContentGenerator/loggingContentGenerator.test.ts:109-111
Issue: Production code now imports API_CALL_ABORTED_SPAN_STATUS_MESSAGE from ../../telemetry/tracer.js, but this Vitest mock only defines API_CALL_FAILED_SPAN_STATUS_MESSAGE. Vitest doesn't crash on missing exports (returns `undefined`), so existing tests pass, but the abort branch (when exercised in tests) would pass `undefined` as the error message — coverage blind spot.
Fix: add API_CALL_ABORTED_SPAN_STATUS_MESSAGE: 'API call aborted' to the mock.
4. idle timeout vs api log inconsistency (PR #4126 review by @wenshao)
Where: packages/core/src/core/loggingContentGenerator/loggingContentGenerator.ts:454
Issue: When the 5-min stream idle timeout fires, it ends the LLM span with `success: false` / timeout. If the consumer later drains buffered chunks, safelyLogApiResponse still logs a "successful" API response. The span says "timed out / error" while the API log says "success" — contradictory signals during incident response.
Fix: When spanEndedByTimeout is true, also skip the success-log branch in the normal completion path.
Out of scope (tracked elsewhere)
Follow-up to #4126 (Phase 1: unify span creation paths). PR #4126 is merged; these are non-blocking polish items found in the final review rounds.
Items
1.
startLLMRequestSpan/startToolSpanfallback order (Codex P2)Where:
packages/core/src/telemetry/session-tracing.ts:204-206(and parallel code instartToolSpan)Current behavior: when
interactionContext.getStore()is empty:Issue: falls back to synthetic session root BEFORE checking if
otelContext.active()already has an active span. If an LLM/tool call happens inside a tool span (subagent inside a tool, or any nested LLM-inside-tool path) but outside an interaction span, the new child span re-parents to session root instead of the active tool span — flattens the trace tree.Fix: mirror
tracer.tsgetParentContext(): prefer the active context if it has a span, only fall back to session root otherwise.2. tool.execution span misreports cancellation as success when tool returns ToolResult on abort (Codex P2)
Where:
packages/core/src/core/coreToolScheduler.ts:2009-2011Current code:
Issue: if a tool detects
signal.abortedand resolves with{ llmContent: 'cancelled' }(noerrorfield), the execution sub-span ends withsuccess: truewhile the parent tool span ends as cancelled — inconsistent in trace.Fix: include
!signal.abortedin the metadata, or moveendToolExecutionSpanafter the abort check.3. test mock missing
API_CALL_ABORTED_SPAN_STATUS_MESSAGE(Codex P2)Where:
packages/core/src/core/loggingContentGenerator/loggingContentGenerator.test.ts:109-111Issue: Production code now imports
API_CALL_ABORTED_SPAN_STATUS_MESSAGEfrom../../telemetry/tracer.js, but this Vitest mock only definesAPI_CALL_FAILED_SPAN_STATUS_MESSAGE. Vitest doesn't crash on missing exports (returns `undefined`), so existing tests pass, but the abort branch (when exercised in tests) would pass `undefined` as the error message — coverage blind spot.Fix: add
API_CALL_ABORTED_SPAN_STATUS_MESSAGE: 'API call aborted'to the mock.4. idle timeout vs api log inconsistency (PR #4126 review by @wenshao)
Where:
packages/core/src/core/loggingContentGenerator/loggingContentGenerator.ts:454Issue: When the 5-min stream idle timeout fires, it ends the LLM span with `success: false` / timeout. If the consumer later drains buffered chunks,
safelyLogApiResponsestill logs a "successful" API response. The span says "timed out / error" while the API log says "success" — contradictory signals during incident response.Fix: When
spanEndedByTimeoutis true, also skip the success-log branch in the normal completion path.Out of scope (tracked elsewhere)
tool.blocked_on_userspan — Phase 2 in feat(telemetry): harden OpenTelemetry configuration, HTTP OTLP behavior, and runtime safety #3731hookspan — Phase 2 in feat(telemetry): harden OpenTelemetry configuration, HTTP OTLP behavior, and runtime safety #3731