Skip to content

follow-up(telemetry): Phase 1.5 polish — fallback order, exec span on abort-as-result, mock + log/span consistency #4212

@doudouOUC

Description

@doudouOUC

Follow-up to #4126 (Phase 1: unify span creation paths). PR #4126 is merged; these are non-blocking polish items found in the final review rounds.

Items

1. startLLMRequestSpan/startToolSpan fallback order (Codex P2)

Where: packages/core/src/telemetry/session-tracing.ts:204-206 (and parallel code in startToolSpan)

Current behavior: when interactionContext.getStore() is empty:

const ctx = parentCtx
  ? trace.setSpan(otelContext.active(), parentCtx.span)
  : (getSessionContext() ?? otelContext.active());

Issue: falls back to synthetic session root BEFORE checking if otelContext.active() already has an active span. If an LLM/tool call happens inside a tool span (subagent inside a tool, or any nested LLM-inside-tool path) but outside an interaction span, the new child span re-parents to session root instead of the active tool span — flattens the trace tree.

Fix: mirror tracer.ts getParentContext(): prefer the active context if it has a span, only fall back to session root otherwise.

if (parentCtx) {
  ctx = trace.setSpan(otelContext.active(), parentCtx.span);
} else {
  const active = otelContext.active();
  ctx = trace.getSpan(active) ? active : (getSessionContext() ?? active);
}

2. tool.execution span misreports cancellation as success when tool returns ToolResult on abort (Codex P2)

Where: packages/core/src/core/coreToolScheduler.ts:2009-2011

Current code:

const toolResult: ToolResult = await promise;
endToolExecutionSpan(execSpan, {
  success: toolResult.error === undefined,  // true when tool resolves cleanly on abort
});
if (signal.aborted) {
  // marks parent tool span as cancelled, but execSpan is already closed as success
  setToolSpanCancelled(span);
}

Issue: if a tool detects signal.aborted and resolves with { llmContent: 'cancelled' } (no error field), the execution sub-span ends with success: true while the parent tool span ends as cancelled — inconsistent in trace.

Fix: include !signal.aborted in the metadata, or move endToolExecutionSpan after the abort check.

3. test mock missing API_CALL_ABORTED_SPAN_STATUS_MESSAGE (Codex P2)

Where: packages/core/src/core/loggingContentGenerator/loggingContentGenerator.test.ts:109-111

Issue: Production code now imports API_CALL_ABORTED_SPAN_STATUS_MESSAGE from ../../telemetry/tracer.js, but this Vitest mock only defines API_CALL_FAILED_SPAN_STATUS_MESSAGE. Vitest doesn't crash on missing exports (returns `undefined`), so existing tests pass, but the abort branch (when exercised in tests) would pass `undefined` as the error message — coverage blind spot.

Fix: add API_CALL_ABORTED_SPAN_STATUS_MESSAGE: 'API call aborted' to the mock.

4. idle timeout vs api log inconsistency (PR #4126 review by @wenshao)

Where: packages/core/src/core/loggingContentGenerator/loggingContentGenerator.ts:454

Issue: When the 5-min stream idle timeout fires, it ends the LLM span with `success: false` / timeout. If the consumer later drains buffered chunks, safelyLogApiResponse still logs a "successful" API response. The span says "timed out / error" while the API log says "success" — contradictory signals during incident response.

Fix: When spanEndedByTimeout is true, also skip the success-log branch in the normal completion path.

Out of scope (tracked elsewhere)

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions