follow-up(telemetry): Phase 1.5 polish — fallback order, exec span on abort-as-result, mock + log/span consistency

Follow-up to #4126 (Phase 1: unify span creation paths). PR #4126 is merged; these are non-blocking polish items found in the final review rounds.

## Items

### 1. `startLLMRequestSpan`/`startToolSpan` fallback order (Codex P2)

**Where:** `packages/core/src/telemetry/session-tracing.ts:204-206` (and parallel code in `startToolSpan`)

**Current behavior:** when `interactionContext.getStore()` is empty:
```ts
const ctx = parentCtx
  ? trace.setSpan(otelContext.active(), parentCtx.span)
  : (getSessionContext() ?? otelContext.active());
```

**Issue:** falls back to synthetic session root BEFORE checking if `otelContext.active()` already has an active span. If an LLM/tool call happens inside a tool span (subagent inside a tool, or any nested LLM-inside-tool path) but outside an interaction span, the new child span re-parents to session root instead of the active tool span — flattens the trace tree.

**Fix:** mirror `tracer.ts` `getParentContext()`: prefer the active context if it has a span, only fall back to session root otherwise.
```ts
if (parentCtx) {
  ctx = trace.setSpan(otelContext.active(), parentCtx.span);
} else {
  const active = otelContext.active();
  ctx = trace.getSpan(active) ? active : (getSessionContext() ?? active);
}
```

### 2. tool.execution span misreports cancellation as success when tool returns ToolResult on abort (Codex P2)

**Where:** `packages/core/src/core/coreToolScheduler.ts:2009-2011`

**Current code:**
```ts
const toolResult: ToolResult = await promise;
endToolExecutionSpan(execSpan, {
  success: toolResult.error === undefined,  // true when tool resolves cleanly on abort
});
if (signal.aborted) {
  // marks parent tool span as cancelled, but execSpan is already closed as success
  setToolSpanCancelled(span);
}
```

**Issue:** if a tool detects `signal.aborted` and resolves with `{ llmContent: 'cancelled' }` (no `error` field), the execution sub-span ends with `success: true` while the parent tool span ends as cancelled — inconsistent in trace.

**Fix:** include `!signal.aborted` in the metadata, or move `endToolExecutionSpan` after the abort check.

### 3. test mock missing `API_CALL_ABORTED_SPAN_STATUS_MESSAGE` (Codex P2)

**Where:** `packages/core/src/core/loggingContentGenerator/loggingContentGenerator.test.ts:109-111`

**Issue:** Production code now imports `API_CALL_ABORTED_SPAN_STATUS_MESSAGE` from `../../telemetry/tracer.js`, but this Vitest mock only defines `API_CALL_FAILED_SPAN_STATUS_MESSAGE`. Vitest doesn't crash on missing exports (returns \`undefined\`), so existing tests pass, but the abort branch (when exercised in tests) would pass \`undefined\` as the error message — coverage blind spot.

**Fix:** add `API_CALL_ABORTED_SPAN_STATUS_MESSAGE: 'API call aborted'` to the mock.

### 4. idle timeout vs api log inconsistency (PR #4126 review by @wenshao)

**Where:** `packages/core/src/core/loggingContentGenerator/loggingContentGenerator.ts:454`

**Issue:** When the 5-min stream idle timeout fires, it ends the LLM span with \`success: false\` / timeout. If the consumer later drains buffered chunks, `safelyLogApiResponse` still logs a "successful" API response. The span says "timed out / error" while the API log says "success" — contradictory signals during incident response.

**Fix:** When `spanEndedByTimeout` is true, also skip the success-log branch in the normal completion path.

## Out of scope (tracked elsewhere)

- `tool.blocked_on_user` span — Phase 2 in #3731
- `hook` span — Phase 2 in #3731
- Subagent trace tree — Phase 3 in #3731
- Detailed span content attributes — see PR #4097

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

follow-up(telemetry): Phase 1.5 polish — fallback order, exec span on abort-as-result, mock + log/span consistency #4212

Items

1. `startLLMRequestSpan`/`startToolSpan` fallback order (Codex P2)

2. tool.execution span misreports cancellation as success when tool returns ToolResult on abort (Codex P2)

3. test mock missing `API_CALL_ABORTED_SPAN_STATUS_MESSAGE` (Codex P2)

4. idle timeout vs api log inconsistency (PR #4126 review by @wenshao)

Out of scope (tracked elsewhere)

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

follow-up(telemetry): Phase 1.5 polish — fallback order, exec span on abort-as-result, mock + log/span consistency #4212

Description

Items

1. startLLMRequestSpan/startToolSpan fallback order (Codex P2)

2. tool.execution span misreports cancellation as success when tool returns ToolResult on abort (Codex P2)

3. test mock missing API_CALL_ABORTED_SPAN_STATUS_MESSAGE (Codex P2)

4. idle timeout vs api log inconsistency (PR #4126 review by @wenshao)

Out of scope (tracked elsewhere)

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions

1. `startLLMRequestSpan`/`startToolSpan` fallback order (Codex P2)

3. test mock missing `API_CALL_ABORTED_SPAN_STATUS_MESSAGE` (Codex P2)