[kbn-evals] Evals broken on main: APM/OTel tracing conflict and inference endpoint connector resolution #259472

@viduni94

Description

Running kbn-evals evaluation suites on main is currently broken due to two independent regressions. Both affect local and CI runs.

1. APM / OpenTelemetry tracing conflict

Introduced by: #258663 (OTel HTTP instrumentation)

After #258663, initTelemetry throws when Elastic APM and OpenTelemetry tracing are both active:

Error: Elastic APM and OpenTelemetry tracing cannot be enabled simultaneously.
To use OpenTelemetry tracing, disable APM by setting `elastic.apm.active: false` in your Kibana configuration.

The evals_tracing Scout config set enables OTel tracing but doesn't disable APM. Adding --elastic.apm.active=false as a CLI argument doesn't work either, because applyConfigOverrides stores CLI values as raw strings: the string 'false' fails the strict-equality check apmConfig.active !== false (a comparison against the boolean false) in initTelemetry.

A secondary guard in ApmConfiguration.getConfigFromAllSources then requires contextPropagationOnly: false when APM is disabled, which also needs to be set explicitly.
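Both settings therefore have to go into the YAML config rather than onto the CLI. A sketch of the relevant kibana.dev.yml fragment, with key names inferred from the error message and the guard described above:

```yaml
# kibana.dev.yml: disable APM explicitly so OTel tracing can initialize
elastic.apm.active: false
elastic.apm.contextPropagationOnly: false
```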

This affects:

  • The Scout Kibana server (started via evals_tracing config set)
  • The Playwright worker process (which loads APM via require_init_apm.js and reads kibana.dev.yml directly)

2. Inference endpoint connector resolution error

Introduced by: #258530 (Consolidate LLM connector listing via inference plugin)

After #258530, getConnectorList() returns inference endpoint IDs (e.g. .anthropic-claude-4.6-opus-chat_completion) instead of Kibana stack connector keys (e.g. elastic-llm-claude-46-opus) for .inference-type connectors. kbn-evals passes the stack connector key to the inference API, which resolves it to the inference endpoint ID via getConnectorById. However, the execution path in resolveAndCreatePipeline uses endpointIdCache.has(connectorId) to choose an executor. Since the stack connector key is not in the cache, execution falls through to the actions-based executor (createInferenceExecutor), which tries to load the inference endpoint ID as a Kibana saved object:

AxiosError: Saved object [action/.anthropic-claude-4.6-opus-chat_completion] not found
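The dispatch described above can be sketched as follows (endpointIdCache, resolveAndCreatePipeline, and the executor names come from the issue; the surrounding shapes are hypothetical):

```typescript
// Hypothetical sketch of the executor dispatch in resolveAndCreatePipeline.
// The cache is keyed by inference endpoint IDs, not stack connector keys.
const endpointIdCache = new Set<string>([
  '.anthropic-claude-4.6-opus-chat_completion', // inference endpoint ID
]);

function pickExecutor(connectorId: string): 'inference-endpoint' | 'actions' {
  // kbn-evals passes the stack connector key (e.g. 'elastic-llm-claude-46-opus'),
  // which is never in the cache, so dispatch falls through to the actions executor.
  return endpointIdCache.has(connectorId) ? 'inference-endpoint' : 'actions';
}

console.log(pickExecutor('elastic-llm-claude-46-opus')); // 'actions' — the wrong path
console.log(pickExecutor('.anthropic-claude-4.6-opus-chat_completion')); // 'inference-endpoint'
```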

Reproduction

# Ensure EIS connectors are configured, then:
node scripts/evals start --suite streams/significant-events \
  --project eis-anthropic-claude-4-6-opus \
  --judge eis-google-gemini-3-1-pro

Both errors surface immediately: the tracing conflict crashes the Kibana server and the Playwright worker on startup, and the connector error fails the first inference call.

Expected behavior

Evals should run end-to-end against EIS-backed models with OTel tracing enabled, both locally and in CI.

Labels

Team:obs-ai (Observability AI team), kbn-evals (Issue related to the work on Kibana's LLM evaluation framework)

Type

Bug
