Description
Running kbn-evals evaluation suites on main is currently broken due to two independent regressions. Both affect local and CI runs.
1. APM / OpenTelemetry tracing conflict
Introduced by: #258663 (OTel HTTP instrumentation)
After #258663, initTelemetry throws when Elastic APM and OpenTelemetry tracing are both active:
Error: Elastic APM and OpenTelemetry tracing cannot be enabled simultaneously.
To use OpenTelemetry tracing, disable APM by setting `elastic.apm.active: false` in your Kibana configuration.
The evals_tracing Scout config set enables OTel tracing but does not disable APM. Adding --elastic.apm.active=false as a CLI argument does not work because applyConfigOverrides stores CLI values as raw strings: 'false' (a string) does not pass the apmConfig.active !== false (boolean) strict-equality check in initTelemetry.
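The type mismatch can be illustrated with a minimal sketch (the config shape and variable names here are assumed for illustration, not Kibana's actual types):

```typescript
// Minimal sketch of the override problem; the config shape is assumed.
// applyConfigOverrides keeps CLI values as raw strings, so the guard in
// initTelemetry never sees a boolean `false`.
const apmConfig: { active: string | boolean } = {
  active: 'false', // what --elastic.apm.active=false becomes after CLI parsing
};

// Paraphrase of the guard: APM counts as enabled unless `active` is
// strictly the boolean `false`.
const apmEnabled = apmConfig.active !== false;

console.log(apmEnabled); // true — the string 'false' does not disable APM
```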
A secondary guard in ApmConfiguration.getConfigFromAllSources then requires contextPropagationOnly: false when APM is disabled, which also needs to be set explicitly.
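Given the two guards above, a workaround sketch in kibana.dev.yml would set both flags explicitly as YAML booleans (assuming no other config source overrides them):

```yaml
# kibana.dev.yml — workaround sketch; both values must be booleans here,
# not CLI strings, for the guards described above to pass.
elastic.apm.active: false
elastic.apm.contextPropagationOnly: false
```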
This affects:
- The Scout Kibana server (started via the evals_tracing config set)
- The Playwright worker process (which loads APM via require_init_apm.js and reads kibana.dev.yml directly)
2. Inference endpoint connector resolution error
Introduced by: #258530 (Consolidate LLM connector listing via inference plugin)
After #258530, getConnectorList() returns inference endpoint IDs (e.g. .anthropic-claude-4.6-opus-chat_completion) instead of Kibana stack connector keys (e.g. elastic-llm-claude-46-opus) for .inference-type connectors. kbn-evals passes the stack connector key to the inference API, which resolves it to the inference endpoint ID via getConnectorById. However, the execution path in resolveAndCreatePipeline uses endpointIdCache.has(connectorId) to choose the executor: since the stack connector key is not in the cache, it falls through to the actions-based executor (createInferenceExecutor), which tries to load the inference endpoint ID as a Kibana saved object:
AxiosError: Saved object [action/.anthropic-claude-4.6-opus-chat_completion] not found
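The dispatch described above can be paraphrased in a minimal sketch (the function shape and return labels are illustrative, not the actual kbn-evals code; only the cache-membership check comes from the issue):

```typescript
// Illustrative paraphrase of the executor choice in resolveAndCreatePipeline;
// function signature and string labels are assumed for this sketch.
function pickExecutor(connectorId: string, endpointIdCache: Set<string>): string {
  if (endpointIdCache.has(connectorId)) {
    // Hit: the ID is already a known inference endpoint ID.
    return 'inference-endpoint executor';
  }
  // Miss: a stack connector key is never in the cache, so execution falls
  // through to the actions-based executor, which later treats the resolved
  // endpoint ID as a saved-object action ID and fails with
  // "Saved object [...] not found".
  return 'actions-based executor (createInferenceExecutor)';
}

// The cache holds endpoint IDs, but kbn-evals passes the stack connector key:
const endpointIdCache = new Set(['.anthropic-claude-4.6-opus-chat_completion']);
console.log(pickExecutor('elastic-llm-claude-46-opus', endpointIdCache));
// → 'actions-based executor (createInferenceExecutor)'
```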
Reproduction
# Ensure EIS connectors are configured, then:
node scripts/evals start --suite streams/significant-events \
--project eis-anthropic-claude-4-6-opus \
--judge eis-google-gemini-3-1-pro
Both errors surface immediately: the tracing conflict crashes Kibana and the Playwright worker on startup, and the connector error fails on the first inference call.
Expected behavior
Evals should run end-to-end against EIS-backed models with OTel tracing enabled, both locally and in CI.