[SigEvents][Evals] Rename terminology for KI features and KI queries#258361
[SigEvents][Evals] Rename terminology for KI features and KI queries#258361viduni94 merged 4 commits intoelastic:mainfrom
Conversation
|
/ci |
📝 WalkthroughWalkthroughThis pull request systematically renames terminology across the significant events evaluation suite: "KI extraction" becomes "KI feature extraction," "KI duplication" becomes "KI feature duplication," and "rule generation" becomes "KI query generation." The refactor updates type names, function signatures, constants, variable names, file paths, and documentation throughout the codebase while preserving underlying logic. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Warning Tools execution failed with the following error: Failed to run tools: 13 INTERNAL: Received RST_STREAM with code 2 (Internal server error) Comment |
| import type { BoundInferenceClient } from '@kbn/inference-common'; | ||
| import { evaluate } from '../../../src/evaluate'; | ||
| import { KI_DUPLICATION_DATASETS } from './ki_duplication_datasets'; | ||
| import { KI_FEATURE_DUPLICATION_DATASETS } from './ki_feature_duplication_datasets'; |
There was a problem hiding this comment.
Side node: Is this evaluating duplication or deduplication ;-)
There was a problem hiding this comment.
It's evaluating duplication (by scoring uniqueness)
Correct me if I'm wrong @klacabane
There was a problem hiding this comment.
Yes we're evaluating duplication :)
| } from './sigevents_ki_features_index'; | ||
|
|
||
| describe('sigevents_kis_index', () => { | ||
| describe('sigevents_ki_features_index', () => { |
There was a problem hiding this comment.
Isn't this the streams_ki_features_index
There was a problem hiding this comment.
In here's it's not referring to the streams_ki_features_index.
It's referring to the snapshotable copy that the Significant Events snapshot workflow writes so it can be included in ES snapshots: sigevents-streams-features-<scenario>
ruflin
left a comment
There was a problem hiding this comment.
LGTM. Did skim through it and looks aligned.
I left some nits, don't hold back on this from getting it merged. We can keep iterating on it, it is not critical. Important is that the basics are aligned.
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
x-pack/platform/packages/shared/kbn-evals-suite-streams/src/evaluators/ki_query_generation_evaluators.ts (1)
319-321:⚠️ Potential issue | 🟡 MinorResidual “generated rules” wording should be renamed to “generated queries.”
The success explanation still uses old terminology, which can confuse users and break string-based expectations around the KI query rename.
💡 Suggested patch
- : `All ${queries.length} generated rules passed code validation`, + : `All ${queries.length} generated queries passed code validation`,🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@x-pack/platform/packages/shared/kbn-evals-suite-streams/src/evaluators/ki_query_generation_evaluators.ts` around lines 319 - 321, The success message still uses the old phrase "generated rules"; update the string in the ternary expression that builds the explanation (the branch that currently returns `All ${queries.length} generated rules passed code validation`) to use "generated queries" instead. Locate the construction that references issues, score, and queries (the expression using issues.join and `All ${queries.length} ...`) in the KI query generation evaluator and replace the wording accordingly, and run/update any tests or consumers that assert on that exact message literal.
🧹 Nitpick comments (1)
x-pack/platform/packages/shared/kbn-evals-suite-streams/src/evaluators/ki_feature_extraction_evaluators.test.ts (1)
134-134: Minor inconsistency: describe block name doesn't match new terminology.The describe block says
'ki count evaluator'but the evaluator is now named'ki_feature_count'. Consider updating for consistency:-describe('ki count evaluator', () => { +describe('ki feature count evaluator', () => {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@x-pack/platform/packages/shared/kbn-evals-suite-streams/src/evaluators/ki_feature_extraction_evaluators.test.ts` at line 134, Update the test suite description to match the evaluator's current name: change the describe block string from 'ki count evaluator' to 'ki_feature_count evaluator' (or similar consistent phrasing) so it aligns with the evaluator identifier ki_feature_count used in the tests.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In
`@x-pack/platform/packages/shared/kbn-evals-suite-streams/src/evaluators/ki_query_generation_evaluators.ts`:
- Around line 319-321: The success message still uses the old phrase "generated
rules"; update the string in the ternary expression that builds the explanation
(the branch that currently returns `All ${queries.length} generated rules passed
code validation`) to use "generated queries" instead. Locate the construction
that references issues, score, and queries (the expression using issues.join and
`All ${queries.length} ...`) in the KI query generation evaluator and replace
the wording accordingly, and run/update any tests or consumers that assert on
that exact message literal.
---
Nitpick comments:
In
`@x-pack/platform/packages/shared/kbn-evals-suite-streams/src/evaluators/ki_feature_extraction_evaluators.test.ts`:
- Line 134: Update the test suite description to match the evaluator's current
name: change the describe block string from 'ki count evaluator' to
'ki_feature_count evaluator' (or similar consistent phrasing) so it aligns with
the evaluator identifier ki_feature_count used in the tests.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
Run ID: 65860eeb-44b2-428f-a124-df431f39f97f
📒 Files selected for processing (27)
x-pack/platform/packages/shared/kbn-evals-suite-streams/evals/significant_events/README.mdx-pack/platform/packages/shared/kbn-evals-suite-streams/evals/significant_events/datasets/index.tsx-pack/platform/packages/shared/kbn-evals-suite-streams/evals/significant_events/datasets/otel_demo.tsx-pack/platform/packages/shared/kbn-evals-suite-streams/evals/significant_events/datasets/types.tsx-pack/platform/packages/shared/kbn-evals-suite-streams/evals/significant_events/ki_feature_duplication/ki_feature_duplication.spec.tsx-pack/platform/packages/shared/kbn-evals-suite-streams/evals/significant_events/ki_feature_duplication/ki_feature_duplication_datasets.tsx-pack/platform/packages/shared/kbn-evals-suite-streams/evals/significant_events/ki_feature_extraction/collect_sample_documents.tsx-pack/platform/packages/shared/kbn-evals-suite-streams/evals/significant_events/ki_feature_extraction/ki_feature_extraction.spec.tsx-pack/platform/packages/shared/kbn-evals-suite-streams/evals/significant_events/ki_query_generation/extract_log_text.tsx-pack/platform/packages/shared/kbn-evals-suite-streams/evals/significant_events/ki_query_generation/get_computed_ki_features_from_docs.tsx-pack/platform/packages/shared/kbn-evals-suite-streams/evals/significant_events/ki_query_generation/ki_query_generation.spec.tsx-pack/platform/packages/shared/kbn-evals-suite-streams/evals/significant_events/ki_query_generation/resolve_ki_sources.tsx-pack/platform/packages/shared/kbn-evals-suite-streams/scripts/significant_events_snapshots/lib/gcs.tsx-pack/platform/packages/shared/kbn-evals-suite-streams/scripts/significant_events_snapshots/lib/significant_events_workflow.tsx-pack/platform/packages/shared/kbn-evals-suite-streams/src/data_generators/canonical_ki_features.test.tsx-pack/platform/packages/shared/kbn-evals-suite-streams/src/data_generators/canonical_ki_features.tsx-pack/platform/packages/shared/kbn-evals-suite-streams/src/data_generators/canonical_kis.test.tsx-pack/platform/packages/shared/kbn-evals-suite-streams/src/data_generators/load_ki_features_from_snapshot.test.tsx-pack/platform/packages/shared/kbn-evals-suite-streams/src/data_generators/load_ki_features_from_snapshot.tsx-pack/platform/packages/shared/kbn-evals-suite-streams/src/data_generators/replay.tsx-pack/platform/packages/shared/kbn-evals-suite-streams/src/data_generators/sigevents_ki_features_index.test.tsx-pack/platform/packages/shared/kbn-evals-suite-streams/src/data_generators/sigevents_ki_features_index.tsx-pack/platform/packages/shared/kbn-evals-suite-streams/src/evaluators/ki_feature_duplication_evaluators.tsx-pack/platform/packages/shared/kbn-evals-suite-streams/src/evaluators/ki_feature_extraction_evaluators.test.tsx-pack/platform/packages/shared/kbn-evals-suite-streams/src/evaluators/ki_feature_extraction_evaluators.tsx-pack/platform/packages/shared/kbn-evals-suite-streams/src/evaluators/ki_query_generation_evaluators.test.tsx-pack/platform/packages/shared/kbn-evals-suite-streams/src/evaluators/ki_query_generation_evaluators.ts
💤 Files with no reviewable changes (1)
- x-pack/platform/packages/shared/kbn-evals-suite-streams/src/data_generators/canonical_kis.test.ts
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (4)
x-pack/platform/packages/shared/kbn-evals-suite-streams/src/evaluators/ki_query_generation_evaluators.test.ts (1)
30-33: Avoid the non-null assertion when selecting the evaluator.Line 33 uses
!, which makes failures less explicit if the evaluator name drifts. Prefer an explicit guard with a clear error.Proposed fix
const getKIQueryGenerationCodeEvaluator = (esClient: ElasticsearchClient) => - createKIQueryGenerationEvaluators(esClient).find( - (evaluator) => evaluator.name === 'ki_query_generation_code_evaluator' - )!; + { + const evaluator = createKIQueryGenerationEvaluators(esClient).find( + (item) => item.name === 'ki_query_generation_code_evaluator' + ); + if (!evaluator) { + throw new Error('ki_query_generation_code_evaluator was not registered'); + } + return evaluator; + };As per coding guidelines: "Avoid non-null assertions (
!) unless locally justified in TypeScript."🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@x-pack/platform/packages/shared/kbn-evals-suite-streams/src/evaluators/ki_query_generation_evaluators.test.ts` around lines 30 - 33, The test uses a non-null assertion when selecting the evaluator which hides failures if the evaluator list changes; update getKIQueryGenerationCodeEvaluator to explicitly check the result of createKIQueryGenerationEvaluators(esClient).find(...) (looking for evaluator.name === 'ki_query_generation_code_evaluator') and throw a clear error (e.g., new Error with message identifying 'ki_query_generation_code_evaluator' not found) when the evaluator is undefined so failures are explicit and informative.x-pack/platform/packages/shared/kbn-evals-suite-streams/src/evaluators/ki_feature_extraction_evaluators.test.ts (1)
12-17: Create evaluators once and reuse for both lookups.You can avoid duplicate factory calls by creating one evaluator list and reusing it for
find().♻️ Suggested refactor
-const evidenceGroundingEvaluator = createKIFeatureExtractionEvaluators().find( +const kiFeatureExtractionEvaluators = createKIFeatureExtractionEvaluators(); + +const evidenceGroundingEvaluator = kiFeatureExtractionEvaluators.find( (evaluator) => evaluator.name === 'evidence_grounding' ); -const kiFeatureCountEvaluator = createKIFeatureExtractionEvaluators().find( +const kiFeatureCountEvaluator = kiFeatureExtractionEvaluators.find( (evaluator) => evaluator.name === 'ki_feature_count' );🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@x-pack/platform/packages/shared/kbn-evals-suite-streams/src/evaluators/ki_feature_extraction_evaluators.test.ts` around lines 12 - 17, The test calls createKIFeatureExtractionEvaluators() twice; instead, call it once, assign the returned array to a variable (e.g., evaluators), and then reuse that array for the two find() lookups to obtain evidenceGroundingEvaluator and kiFeatureCountEvaluator; update references to use the single evaluators variable and remove the duplicate factory invocation in createKIFeatureExtractionEvaluators().x-pack/platform/packages/shared/kbn-evals-suite-streams/src/data_generators/load_ki_features_from_snapshot.test.ts (1)
98-123: Assert the source snapshot index in this happy-path test.This only pins the temp rename target today. Because
allowNoMatches: true, a badgetSigeventsSnapshotKIFeaturesIndex()rename can quietly degrade into an empty restore and this test would still pass. Please also assert theindicesargument passed torestoreSnapshot.✅ Suggested test tightening
+import { getSigeventsSnapshotKIFeaturesIndex } from './sigevents_ki_features_index'; ... expect(mockRestoreSnapshot).toHaveBeenCalledWith( expect.objectContaining({ snapshotName: 'payment-unreachable', + indices: [getSigeventsSnapshotKIFeaturesIndex('payment-unreachable')], renamePattern: '(.+)', renameReplacement: KI_FEATURES_TEMP_INDEX, allowNoMatches: true, }) );🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@x-pack/platform/packages/shared/kbn-evals-suite-streams/src/data_generators/load_ki_features_from_snapshot.test.ts` around lines 98 - 123, The test currently only asserts the rename target for restoreSnapshot, leaving the source snapshot index unchecked and allowing misnamed source indices to slip by; add an assertion on mockRestoreSnapshot to include the indices field (the source snapshot index) — either compute the expected value by calling getSigeventsSnapshotKIFeaturesIndex('payment-unreachable') (or importing the source-index constant used by loadKIFeaturesFromSnapshot) and then assert expect(mockRestoreSnapshot).toHaveBeenCalledWith(expect.objectContaining({ indices: expectedIndex })), ensuring the restore call uses the correct source index.x-pack/platform/packages/shared/kbn-evals-suite-streams/src/evaluators/ki_feature_extraction_evaluators.ts (1)
449-476: Add an explicit return type to this exported factory.This public helper currently relies on a fairly wide inferred return type across the
basepath and the scenario-criteria path. Pinning it will make future refactors fail at the declaration instead of widening the evaluator API silently.As per coding guidelines, "Prefer explicit return types for public APIs and exported functions in TypeScript."✍️ Suggested tightening
export const createKIFeatureExtractionEvaluators = (scenarioCriteria?: { criteriaFn: (criteria: EvaluationCriterion[]) => Evaluator; criteria: EvaluationCriterion[]; -}) => { +}): ReadonlyArray<KIFeatureExtractionEvaluator> => {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@x-pack/platform/packages/shared/kbn-evals-suite-streams/src/evaluators/ki_feature_extraction_evaluators.ts` around lines 449 - 476, The exported factory createKIFeatureExtractionEvaluators currently relies on inference across the base (selectEvaluators) and scenario branch (createScenarioCriteriaLlmEvaluator) paths; add an explicit return type to the function signature (for example Evaluator[] or the more specific Evaluator<KIFeatureExtractionEvaluationExample, KIFeatureExtractionOutput>[] if you want to constrain elements) so the public API is pinned — update the signature of createKIFeatureExtractionEvaluators to include that return type and import the Evaluator type if necessary.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@x-pack/platform/packages/shared/kbn-evals-suite-streams/src/data_generators/load_ki_features_from_snapshot.ts`:
- Around line 16-17: The code uses a single global constant
KI_FEATURES_TEMP_INDEX ('sigevents-replay-temp-features') causing collisions
across concurrent runs; change to generate a unique temp index per invocation
(e.g., append a UUID or timestamp) and pass that generated name into the
restore, search, and cleanup flows instead of the global KI_FEATURES_TEMP_INDEX
constant—update all places that reference KI_FEATURES_TEMP_INDEX (the restore
call, search/scroll logic, and delete/cleanup logic mentioned in the comment
ranges) to accept and use a per-call tempIndex parameter so each run restores,
queries, and deletes its own index while keeping KI_FEATURES_SEARCH_LIMIT
unchanged.
---
Nitpick comments:
In
`@x-pack/platform/packages/shared/kbn-evals-suite-streams/src/data_generators/load_ki_features_from_snapshot.test.ts`:
- Around line 98-123: The test currently only asserts the rename target for
restoreSnapshot, leaving the source snapshot index unchecked and allowing
misnamed source indices to slip by; add an assertion on mockRestoreSnapshot to
include the indices field (the source snapshot index) — either compute the
expected value by calling
getSigeventsSnapshotKIFeaturesIndex('payment-unreachable') (or importing the
source-index constant used by loadKIFeaturesFromSnapshot) and then assert
expect(mockRestoreSnapshot).toHaveBeenCalledWith(expect.objectContaining({
indices: expectedIndex })), ensuring the restore call uses the correct source
index.
In
`@x-pack/platform/packages/shared/kbn-evals-suite-streams/src/evaluators/ki_feature_extraction_evaluators.test.ts`:
- Around line 12-17: The test calls createKIFeatureExtractionEvaluators() twice;
instead, call it once, assign the returned array to a variable (e.g.,
evaluators), and then reuse that array for the two find() lookups to obtain
evidenceGroundingEvaluator and kiFeatureCountEvaluator; update references to use
the single evaluators variable and remove the duplicate factory invocation in
createKIFeatureExtractionEvaluators().
In
`@x-pack/platform/packages/shared/kbn-evals-suite-streams/src/evaluators/ki_feature_extraction_evaluators.ts`:
- Around line 449-476: The exported factory createKIFeatureExtractionEvaluators
currently relies on inference across the base (selectEvaluators) and scenario
branch (createScenarioCriteriaLlmEvaluator) paths; add an explicit return type
to the function signature (for example Evaluator[] or the more specific
Evaluator<KIFeatureExtractionEvaluationExample, KIFeatureExtractionOutput>[] if
you want to constrain elements) so the public API is pinned — update the
signature of createKIFeatureExtractionEvaluators to include that return type and
import the Evaluator type if necessary.
In
`@x-pack/platform/packages/shared/kbn-evals-suite-streams/src/evaluators/ki_query_generation_evaluators.test.ts`:
- Around line 30-33: The test uses a non-null assertion when selecting the
evaluator which hides failures if the evaluator list changes; update
getKIQueryGenerationCodeEvaluator to explicitly check the result of
createKIQueryGenerationEvaluators(esClient).find(...) (looking for
evaluator.name === 'ki_query_generation_code_evaluator') and throw a clear error
(e.g., new Error with message identifying 'ki_query_generation_code_evaluator'
not found) when the evaluator is undefined so failures are explicit and
informative.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
Run ID: 823eb02f-894c-4f51-94ba-166add805585
📒 Files selected for processing (27)
x-pack/platform/packages/shared/kbn-evals-suite-streams/evals/significant_events/README.mdx-pack/platform/packages/shared/kbn-evals-suite-streams/evals/significant_events/datasets/index.tsx-pack/platform/packages/shared/kbn-evals-suite-streams/evals/significant_events/datasets/otel_demo.tsx-pack/platform/packages/shared/kbn-evals-suite-streams/evals/significant_events/datasets/types.tsx-pack/platform/packages/shared/kbn-evals-suite-streams/evals/significant_events/ki_feature_duplication/ki_feature_duplication.spec.tsx-pack/platform/packages/shared/kbn-evals-suite-streams/evals/significant_events/ki_feature_duplication/ki_feature_duplication_datasets.tsx-pack/platform/packages/shared/kbn-evals-suite-streams/evals/significant_events/ki_feature_extraction/collect_sample_documents.tsx-pack/platform/packages/shared/kbn-evals-suite-streams/evals/significant_events/ki_feature_extraction/ki_feature_extraction.spec.tsx-pack/platform/packages/shared/kbn-evals-suite-streams/evals/significant_events/ki_query_generation/extract_log_text.tsx-pack/platform/packages/shared/kbn-evals-suite-streams/evals/significant_events/ki_query_generation/get_computed_ki_features_from_docs.tsx-pack/platform/packages/shared/kbn-evals-suite-streams/evals/significant_events/ki_query_generation/ki_query_generation.spec.tsx-pack/platform/packages/shared/kbn-evals-suite-streams/evals/significant_events/ki_query_generation/resolve_ki_sources.tsx-pack/platform/packages/shared/kbn-evals-suite-streams/scripts/significant_events_snapshots/lib/gcs.tsx-pack/platform/packages/shared/kbn-evals-suite-streams/scripts/significant_events_snapshots/lib/significant_events_workflow.tsx-pack/platform/packages/shared/kbn-evals-suite-streams/src/data_generators/canonical_ki_features.test.tsx-pack/platform/packages/shared/kbn-evals-suite-streams/src/data_generators/canonical_ki_features.tsx-pack/platform/packages/shared/kbn-evals-suite-streams/src/data_generators/canonical_kis.test.tsx-pack/platform/packages/shared/kbn-evals-suite-streams/src/data_generators/load_ki_features_from_snapshot.test.tsx-pack/platform/packages/shared/kbn-evals-suite-streams/src/data_generators/load_ki_features_from_snapshot.tsx-pack/platform/packages/shared/kbn-evals-suite-streams/src/data_generators/replay.tsx-pack/platform/packages/shared/kbn-evals-suite-streams/src/data_generators/sigevents_ki_features_index.test.tsx-pack/platform/packages/shared/kbn-evals-suite-streams/src/data_generators/sigevents_ki_features_index.tsx-pack/platform/packages/shared/kbn-evals-suite-streams/src/evaluators/ki_feature_duplication_evaluators.tsx-pack/platform/packages/shared/kbn-evals-suite-streams/src/evaluators/ki_feature_extraction_evaluators.test.tsx-pack/platform/packages/shared/kbn-evals-suite-streams/src/evaluators/ki_feature_extraction_evaluators.tsx-pack/platform/packages/shared/kbn-evals-suite-streams/src/evaluators/ki_query_generation_evaluators.test.tsx-pack/platform/packages/shared/kbn-evals-suite-streams/src/evaluators/ki_query_generation_evaluators.ts
💤 Files with no reviewable changes (1)
- x-pack/platform/packages/shared/kbn-evals-suite-streams/src/data_generators/canonical_kis.test.ts
...ackages/shared/kbn-evals-suite-streams/src/data_generators/load_ki_features_from_snapshot.ts
Outdated
Show resolved
Hide resolved
💚 Build Succeeded
Metrics [docs]
History
cc @viduni94 |
…d_agent_navigation2 * commit '9289d6b5502db245e645e190b0246554396c6c20': (34 commits) [api-docs] 2026-03-19 Daily api_docs build (elastic#258471) [Shared UX][DateRangePicker] Missing parts (elastic#258229) [Dashboard] Keep pinned_panels separate in read response (elastic#258444) Move inheritance: true to top level in .coderabbit.yml (elastic#258461) [DOCS] 9.3.2 Kibana release notes (elastic#257332) adds routing accept metric attribute to the cps metric (elastic#258168) [ML] AI/Inference Connector creation: use 'location' field to correctly set provider config (elastic#250838) [Lens] Add e2e test for legend list layout (elastic#258160) [SigEvents] Convert feature duplication evaluators to createPrompt pattern (elastic#256534) Add actionable-obs author to .coderabbit.yml (elastic#257922) [DOCS] 9.2.7 Kibana release notes (elastic#257331) Grant Serverless editor/viewer access to ES v2 indices (elastic#258384) [SigEvents][Evals] Rename terminology for KI features and KI queries (elastic#258361) [EDR Workflows][Osquery] Add shared table toolbar components and redesign saved queries list (elastic#258394) [Automatic Import V2] Upload samples using an existing index (elastic#258074) Add GET /inference_features route to expose feature registry (elastic#258044) fix additional fields not included (elastic#257625) [Discover] [Metrics] Add tier 2 journeys for Metrics in Discover E2E (elastic#255036) [Lens as code] Support correct X-Axis types in ES|QL visualizations (elastic#258159) Update APM (main) (elastic#254880) ...
Summary
Updated SigEvents terminology as below:
Checklist
release_note:*label is applied per the guidelinesbackport:*labels.Summary by CodeRabbit
Release Notes
Documentation
Refactor