[Streams] Add simulation filtering by conditions by mykolaharmash · Pull Request #245400 · elastic/kibana

mykolaharmash · 2025-12-05T15:04:22Z

Summary

This change adds an option to focus the simulation on a specific WHERE condition by clicking on it or using the condition's context menu.

CleanShot.2025-12-16.at.12.40.18.mp4

Key implementation details

The _simulate endpoint was modified to return a new processed_by property on every document that went through simulation. This property holds the list of IDs of the processors that affected that document.
Using the new processed_by property, client code can filter documents in the table when the user clicks on a condition: it collects all processors included in the selected condition (and, implicitly, the preceding processors, as they can affect the partial simulation) and matches them against each document's processed_by list.
The documents are filtered on the client side, which gives the user immediate feedback. Additionally, a partial simulation is executed in the background to update the relative percentages for the table filters and the step list.
A baseSimulation property was added to the simulation state machine context to hold the full simulation containing all sample documents. This is needed when the user switches between conditions without clearing the previous condition selection: each condition needs access to the full simulation in order to see the full document list with processed_by properties.
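
To make the client-side filtering described above more concrete, here is a minimal sketch. The `Step` and `SimulatedDocument` shapes and both function names are assumptions made for illustration, not the actual Kibana types or helpers, and the sketch only matches processors nested under the selected condition (it does not model the preceding steps).

```typescript
// Hypothetical shapes; the real types in the Streams app may differ.
interface Step {
  id: string;
  parentId: string | null;
  kind: 'condition' | 'processor';
}

interface SimulatedDocument {
  id: string;
  // IDs of the processors that affected this document during simulation.
  processed_by: string[];
}

// Collect the IDs of all processors nested (at any depth) under a condition.
// Assumes the steps form a tree (no cycles).
function processorIdsUnderCondition(steps: Step[], conditionId: string): Set<string> {
  const result = new Set<string>();
  const visit = (parentId: string): void => {
    for (const step of steps) {
      if (step.parentId !== parentId) continue;
      if (step.kind === 'processor') {
        result.add(step.id);
      } else {
        visit(step.id);
      }
    }
  };
  visit(conditionId);
  return result;
}

// Keep only documents touched by at least one processor inside the condition.
function filterDocumentsByCondition(
  documents: SimulatedDocument[],
  steps: Step[],
  conditionId: string
): SimulatedDocument[] {
  const processorIds = processorIdsUnderCondition(steps, conditionId);
  return documents.filter((doc) => doc.processed_by.some((id) => processorIds.has(id)));
}

const sampleSteps: Step[] = [
  { id: 'p0', parentId: null, kind: 'processor' },
  { id: 'cond_a', parentId: null, kind: 'condition' },
  { id: 'p1', parentId: 'cond_a', kind: 'processor' },
  { id: 'cond_b', parentId: 'cond_a', kind: 'condition' },
  { id: 'p2', parentId: 'cond_b', kind: 'processor' },
];

const sampleDocuments: SimulatedDocument[] = [
  { id: 'd1', processed_by: ['p0'] },
  { id: 'd2', processed_by: ['p0', 'p1'] },
  { id: 'd3', processed_by: ['p0', 'p2'] },
  { id: 'd4', processed_by: [] },
];

const visibleForCondA = filterDocumentsByCondition(sampleDocuments, sampleSteps, 'cond_a');
const visibleForCondB = filterDocumentsByCondition(sampleDocuments, sampleSteps, 'cond_b');
```

Because the filter only reads data that is already on the client, switching between conditions does not need a round trip; in this sketch the unfiltered `sampleDocuments` array plays the role that baseSimulation plays in the PR.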

Important

Using the described approach with the processed_by property gives a few key benefits:

It enables client-side filtering with immediate feedback
It avoids complicating the _simulate endpoint logic and keeps it almost unchanged. If we went with server-side filtering, the _simulate endpoint would need to be aware of the concept of filtering by condition in order to properly calculate the metrics for only a subset of the documents.

At the same time, because we have no way to faithfully evaluate WHERE conditions without running some special conditions-only simulation, filtering of the documents is based on the processors (both those included in the target condition and, implicitly, those from the previous steps) and not strictly on the condition expression. This might produce unexpected results in terms of visible documents. For example, when processors themselves include a condition, documents they skip are hidden from the table even though the user selected a condition that matches those documents. While technically correct, users might have questions. For most users though, I expect the current UX to be transparent enough.

How to test

Run Kibana with some data flowing into Streams
Create processing steps with some conditions
Click on one of the conditions or use the "Preview this only" item inside its context menu
Check that the documents left in the table correspond to the processors you have inside the condition
Check that the percentages displayed on the condition's steps and on the table filters are correct

elasticmachine · 2025-12-10T12:14:11Z

Pinging @elastic/obs-onboarding-team (Team:obs-onboarding)

mykolaharmash · 2025-12-10T12:14:19Z

/ci


mykolaharmash · 2025-12-11T09:35:44Z

I've made a couple of changes after our discussion during the weekly. The partial simulation after selecting a condition now receives only documents directly related to the target condition, while steps are still collected from the condition itself and all previous steps.

So when the condition is selected, the table now shows only relevant documents and processor metrics are isolated to that document set which makes it way more intuitive to use.
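
As a rough illustration of that split (steps from the condition plus everything before it, documents only from the condition itself), here is a minimal sketch. It assumes a flat, pre-ordered step list in which a condition's descendants directly follow it; `FlatStep`, `SampleDocument` and `buildPartialSimulationInput` are hypothetical names, not the ones used in the PR.

```typescript
// Hypothetical shapes, assumed for illustration only.
interface FlatStep {
  id: string;
  parentId: string | null;
}

interface SampleDocument {
  id: string;
  processed_by: string[];
}

interface PartialSimulationInput {
  steps: FlatStep[];
  documents: SampleDocument[];
}

function buildPartialSimulationInput(
  steps: FlatStep[],
  documents: SampleDocument[],
  conditionId: string
): PartialSimulationInput {
  const start = steps.findIndex((step) => step.id === conditionId);
  // Unknown condition: fall back to the full simulation input.
  if (start === -1) return { steps, documents };

  // Walk the steps that follow the condition while they belong to its subtree.
  const subtree = new Set<string>([conditionId]);
  let end = start + 1;
  for (const step of steps.slice(start + 1)) {
    if (step.parentId === null || !subtree.has(step.parentId)) break;
    subtree.add(step.id);
    end += 1;
  }

  return {
    // All previous steps plus the condition and its descendants.
    steps: steps.slice(0, end),
    // Only documents touched by a step inside the selected condition.
    documents: documents.filter((doc) => doc.processed_by.some((id) => subtree.has(id))),
  };
}

const orderedSteps: FlatStep[] = [
  { id: 'p0', parentId: null },
  { id: 'cond_a', parentId: null },
  { id: 'p1', parentId: 'cond_a' },
  { id: 'p2', parentId: 'cond_a' },
  { id: 'p3', parentId: null },
];

const sampleDocs: SampleDocument[] = [
  { id: 'd1', processed_by: ['p0'] },
  { id: 'd2', processed_by: ['p0', 'p1'] },
  { id: 'd3', processed_by: ['p0', 'p3'] },
];

const partialInput = buildPartialSimulationInput(orderedSteps, sampleDocs, 'cond_a');
```

In this example the later top-level step `p3` is left out of the partial simulation, and only `d2` is sent, so the processor metrics are computed against the condition's own document set.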

CleanShot.2025-12-11.at.10.30.57.mp4


mykolaharmash · 2025-12-16T10:39:32Z

I've adapted the condition logic to the new state machines structure after the changes made for YAML mode, added a hover state for eligible conditions and fixed a couple of bugs found by Luca.

Kerry350

Looks good for the most part, nice work👏

Some bugs 🐛

Clicking the "x" on the "Selected" button in YAML mode doesn't do anything:

Given the following:

A condition step, with one nested processor.
Clicking the condition to select it.
Moving to YAML mode.
Moving back to interactive mode.
Adding a new top level processor.

The new processor is added to the steps for simulation, e.g.:

Expected behaviour:

YAML mode is not honouring the selected condition when it sends its steps for simulation. There should be feature parity between the two.
On the whole it seems to be able to get into states that are just incorrect, e.g. here steps are being simulated that shouldn't be, and I see "selected 100%", which shouldn't be true.

Enhancements:

The ability to filter by condition could be added to YAML mode alongside the "Run up to step" button:

Bugs? 🐛

These ones are "are they bugs or not?"

We're not representing the new information, e.g. the new status (excluded due to condition etc) in YAML mode, these exist in the gutter markers and so on.
I'm also concerned a little by drift on the IDs, just thinking aloud here. When in YAML mode and content is edited the steps get assigned deterministic IDs based on their position in the step hierarchy.

Run simulation
└── processed_by: ['step_abc123', 'step_def456']
Click condition (baseSimulation frozen)
└── baseSimulation.processed_by still references 'step_abc123', etc.
Edit YAML / regenerate steps
└── New IDs: 'step_xyz789', 'step_uvw012'
└── But baseSimulation still has old IDs
Try to filter by condition
└── Looks for 'step_xyz789' in processed_by
└── Finds nothing 💥 (I think)
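
The suspected drift can be reproduced in isolation with a small sketch. The document shape and the `hasStaleStepIds` helper are hypothetical and only illustrate the concern; they are not part of the PR.

```typescript
// Hypothetical document shape, assumed for illustration only.
interface SimulatedDoc {
  processed_by: string[];
}

// Hypothetical helper: true when any recorded processor ID is unknown
// to the current set of step IDs.
function hasStaleStepIds(docs: SimulatedDoc[], currentStepIds: Set<string>): boolean {
  return docs.some((doc) => doc.processed_by.some((id) => !currentStepIds.has(id)));
}

// baseSimulation captured before the YAML edit.
const frozenBaseSimulation: SimulatedDoc[] = [
  { processed_by: ['step_abc123', 'step_def456'] },
];

// IDs assigned after the steps were regenerated.
const regeneratedStepIds = new Set<string>(['step_xyz789', 'step_uvw012']);

// Filtering the frozen documents by the new IDs matches nothing.
const visibleDocs = frozenBaseSimulation.filter((doc) =>
  doc.processed_by.some((id) => regeneratedStepIds.has(id))
);

const baseIsStale = hasStaleStepIds(frozenBaseSimulation, regeneratedStepIds);
```

If the IDs really do change on edit, a check along these lines could be one way to detect that the stored simulation no longer matches the current steps before filtering against it.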

Kerry350 · 2025-12-16T14:29:02Z

...anagement/stream_detail_enrichment/state_management/stream_enrichment_state_machine/utils.ts

 /* Find insert index based on step hierarchy */
 export function findInsertIndex(stepRefs: StepActorRef[], parentId: string | null): number {
  // Find the index of the parent step
+  // debugger;


Can be removed.

Kerry350 · 2025-12-16T14:30:25Z

...ams_app/public/components/data_management/stream_detail_enrichment/state_management/utils.ts

+ * Recursively collects all descendant step IDs
+ * for a given parent step ID.
+ */
+export function collectDescendantStepIds(


There's quite a few instances of this, can we unify them?
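
For reference, a single shared helper could look roughly like the sketch below. The diff excerpt above cuts off before the signature, so the parameter list, the `StepNode` shape and the name `collectDescendantIds` are assumptions, not the code from the PR.

```typescript
// Hypothetical step shape, assumed for illustration only.
interface StepNode {
  id: string;
  parentId: string | null;
}

// Collects the IDs of all descendants of `parentId` (children, grandchildren, ...).
function collectDescendantIds(steps: StepNode[], parentId: string): Set<string> {
  // Index children by parent once, so each lookup during the walk is cheap.
  const childrenByParent = new Map<string, string[]>();
  for (const step of steps) {
    if (step.parentId === null) continue;
    const siblings = childrenByParent.get(step.parentId) ?? [];
    siblings.push(step.id);
    childrenByParent.set(step.parentId, siblings);
  }

  const descendants = new Set<string>();
  const collect = (id: string): void => {
    for (const childId of childrenByParent.get(id) ?? []) {
      if (descendants.has(childId)) continue; // guards against malformed, cyclic input
      descendants.add(childId);
      collect(childId);
    }
  };
  collect(parentId);
  return descendants;
}

const hierarchy: StepNode[] = [
  { id: 'a', parentId: null },
  { id: 'b', parentId: 'a' },
  { id: 'c', parentId: 'b' },
  { id: 'd', parentId: null },
];

const descendantsOfA = collectDescendantIds(hierarchy, 'a');
const descendantsOfD = collectDescendantIds(hierarchy, 'd');
```

Taking a plain list of `{ id, parentId }` pairs keeps the helper independent of actor refs or UI state, which is one way the duplicated variants could share a single implementation.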

couvq

Great work!

non-blocking, general ux comment: To me it wasn't obvious at first that the preceding steps were enabled by the simulation filtering until I played around with the UI a bit, I'm wondering if it would make sense to also highlight those steps as well? @patpascal


mykolaharmash · 2025-12-17T09:36:13Z

> Great work!
>
> non-blocking, general ux comment: To me it wasn't obvious at first that the preceding steps were enabled by the simulation filtering until I played around with the UI a bit, I'm wondering if it would make sense to also highlight those steps as well? @patpascal

Thank you Quentin, that's a fair point, and it was also raised in the original issue. The decision was to go with this UI (highlighting only the directly selected condition) to simplify the UX. Highlighting all parent conditions when the user clicks on one of the children might also be confusing, though in the future we might consider something like highlighting the connecting lines instead of whole blocks.

mykolaharmash · 2025-12-17T09:42:51Z

@Kerry350 Thank you for the review, I obviously didn't test it thoroughly with YAML mode, sorry about that 🙈 I talked with @LucaWintergerst and we agreed to go without filtering in YAML mode in order to not delay this feature as it would require more work both in the code and on the UX side. It's a bit unfortunate that we didn't plan for YAML mode support beforehand and got kind of blindsided.

I've added logic to reset all filtering when the user switches into YAML mode, which should resolve the issues you've mentioned. Please take another look when you have time 🙌

elasticmachine · 2025-12-17T11:39:58Z

💛 Build succeeded, but was flaky

Buildkite Build
Commit: d0caa48

Failed CI Steps

FTR Configs #60

Metrics [docs]

Module Count

Fewer modules leads to a faster build time

id	before	after	diff
`streamsApp`	1420	1422	+2

Public APIs missing comments

Total count of every public API that lacks a comment. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats comments for more detailed information.

id	before	after	diff
`@kbn/streams-schema`	225	226	+1

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id	before	after	diff
`streamsApp`	1.5MB	1.5MB	+6.0KB

Unknown metric groups

API count

id	before	after	diff
`@kbn/streams-schema`	232	233	+1

History

💚 Build #373785 succeeded a0f56fb
💔 Build #373743 failed bbc92cc
💔 Build #372500 failed 50dbc7c
💔 Build #371619 failed 6089f03
💔 Build #371094 failed 39ed6e9
💔 Build #371034 failed b8bcdc9

Kerry350

LGTM based on the fact we're only supporting this in interactive mode.


Closes elastic#239909 🔒 [Figma](https://www.figma.com/design/C4o0y9Knk4jYXjsulvbW3w/Streams-Processors--Grok?m=auto&node-id=9552-33019&t=8ZoAb4lgexLLXxBg-1)

mykolaharmash force-pushed the 239909-streams-feature-implement-interactive-clickable-filtering-for-conditions branch from 8506adc to 8d73b2b Compare December 10, 2025 09:05

[Streams] Add simulation filtering by consitions

b8bcdc9

mykolaharmash force-pushed the 239909-streams-feature-implement-interactive-clickable-filtering-for-conditions branch from 8d73b2b to b8bcdc9 Compare December 10, 2025 11:10

mykolaharmash marked this pull request as ready for review December 10, 2025 12:13

mykolaharmash requested a review from a team as a code owner December 10, 2025 12:13

mykolaharmash added release_note:skip Skip the PR/issue when compiling release notes backport:skip This PR does not require backporting Team:obs-onboarding Observability Onboarding Team labels Dec 10, 2025

Add simulation utils tests

39ed6e9

tonyghiani changed the title ~~[Streams] Add simulation filtering by consitions~~ [Streams] Add simulation filtering by conditions Dec 10, 2025

Send only directly dependant documents for partial simulation

b7d7ab5

mykolaharmash requested a review from a team as a code owner December 11, 2025 08:37

mykolaharmash added 2 commits December 11, 2025 10:18

Replace a condition click container with a background button

5f5e77f

Merge remote-tracking branch 'upstream/main' into 239909-streams-feat…

6089f03

…ure-implement-interactive-clickable-filtering-for-conditions

mykolaharmash added 7 commits December 11, 2025 11:13

Fix types

43dba81

Code clean up

440b5c2

Handle selected consition deletion + auto-filtering

4279486

Reset condition on processor save

50dbc7c

Merge remote-tracking branch 'upstream/main' into 239909-streams-feat…

67ebd92

…ure-implement-interactive-clickable-filtering-for-conditions

Adopt condition logic to the new state machines structure

e1dac01

Merge remote-tracking branch 'upstream/main' into 239909-streams-feat…

8d256c2

…ure-implement-interactive-clickable-filtering-for-conditions

mykolaharmash added 2 commits December 16, 2025 12:03

Clean up

bbc92cc

Fix types

a0f56fb

Kerry350 self-requested a review December 16, 2025 12:27

Kerry350 reviewed Dec 16, 2025

couvq reviewed Dec 16, 2025

mykolaharmash added 3 commits December 17, 2025 09:51

Reset condition state before switching into YAML mode

1d3ffb4

Merge remote-tracking branch 'upstream/main' into 239909-streams-feat…

ee7db33

…ure-implement-interactive-clickable-filtering-for-conditions

Add unified collectDescendantStepIds() util

d0caa48

mykolaharmash requested a review from Kerry350 December 17, 2025 09:31

Kerry350 approved these changes Dec 17, 2025

mykolaharmash merged commit 671898a into elastic:main Dec 17, 2025
13 checks passed

kibanamachine added the v9.3.0 label Dec 17, 2025

mdbirnstiehl mentioned this pull request Dec 22, 2025

[Streams] [META] Processing updates for 9.3 elastic/docs-content#4441

Closed


5 participants
