
[Streams] Add simulation filtering by conditions #245400

Merged
mykolaharmash merged 17 commits into elastic:main from mykolaharmash:239909-streams-feature-implement-interactive-clickable-filtering-for-conditions
Dec 17, 2025

Conversation

@mykolaharmash
Contributor

@mykolaharmash mykolaharmash commented Dec 5, 2025

Closes #239909

🔒 Figma

Summary

This change adds an option to focus the simulation on a specific WHERE condition by clicking on it or using the condition's context menu.

CleanShot.2025-12-16.at.12.40.18.mp4

Key implementation details

  • The _simulate endpoint was modified to return a new processed_by property on every document that went through simulation. This property holds a list of IDs of the processors that affected that document.
  • Using the new processed_by property, client code can filter documents in the table when the user clicks on a condition. It collects all processors included in the selected condition (and, implicitly, previous processors, as they could affect the partial simulation) and matches them against each document's processed_by list.
  • The documents are filtered on the client side, which gives the user immediate feedback. Additionally, a partial simulation is executed in the background to update the relative percentages for the table filters and the step list.
  • A baseSimulation property was added to the simulation state machine context to hold the full simulation containing all sample documents. This is needed when the user switches between conditions without clearing the previous selection: each condition needs access to the full simulation in order to see the full document list with processed_by properties.
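
A minimal sketch of the client-side filtering described above, not the actual Kibana implementation. Only the processed_by property comes from this PR; the `SimulatedDocument` and `Step` shapes and both function names are assumptions made for illustration. The sketch matches only processors nested under the selected condition.

```typescript
// Hypothetical shapes for illustration; the real Kibana types may differ.
interface SimulatedDocument {
  value: Record<string, unknown>;
  // IDs of the processors that affected this document (from the PR description).
  processed_by: string[];
}

interface Step {
  id: string;
  parentId: string | null;
  kind: 'condition' | 'processor';
}

// Collect the IDs of all processors nested at any depth under a condition.
function collectProcessorIds(steps: Step[], conditionId: string): Set<string> {
  const ids = new Set<string>();
  const visit = (parentId: string): void => {
    for (const step of steps) {
      if (step.parentId !== parentId) continue;
      if (step.kind === 'processor') {
        ids.add(step.id);
      } else {
        visit(step.id); // nested condition: descend into its children
      }
    }
  };
  visit(conditionId);
  return ids;
}

// Keep only the documents touched by at least one processor under the condition.
function filterDocumentsByCondition(
  docs: SimulatedDocument[],
  steps: Step[],
  conditionId: string
): SimulatedDocument[] {
  const ids = collectProcessorIds(steps, conditionId);
  return docs.filter((doc) => doc.processed_by.some((id) => ids.has(id)));
}
```

Because the lookup runs against documents already held in memory, the table can update without waiting for the background partial simulation.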

Important

Using the described approach with the processed_by property gives a few key benefits:

  • It enables client-side filtering with immediate feedback.
  • It avoids complicating the _simulate endpoint logic and keeps it almost unchanged. If we went with server-side filtering, the _simulate endpoint would need to be aware of the concept of filtering by condition in order to properly calculate the metrics for only the subset of the documents.

At the same time, because we have no way to faithfully evaluate WHERE conditions without running some special conditions-only simulation, filtering of the documents is based on the processors (both those included in the target condition and, implicitly, processors from the previous steps) and not strictly on the condition expression. This might produce unexpected results in terms of visible documents. For example, when a processor has its own condition and skips a document, that document is hidden from the table even though the user selected a condition that matches it. While technically correct, users might have questions. For most users though, I expect the current UX to be transparent enough.
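
The caveat above can be illustrated with a small sketch. Everything here except the processed_by property is hypothetical: the `TableDocument` shape, the `isVisibleForCondition` helper, and the `grok_1` processor ID are made up for the example.

```typescript
// Hypothetical document shape; only processed_by comes from the PR description.
interface TableDocument {
  message: string;
  processed_by: string[];
}

// A document stays visible only if a processor inside the selected condition touched it.
function isVisibleForCondition(
  doc: TableDocument,
  processorIdsInCondition: ReadonlySet<string>
): boolean {
  return doc.processed_by.some((id) => processorIdsInCondition.has(id));
}

// Suppose the selected condition contains a single processor, 'grok_1', and both
// documents match the WHERE expression, but the processor's own condition skips
// the second document, so its processed_by list stays empty.
const processorsInSelectedCondition = new Set(['grok_1']);
const touched: TableDocument = { message: 'GET /index.html', processed_by: ['grok_1'] };
const skipped: TableDocument = { message: 'malformed line', processed_by: [] };

console.log(isVisibleForCondition(touched, processorsInSelectedCondition)); // true
console.log(isVisibleForCondition(skipped, processorsInSelectedCondition)); // false
```

The second document is hidden even though it matches the selected condition, which is the "technically correct but possibly surprising" case.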

How to test

  1. Run Kibana with some data flowing into Streams
  2. Create processing steps with some conditions
  3. Click on one of the conditions or use "Preview this only" item inside its context menu
  4. Check that documents left in the table correspond to the processors you have inside the condition
  5. Check that percentages displayed on the condition's steps and on the table filters are correct

@mykolaharmash mykolaharmash force-pushed the 239909-streams-feature-implement-interactive-clickable-filtering-for-conditions branch from 8506adc to 8d73b2b, December 10, 2025 09:05
@mykolaharmash mykolaharmash force-pushed the 239909-streams-feature-implement-interactive-clickable-filtering-for-conditions branch from 8d73b2b to b8bcdc9, December 10, 2025 11:10
@mykolaharmash mykolaharmash marked this pull request as ready for review December 10, 2025 12:13
@mykolaharmash mykolaharmash requested a review from a team as a code owner December 10, 2025 12:13
@mykolaharmash mykolaharmash added release_note:skip Skip the PR/issue when compiling release notes backport:skip This PR does not require backporting Team:obs-onboarding Observability Onboarding Team labels Dec 10, 2025
@elasticmachine
Contributor

Pinging @elastic/obs-onboarding-team (Team:obs-onboarding)

@mykolaharmash
Contributor Author

/ci

@tonyghiani tonyghiani changed the title from "[Streams] Add simulation filtering by consitions" to "[Streams] Add simulation filtering by conditions" Dec 10, 2025
@mykolaharmash mykolaharmash requested a review from a team as a code owner December 11, 2025 08:37
@mykolaharmash
Contributor Author

I've made a couple of changes after our discussion during the weekly. The partial simulation that runs after selecting a condition now receives only the documents directly related to the target condition, while the steps are still collected from the condition itself and all previous steps.

So when the condition is selected, the table now shows only relevant documents and processor metrics are isolated to that document set which makes it way more intuitive to use.
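
A rough sketch of how the steps for such a partial simulation could be collected. It assumes the steps are kept in a flat list in execution order, with each step's children placed directly after it; the `FlatStep` shape and the function name are hypothetical and not taken from the PR.

```typescript
// Hypothetical flat step representation, in execution order.
interface FlatStep {
  id: string;
  parentId: string | null;
}

// Return all steps before the condition, the condition itself, and its descendants.
function collectStepsForPartialSimulation(steps: FlatStep[], conditionId: string): FlatStep[] {
  const start = steps.findIndex((step) => step.id === conditionId);
  if (start === -1) return [];

  // Walk forward while each step still belongs to the condition's subtree.
  const subtreeIds = new Set<string>([conditionId]);
  let end = start;
  for (const step of steps.slice(start + 1)) {
    if (step.parentId === null || !subtreeIds.has(step.parentId)) break;
    subtreeIds.add(step.id);
    end += 1;
  }
  return steps.slice(0, end + 1);
}
```

Steps that come after the selected condition's subtree are left out, so the metrics reflect only what runs up to and inside that condition.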

CleanShot.2025-12-11.at.10.30.57.mp4

@mykolaharmash
Contributor Author

I've adapted the condition logic to the new state machine structure after the changes made for YAML mode, added a hover state for eligible conditions, and fixed a couple of bugs found by Luca.

@Kerry350 Kerry350 self-requested a review December 16, 2025 12:27
Contributor

@Kerry350 Kerry350 left a comment


Looks good for the most part, nice work👏

Some bugs 🐛

Clicking the "x" on the "Selected" button in YAML mode doesn't do anything:

Screenshot 2025-12-16 at 14 21 19

Given the following:

  • A condition step, with one nested processor.
  • Clicking the condition to select it.
  • Moving to YAML mode.
  • Moving back to interactive mode.
  • Adding a new top level processor.

The new processor is added to the steps for simulation, e.g.:

Screenshot 2025-12-16 at 14 39 19

Expected behaviour:

Screenshot 2025-12-16 at 14 39 55

  • YAML mode is not honouring the selected condition when it sends its steps for simulation. There should be feature parity between the two.

  • On the whole it seems to be able to get into states that are just incorrect, e.g. here steps are being simulated that shouldn't be, and I see selected 100% which shouldn't be true.

Screenshot 2025-12-16 at 14 46 55

Enhancements:

The ability to filter by condition could be added to YAML mode alongside the "Run up to step" button:

Screenshot 2025-12-16 at 14 20 21

Bugs? 🐛

These ones are "are they bugs or not?"

  • We're not representing the new information, e.g. the new status (excluded due to condition etc) in YAML mode, these exist in the gutter markers and so on.

  • I'm also concerned a little by drift on the IDs, just thinking aloud here. When in YAML mode and content is edited the steps get assigned deterministic IDs based on their position in the step hierarchy.

  1. Run simulation
    └── processed_by: ['step_abc123', 'step_def456']

  2. Click condition (baseSimulation frozen)
    └── baseSimulation.processed_by still references 'step_abc123', etc.

  3. Edit YAML / regenerate steps
    └── New IDs: 'step_xyz789', 'step_uvw012'
    └── But baseSimulation still has old IDs

  4. Try to filter by condition
    └── Looks for 'step_xyz789' in processed_by
    └── Finds nothing 💥 (I think)
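
One way to make the suspected drift detectable is sketched below. This is a hypothetical guard, not code from the PR: `BaseDocument` and `findStaleStepIds` are illustrative names, and whether the drift actually occurs is exactly what the scenario above is unsure about.

```typescript
// Hypothetical shape of a document held in the frozen baseSimulation.
interface BaseDocument {
  processed_by: string[];
}

// Return the step IDs referenced by the frozen simulation that no longer exist
// among the current steps. A non-empty result suggests the base simulation is stale.
function findStaleStepIds(baseDocs: BaseDocument[], currentStepIds: string[]): string[] {
  const current = new Set(currentStepIds);
  const stale = new Set<string>();
  for (const doc of baseDocs) {
    for (const id of doc.processed_by) {
      if (!current.has(id)) stale.add(id);
    }
  }
  return Array.from(stale);
}
```

With the IDs from the scenario, the frozen documents reference 'step_abc123' and 'step_def456' while the current steps are 'step_xyz789' and 'step_uvw012', so both old IDs would be reported as stale and the base simulation could be re-run before filtering.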

/* Find insert index based on step hierarchy */
export function findInsertIndex(stepRefs: StepActorRef[], parentId: string | null): number {
// Find the index of the parent step
// debugger;
Contributor


Can be removed.

* Recursively collects all descendant step IDs
* for a given parent step ID.
*/
export function collectDescendantStepIds(
Contributor


There's quite a few instances of this, can we unify them?

Contributor

@couvq couvq left a comment


Great work!

non-blocking, general ux comment: To me it wasn't obvious at first that the preceding steps were enabled by the simulation filtering until I played around with the UI a bit, I'm wondering if it would make sense to also highlight those steps as well? @patpascal

@mykolaharmash
Contributor Author

> Great work!
>
> non-blocking, general ux comment: To me it wasn't obvious at first that the preceding steps were enabled by the simulation filtering until I played around with the UI a bit, I'm wondering if it would make sense to also highlight those steps as well? @patpascal

Thank you Quentin, that's a fair point; it was also raised in the original issue. The decision was to go with this UI (highlighting only the directly selected condition) to simplify the UX. Highlighting all parent conditions when the user clicks on one of the children might also be confusing, though in the future we might consider something like highlighting the connecting lines instead of whole blocks.

@mykolaharmash
Contributor Author

@Kerry350 Thank you for the review, I obviously didn't test it thoroughly with YAML mode, sorry about that 🙈 I talked with @LucaWintergerst and we agreed to go without filtering in YAML mode in order to not delay this feature as it would require more work both in the code and on the UX side. It's a bit unfortunate that we didn't plan for YAML mode support beforehand and got kind of blindsided.

I've added logic to reset all filtering when the user switches into YAML mode, which should resolve the issues you've mentioned. Please take another look when you have time 🙌
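
The reset can be pictured with a simplified sketch. The real code lives in the simulation state machine; the plain function below, the `FilterState` shape, and the `switchMode` name are assumptions used only to show the intended behaviour.

```typescript
type EditorMode = 'interactive' | 'yaml';

// Hypothetical slice of state: the current mode and the selected condition, if any.
interface FilterState {
  mode: EditorMode;
  selectedConditionId?: string;
}

// Switching into YAML mode drops any condition selection, because filtering
// by condition is supported only in interactive mode.
function switchMode(state: FilterState, nextMode: EditorMode): FilterState {
  if (nextMode === 'yaml') {
    return { mode: 'yaml' };
  }
  return { ...state, mode: nextMode };
}
```

Returning to interactive mode afterwards starts with no condition selected, so the simulation runs over the full step list again.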

@elasticmachine
Contributor

💛 Build succeeded, but was flaky

Failed CI Steps

Metrics [docs]

Module Count

Fewer modules leads to a faster build time

| id | before | after | diff |
| --- | --- | --- | --- |
| streamsApp | 1420 | 1422 | +2 |

Public APIs missing comments

Total count of every public API that lacks a comment. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats comments for more detailed information.

| id | before | after | diff |
| --- | --- | --- | --- |
| @kbn/streams-schema | 225 | 226 | +1 |

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

| id | before | after | diff |
| --- | --- | --- | --- |
| streamsApp | 1.5MB | 1.5MB | +6.0KB |

Unknown metric groups

API count

| id | before | after | diff |
| --- | --- | --- | --- |
| @kbn/streams-schema | 232 | 233 | +1 |

History

Contributor

@Kerry350 Kerry350 left a comment


LGTM based on the fact we're only supporting this in interactive mode.

@mykolaharmash mykolaharmash merged commit 671898a into elastic:main Dec 17, 2025
13 checks passed
mbondyra added a commit to mbondyra/kibana that referenced this pull request Dec 17, 2025
…donly

* commit 'bb1f55fa520b30ceb923af069ef403b24dcb1606': (52 commits)
  [CPS][Maps] Support CPS Picker in Maps  (elastic#246382)
  [APM] Migrate the Transaction Overview tests to Scout/Playwright/Component/API tests (elastic#245972)
  [Cases] Change nested field search to be case insensitive (elastic#246643)
  [ES|QL] PromQL parser initial implementation (elastic#246552)
  [Agent Builder] Adds keyboard shortcut and toggle behavior to AI Agent button (elastic#246659)
  Retry on "all shards failed" from ES (elastic#246533)
  [Streams] Test enable wired streams flow (elastic#246113)
  [Agent Builder] Fast-follow bugfixes for MCP Tool type  (elastic#246665)
  [Entity Store][API] Fix snake case on CRUD API List response (elastic#246003)
  [ResponseOps][Slack] Simplify channel configuration  (elastic#245423)
  Add Canonical Name Badge to Documentation (elastic#246647)
  [Streams] Add simulation filtering by conditions (elastic#245400)
  [o11y AI] Add `get_hosts` tool (elastic#246541)
  [agent builder] create_visualization: support heatmap and regionmap (elastic#246671)
  [AI Infra] Chat experience: Selection modal title change (elastic#246683)
  [Background search] Change polling behavior (elastic#244760)
  [ES|QL  ]  Common Lookup Join Fields Are Not Listed First (elastic#246582)
  Add missing `dynamic: false` (elastic#246685)
  [Metrics in Discover] Unskip metrics api test (elastic#246593)
  [ES|QL] Show next actions after simple field assignment in RERANK ON Clause (elastic#246676)
  ...
KodeRad pushed a commit to KodeRad/kibana that referenced this pull request Dec 17, 2025
Closes elastic#239909

🔒
[Figma](https://www.figma.com/design/C4o0y9Knk4jYXjsulvbW3w/Streams-Processors--Grok?m=auto&node-id=9552-33019&t=8ZoAb4lgexLLXxBg-1)


Labels

backport:skip This PR does not require backporting release_note:skip Skip the PR/issue when compiling release notes Team:obs-onboarding Observability Onboarding Team v9.3.0


Development

Successfully merging this pull request may close these issues.

[Streams] Feature: Implement interactive, clickable filtering for conditions

5 participants