[Streams] Add simulation filtering by conditions #245400
Pinging @elastic/obs-onboarding-team (Team:obs-onboarding)
/ci
I've made a couple of changes after our discussion during the weekly. The partial simulation that runs after selecting a condition now receives only the documents directly related to the target condition, while steps are still collected from the condition itself and all previous steps. So when a condition is selected, the table shows only relevant documents and processor metrics are isolated to that document set, which makes it much more intuitive to use.

CleanShot.2025-12-11.at.10.30.57.mp4
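A minimal sketch of how the step list for such a partial simulation could be assembled. It assumes steps are kept in a flat, pre-order list in which every step records its parent; the `FlatStep` shape and the `stepsForPartialSimulation` name are hypothetical and are not taken from the PR's code.

```typescript
// Hypothetical flat representation of the step hierarchy (pre-order).
interface FlatStep {
  id: string;
  parentId: string | null;
}

// Returns every step that precedes the selected condition, the condition
// itself, and all steps nested (directly or transitively) under it.
function stepsForPartialSimulation(steps: FlatStep[], conditionId: string): FlatStep[] {
  const start = steps.findIndex((step) => step.id === conditionId);
  if (start === -1) {
    return [];
  }

  // In a pre-order list the subtree of the condition follows it directly.
  const subtreeIds = new Set<string>([conditionId]);
  let end = start + 1;
  while (end < steps.length) {
    const parentId = steps[end].parentId;
    if (parentId === null || !subtreeIds.has(parentId)) {
      break;
    }
    subtreeIds.add(steps[end].id);
    end += 1;
  }

  return steps.slice(0, end);
}
```

Steps that come after the selected condition's subtree are left out, so their metrics cannot leak into the isolated view.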
I've adapted the condition logic to the new state machine structure after the changes made for YAML mode, added a hover state for eligible conditions, and fixed a couple of bugs found by Luca.
Looks good for the most part, nice work👏
Some bugs 🐛
Clicking the "x" on the "Selected" button in YAML mode doesn't do anything:
Given the following:
- A condition step, with one nested processor.
- Clicking the condition to select it.
- Moving to YAML mode.
- Moving back to interactive mode.
- Adding a new top level processor.
The new processor is added to the steps for simulation, e.g.:
Expected behaviour:
- YAML mode is not honouring the selected condition when it sends its steps for simulation. There should be feature parity between the two.
- On the whole it seems to be able to get into states that are just incorrect, e.g. here steps are being simulated that shouldn't be, and I see selected 100%, which shouldn't be true.
Enhancements:
The ability to filter by condition could be added to YAML mode alongside the "Run up to step" button:
Bugs? 🐛
These ones are "are they bugs or not?"
- We're not representing the new information, e.g. the new status (excluded due to condition etc.) in YAML mode; these exist in the gutter markers and so on.
- I'm also concerned a little by drift on the IDs, just thinking aloud here. When in YAML mode and content is edited, the steps get assigned deterministic IDs based on their position in the step hierarchy.

  1. Run simulation
     └── processed_by: ['step_abc123', 'step_def456']
  2. Click condition (baseSimulation frozen)
     └── baseSimulation.processed_by still references 'step_abc123', etc.
  3. Edit YAML / regenerate steps
     └── New IDs: 'step_xyz789', 'step_uvw012'
     └── But baseSimulation still has old IDs
  4. Try to filter by condition
     └── Looks for 'step_xyz789' in processed_by
     └── Finds nothing 💥 (I think)
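The drift described in this scenario could be detected with a check along these lines. This is only a sketch: the `ProcessedDocument` shape and the `findStaleStepIds` helper are hypothetical, and only the `processed_by` property and the example IDs come from the discussion above.

```typescript
// Hypothetical minimal shape of a simulated document.
interface ProcessedDocument {
  processed_by: string[];
}

// Returns the step IDs referenced by the frozen base simulation that no
// longer exist among the current steps. A non-empty result suggests the
// base simulation is out of date and should be re-run before filtering.
function findStaleStepIds(baseDocuments: ProcessedDocument[], currentStepIds: string[]): string[] {
  const current = new Set(currentStepIds);
  const stale = new Set<string>();
  for (const doc of baseDocuments) {
    for (const stepId of doc.processed_by) {
      if (!current.has(stepId)) {
        stale.add(stepId);
      }
    }
  }
  return [...stale];
}

// IDs taken from the scenario above.
const frozenBase: ProcessedDocument[] = [{ processed_by: ['step_abc123', 'step_def456'] }];
const staleIds = findStaleStepIds(frozenBase, ['step_xyz789', 'step_uvw012']);
// staleIds contains both old IDs, so filtering by the new IDs would match nothing.
```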
/* Find insert index based on step hierarchy */
export function findInsertIndex(stepRefs: StepActorRef[], parentId: string | null): number {
  // Find the index of the parent step
  // debugger;

 * Recursively collects all descendant step IDs
 * for a given parent step ID.
 */
export function collectDescendantStepIds(
There's quite a few instances of this, can we unify them?
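One way the duplicated logic could be unified is a single shared helper. The sketch below is not the PR's implementation: the real `collectDescendantStepIds` signature is not shown here, so this version uses a hypothetical `StepNode` record and a different name, and walks the hierarchy iteratively rather than recursively.

```typescript
// Hypothetical flat record describing one step and its parent.
interface StepNode {
  id: string;
  parentId: string | null;
}

// Collects the IDs of all steps nested under `parentId`, at any depth.
function collectDescendantIds(steps: StepNode[], parentId: string): Set<string> {
  // Index direct children by parent so each lookup is O(1).
  const childrenByParent = new Map<string, string[]>();
  for (const step of steps) {
    if (step.parentId === null) {
      continue;
    }
    const siblings = childrenByParent.get(step.parentId) ?? [];
    siblings.push(step.id);
    childrenByParent.set(step.parentId, siblings);
  }

  // Depth-first walk with an explicit stack.
  const descendants = new Set<string>();
  const stack = [...(childrenByParent.get(parentId) ?? [])];
  while (stack.length > 0) {
    const id = stack.pop();
    if (id === undefined || descendants.has(id)) {
      continue;
    }
    descendants.add(id);
    stack.push(...(childrenByParent.get(id) ?? []));
  }
  return descendants;
}
```

Keeping the traversal in one place means every caller agrees on what counts as "inside" a condition.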
couvq
left a comment
Great work!
Non-blocking, general UX comment: to me it wasn't obvious at first that the preceding steps were enabled by the simulation filtering until I played around with the UI a bit. I'm wondering if it would make sense to highlight those steps as well? @patpascal
Thank you Quentin, that's a fair point; it was also pointed out in the original issue. The decision was to go with this UI (highlighting only the directly selected condition) to simplify the UX. Highlighting all parent conditions when the user clicks on one of the children might also be confusing, though in the future we might consider something like highlighting the connecting lines instead of whole blocks.
@Kerry350 Thank you for the review. I obviously didn't test it thoroughly with YAML mode, sorry about that 🙈 I talked with @LucaWintergerst and we agreed to go without filtering in YAML mode so as not to delay this feature, as it would require more work both in the code and on the UX side. It's a bit unfortunate that we didn't plan for YAML mode support beforehand and got kind of blindsided. I've added logic to reset all filtering when the user switches into YAML mode, which should resolve the issues you've mentioned. Please take another look when you have time 🙌
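The reset-on-switch behaviour can be pictured as a small reducer. This is a sketch under assumptions: the PR uses state machines, whereas the `FilterState`, `FilterEvent`, and `reduceFilterState` names and the event types below are hypothetical and only illustrate the rule that entering YAML mode clears any selected condition.

```typescript
type EditorMode = 'interactive' | 'yaml';

// Hypothetical slice of state relevant to condition filtering.
interface FilterState {
  mode: EditorMode;
  selectedConditionId: string | null;
}

type FilterEvent =
  | { type: 'condition.select'; conditionId: string }
  | { type: 'condition.clear' }
  | { type: 'mode.switch'; mode: EditorMode };

function reduceFilterState(state: FilterState, event: FilterEvent): FilterState {
  switch (event.type) {
    case 'condition.select':
      // Filtering is only supported in interactive mode.
      return state.mode === 'interactive'
        ? { ...state, selectedConditionId: event.conditionId }
        : state;
    case 'condition.clear':
      return { ...state, selectedConditionId: null };
    case 'mode.switch':
      // Entering YAML mode drops the selection so no filter survives the switch.
      return {
        mode: event.mode,
        selectedConditionId: event.mode === 'yaml' ? null : state.selectedConditionId,
      };
  }
}
```

Because the selection is cleared on the way into YAML mode, returning to interactive mode starts from an unfiltered simulation.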
💛 Build succeeded, but was flaky
Kerry350
left a comment
LGTM based on the fact we're only supporting this in interactive mode.
Closes elastic#239909

🔒 [Figma](https://www.figma.com/design/C4o0y9Knk4jYXjsulvbW3w/Streams-Processors--Grok?m=auto&node-id=9552-33019&t=8ZoAb4lgexLLXxBg-1)

## Summary

This change adds an option to focus the simulation on a specific WHERE condition by clicking on it or using the condition's context menu.

https://github.com/user-attachments/assets/6e443865-a679-46e1-b2cd-fec01a113655

## Key implementation details

* The `_simulate` endpoint was modified to return a new `processed_by` property on every document that went through simulation. This property holds a list of the processor IDs that affected a specific document.
* Using the new `processed_by` property, client code is able to filter documents in the table when the user clicks on a condition. It collects all processors that are included in the selected condition (and implicitly the previous processors, as they could affect the partial simulation) and maps them to the documents via `processed_by`.
* The documents are filtered on the client side, which gives the user immediate feedback. Additionally, a partial simulation is executed in the background in order to update the relative percentage numbers for the table filters and the step list.
* A `baseSimulation` property was added to the simulation state machine context in order to hold the full simulation containing all sample documents. This is needed when the user switches between conditions without clearing the previous condition selection: each condition needs access to the full simulation in order to see the full document list with `processed_by` properties.

> [!IMPORTANT]
> Using the described approach with the `processed_by` property gives a few key benefits:
> * It enables client-side filtering with immediate feedback.
> * It avoids complicating the `_simulate` endpoint logic and keeps it almost unchanged. If we went with server-side filtering, the `_simulate` endpoint would need to be aware of the concept of filtering by condition in order to properly calculate the metrics for only the subset of the documents.
>
> At the same time, because we have no way to faithfully evaluate WHERE conditions without running some special conditions-only simulation, filtering of the documents is based on the processors (both those included in the target condition and, implicitly, processors from the previous steps) and not strictly on the condition expression. This might produce unexpected results in terms of visible documents when, for example, processors themselves include a condition, which hides documents from the table even though the user selected a condition that matches those documents. While technically correct, users might have questions. For most users though, I expect the current UX to be transparent enough.

## How to test

1. Run Kibana with some data flowing into Streams.
2. Create processing steps with some conditions.
3. Click on one of the conditions or use the "Preview this only" item inside its context menu.
4. Check that the documents left in the table correspond to the processors you have inside the condition.
5. Check that the percentages displayed on the condition's steps and on the table filters are correct.
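The client-side filtering described above can be sketched as follows. Only the `processed_by` property comes from the description; the `SimulatedDocument` shape, the `filterDocumentsByCondition` name, the processor IDs, and the rule that a document is kept when at least one processor nested in the selected condition appears in its `processed_by` list are assumptions made for illustration.

```typescript
// Hypothetical document shape; `processed_by` lists the IDs of the
// processors that affected the document during simulation.
interface SimulatedDocument {
  id: string;
  processed_by: string[];
}

// Keeps the documents touched by at least one processor that belongs to
// the selected condition.
function filterDocumentsByCondition(
  documents: SimulatedDocument[],
  conditionProcessorIds: ReadonlySet<string>
): SimulatedDocument[] {
  return documents.filter((doc) =>
    doc.processed_by.some((processorId) => conditionProcessorIds.has(processorId))
  );
}

// The full document set would come from the base simulation.
const baseDocuments: SimulatedDocument[] = [
  { id: 'a', processed_by: ['grok-1', 'set-2'] },
  { id: 'b', processed_by: ['grok-1'] },
  { id: 'c', processed_by: [] },
];

// Suppose the selected condition contains only the hypothetical processor 'set-2'.
const visibleDocuments = filterDocumentsByCondition(baseDocuments, new Set(['set-2']));
```

The sketch also shows the caveat from the note: if document `b` satisfied the condition's expression but `set-2` skipped it because of its own condition, `b` would still be hidden, since the filter only looks at which processors ran.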