[Security_Solution][Endpoint] Leveraging msearch and ancestry array for resolver#70134
Conversation
|
Pinging @elastic/endpoint-data-visibility-team (Team:Endpoint Data Visibility) |
|
Pinging @elastic/endpoint-app-team (Feature:Endpoint) |
| percentWithRelated: 100, | ||
| relatedEvents: 0, | ||
| relatedAlerts: 0, | ||
| useAncestryArray: true, |
There was a problem hiding this comment.
These changes are in this PR: #70129
they will be removed once the other PR gets merged.
| EndpointStatus, | ||
| } from './types'; | ||
| import { factory as policyFactory } from './models/policy_config'; | ||
| import { parentEntityId } from './models/event'; |
There was a problem hiding this comment.
These changes are in this PR: #70129
they will be removed once the other PR gets merged.
| * Elasticsearch's SearchResponse<ResolverEvent> response. | ||
| */ | ||
| export abstract class ResolverQuery<T> implements MSearchQuery { | ||
| export abstract class ResolverQuery<T, R = ResolverEvent> implements MSearchQuery { |
There was a problem hiding this comment.
The R here sets the stage for being able to limit the _source of a queries response. This will be in a follow up PR. Limit the source will be a performance improvement for querying for the start events for the children.
| formatResponse(response: SearchResponse<ResolverEvent>): PaginatedResults { | ||
| return { | ||
| results: ResolverQuery.getResults(response), | ||
| totals: PaginationBuilder.getTotals(response.aggregations), |
There was a problem hiding this comment.
Totals are inferred based on the response of ES while using the ancestry array.
| /** | ||
| * a function to handle the response | ||
| */ | ||
| handler: (response: SearchResponse<ResolverEvent>) => void; |
There was a problem hiding this comment.
This calls into the class (e.g. RelatedEventsQueryHandler etc) for it to handle the response from ES.
| // are ordered correctly to be retrieved in a single call | ||
| const distantChildEntityID = Array.from(tree.childrenLevels[0].values())[0].id; | ||
| const { body }: { body: ResolverChildren } = await supertest | ||
| .get(`/api/endpoint/resolver/${tree.origin.id}/children?generations=1`) |
There was a problem hiding this comment.
The generator creates nodes in depth first orderish. I don't want to have to rely on that though in the test. To get around that I move to a level of the tree where I know there aren't any other descendants before the immediate children.
| LegacyEndpointEvent, | ||
| ResolverNodeStats, | ||
| ResolverRelatedAlerts, | ||
| ResolverChildNode, |
There was a problem hiding this comment.
TODO need to write more tests for the children's pagination.
| * 1. At the time of querying it could have no children in ES, in which case it will be marked as | ||
| * null because we know it does not have children during this query. | ||
| * 2. If the max level was reached we do not know if this node has children or not so we'll mark it as null | ||
| * nextChild can have 3 different states: |
There was a problem hiding this comment.
Just a thought, I understand what this is saying, especially after reading through your write up, but one of @oatkiller's famous diagrams may be useful here, you could probs just copy paste one he's already done. Not necessary, but I think it could be helpful
There was a problem hiding this comment.
Thanks @michaelolo24 do you mean an ascii diagram of a tree? Or a table?
There was a problem hiding this comment.
I think what I'll probably do is put a link to the relevant section in the docs that I'll be adding as a follow up or add on to this PR.
|
|
||
| return { | ||
| query: this.query, | ||
| ids: this.entityID, |
There was a problem hiding this comment.
does the msearch query take strings and arrays or just strings? I know in your constructor it's looking for a string, I'm just confused by the pluralized id
There was a problem hiding this comment.
It's both string | string[] https://github.com/jonathan-buttner/kibana/blob/msearch-ancestry-array/x-pack/plugins/security_solution/server/endpoint/routes/resolver/queries/multi_searcher.ts#L36
And some of the other *_query_handler files pass it string[] particularly the children one.
| /** | ||
| * Get the results after an msearch. | ||
| */ | ||
| getResults() { |
There was a problem hiding this comment.
Is this only used internally to the class?
There was a problem hiding this comment.
No it's used by the fetch.ts file as well: https://github.com/jonathan-buttner/kibana/blob/msearch-ancestry-array/x-pack/plugins/security_solution/server/endpoint/routes/resolver/utils/fetch.ts#L178
| // bucket the start and end events together for a single node | ||
| const ancestryNodes = this.toMapOfNodes(results); | ||
|
|
||
| // the order of this array is going to be weird, it will look like this |
There was a problem hiding this comment.
not necessary, but I feel like a mini diagram could be helpful here too
There was a problem hiding this comment.
I'll add the diagram in a follow up PR so I don't have to wait for ci again haha
| return children; | ||
| } | ||
|
|
||
| function getStartEventsFromLevels(levels: Array<Map<string, TreeNode>>) { |
There was a problem hiding this comment.
Wonder what the performance here is gonna look like at scale 😅.
There was a problem hiding this comment.
This is just a test file :)
|
@elasticmachine merge upstream |
|
merge conflict between base and head |
💚 Build SucceededBuild metrics
History
To update your PR or re-run it, just comment with: |
…or resolver (#70134) (#70583) * Refactor generator for ancestry support * Adding optional ancestry array * Refactor the pagination since the totals are not used anymore * Updating the queries to not use aggregations for determining the totals * Refactoring the children helper to handle pagination without totals * Pinning the seed for the resolver tree generator service * Splitting the fetcher into multiple classes for msearch * Updating tests and api for ancestry array and msearch * Adding more comments and fixing type errors * Fixing resolver test import * Fixing tests and type errors * Fixing type errors and tests * Removing useAncestry field * Fixing test * Removing useAncestry field from tests * An empty array will be returned because that's how ES will do it too
* master: Update known-plugins.asciidoc (elastic#69370) [ML] Anomaly Explorer swim lane pagination (elastic#70063) [Ingest Manager] Update asset paths to use _ instead of - (elastic#70320) Fix discover, tsvb and Lens chart theming issues (elastic#69695) Allow Saved Object type mappings to set a field's `doc_values` property (elastic#70433) [S&R] Support data streams (elastic#68078) [Maps] Add styling and tooltip support to mapbox mvt vector tile sources (elastic#64488) redirect to default app if hash can not be forwarded (elastic#70417) [APM] Don't fetch dynamic index pattern in setupRequest (elastic#70308) [Security_Solution][Endpoint] Leveraging msearch and ancestry array for resolver (elastic#70134) Update docs for api authentication usage (elastic#66819) chore(NA): disable alerts_detection_rules cypress suites (elastic#70577) add getVisibleTypes API to SO type registry (elastic#70559)
This PR leverages msearch and the ancestry array to improve the resolver backend's performance. It relies on the generator changes in this PR: #70129
Notable changes:
nextChildcursors for the children is different since we're leveraging the ancestry array. I've tried to outline the different scenarios in the code's commentsgenerationssince it can't be enforcing while using the ancestry array. The limit is instead base don the number of nodes you'd like to receive