EQL sequence and join execution #49594

@colings86

Once the EQL sequence/join parts are transpiled to ES Search DSL (#49590), we need to implement the logic that performs the join and finds the results matching the sequence/join. Since joins can be thought of as unordered sequences, we can focus on the execution of sequences and apply the same logic to joins.

The basic concept is to execute a sequence in much the same way that Lucene executes conjunction queries. We have N streams of hits (where each stream corresponds to an element in the sequence), sorted by the join key(s) and then by timestamp. We pick a leading iterator (ideally the stream with the fewest hits), look at its first result, and try to skip to that result's join key on the other streams. If the key exists in all the streams, we can iterate through the events with that key in each stream to find a matching sequence (potentially using the same skipping/leapfrogging approach on the timestamp). If the key does not exist in all the streams, we know there is no matching sequence for that key, so we can skip to the next key on the leading iterator.
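
To make the leapfrogging concrete, here is a minimal in-memory sketch. It is not the proposed implementation: the `Hit` shape, the function names, the single join key, the strictly increasing timestamp rule, and the "one greedy match per key" behaviour are all assumptions made for illustration. It also skips the leading iterator directly to the largest key seen on the other streams, which is one possible refinement of "skip to the next key".

```python
from bisect import bisect_left
from typing import List, Optional, Tuple

Hit = Tuple[str, int]  # (join_key, timestamp); hypothetical shape for this sketch


def advance_to_key(stream: List[Hit], pos: int, key: str) -> int:
    """Index of the first hit at or after pos whose join key is >= key."""
    # (key,) sorts before every (key, timestamp), so we land on the key's first hit.
    return bisect_left(stream, (key,), lo=pos)


def match_in_order(streams: List[List[Hit]], starts: List[int], key: str) -> Optional[List[Hit]]:
    """Greedily pick one hit per stream for key, with strictly increasing timestamps."""
    picked: List[Hit] = []
    last_ts: Optional[int] = None
    for stream, start in zip(streams, starts):
        i = start
        while i < len(stream) and stream[i][0] == key:
            if last_ts is None or stream[i][1] > last_ts:
                break  # earliest hit that is later than the previous element
            i += 1
        else:
            return None  # ran out of hits for this key on this stream
        picked.append(stream[i])
        last_ts = stream[i][1]
    return picked


def find_sequences(streams: List[List[Hit]]) -> List[List[Hit]]:
    """streams[i] holds the hits for sequence element i, sorted by (key, timestamp)."""
    matches: List[List[Hit]] = []
    if not streams:
        return matches
    # Lead with the shortest stream so that it drives the fewest key lookups.
    lead = min(range(len(streams)), key=lambda i: len(streams[i]))
    pos = [0] * len(streams)
    while pos[lead] < len(streams[lead]):
        key = streams[lead][pos[lead]][0]
        target = key
        for i, stream in enumerate(streams):
            pos[i] = advance_to_key(stream, pos[i], key)
            if pos[i] == len(stream):
                return matches  # a stream is exhausted: no later key can be in all streams
            target = max(target, stream[pos[i]][0])
        if target != key:
            # The key is missing from at least one stream: leapfrog the lead forward.
            pos[lead] = advance_to_key(streams[lead], pos[lead], target)
            continue
        found = match_in_order(streams, pos, key)
        if found:
            matches.append(found)
        # Move the lead past this key; the other streams catch up on the next pass.
        while pos[lead] < len(streams[lead]) and streams[lead][pos[lead]][0] == key:
            pos[lead] += 1
    return matches
```

For example, with three streams that share keys `a` and `c` but not `b`, the sketch skips `b` without scanning it and returns one ordered match each for `a` and `c`.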

The above idea needs to be modified slightly to account for the fact that Elasticsearch returns results in pages rather than a continuous stream. Within a single page of results for each stream we can follow the logic above. When we reach the end of a page of results we have two cases:

  1. We are in the middle of trying to match on a key. In this case we use search_after to request a new page of results, with the value of search_after being the join key and timestamp from the leading iterator for the key we are trying to find.
  2. We are looking for the next key. In this case we can use the values from the last join key(s) in search_after to find the next candidate.
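
The two cases above could be expressed as search request bodies along the following lines. This is only a sketch: the helper names and the field names passed to them are hypothetical, and reading case 2 as "reuse the sort values of the last hit on the exhausted page" is an assumption rather than something this issue specifies. The body keys themselves (`size`, `query`, `sort`, `search_after`) are the standard Elasticsearch search parameters, and `search_after` returns only hits that sort strictly after the given values.

```python
from typing import Any, Dict, List


def page_request(query: Dict[str, Any], join_field: str, timestamp_field: str,
                 search_after: List[Any], size: int = 1000) -> Dict[str, Any]:
    """Build a search body for the next page of one stream (hypothetical helper)."""
    return {
        "size": size,
        "query": query,
        # Same ordering as the streams: join key first, then timestamp.
        "sort": [{join_field: "asc"}, {timestamp_field: "asc"}],
        "search_after": search_after,
    }


def search_after_mid_key(lead_key: Any, lead_timestamp: Any) -> List[Any]:
    """Case 1: resume at the key and timestamp taken from the leading iterator."""
    return [lead_key, lead_timestamp]


def search_after_next_key(last_hit_on_page: Dict[str, Any]) -> List[Any]:
    """Case 2 (assumed reading): continue from the last hit's sort values."""
    return list(last_hit_on_page["sort"])
```

A caller would send the resulting body to the search endpoint for the stream whose page ran out and then carry on with the in-page logic on the new hits.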

As matches are found, they can be added to the response. Each match should contain the documents for each element in the sequence (there will be a separate issue regarding the response format).
