ESQL: For INLINEJOIN, push down INLINEJOIN to data nodes when it makes sense #124743

@alex-spies

Description

For queries like ... | INLINESTATS total_count = COUNT(*), where the INLINESTATS has no BY clause, the InlineJoin performed in the second phase of execution is equivalent to an EVAL total_count = <some literal number>. In this case a proper join is unnecessary, and there is no reason to force the inline join onto the coordinator: the EVAL can easily be pushed down to the data nodes, which increases parallelism.
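To illustrate the equivalence, here is a minimal toy model in Python. It is not Elasticsearch code: the list-of-dicts "shards" and the function names `phase_one_count`, `coordinator_join`, and `pushed_down_eval` are hypothetical and exist only for this sketch. It shows that, without a BY clause, attaching the aggregate on the coordinator and appending it as a literal on each data node produce the same rows.

```python
# Toy model: each "data node" holds a list of rows (dicts).
# All names here are hypothetical; this is not Elasticsearch's implementation.
shards = [
    [{"id": 1}, {"id": 2}],
    [{"id": 3}],
    [{"id": 4}, {"id": 5}, {"id": 6}],
]


def phase_one_count(shards):
    """Phase 1: every node counts locally, the coordinator sums the partials."""
    return sum(len(rows) for rows in shards)


def coordinator_join(shards, total):
    """Baseline: collect all rows on the coordinator, then attach the value."""
    all_rows = [row for rows in shards for row in rows]
    return [{**row, "total_count": total} for row in all_rows]


def pushed_down_eval(shards, total):
    """Pushed down: each node appends the literal itself,
    like EVAL total_count = <literal>, before results are collected."""
    per_node = [
        [{**row, "total_count": total} for row in rows]
        for rows in shards
    ]
    return [row for rows in per_node for row in rows]


total = phase_one_count(shards)
# Both plans yield identical rows; only the place where the work happens differs.
assert coordinator_join(shards, total) == pushed_down_eval(shards, total)
```

In the pushed-down variant, the per-node list comprehension is the part that could run in parallel on each data node.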

The same applies to the more general INLINESTATS ... BY group_field1, group_field2 when it turns out that there are only a few distinct combinations of group_field1 and group_field2. In that case it would also be better to perform the InlineJoin on the data nodes.
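The grouped case can be sketched in the same toy model. Again, this is only an assumption-laden illustration, not how the optimizer is implemented: `MAX_BROADCAST_GROUPS`, `should_push_down`, and `node_local_join` are hypothetical names, and the threshold value is arbitrary. The idea is that when phase 1 yields a small table of group keys, that table can be sent to every data node and joined there with a plain lookup.

```python
from collections import defaultdict

# Hypothetical cutoff for "only a few combinations"; the real criterion is open.
MAX_BROADCAST_GROUPS = 1000


def phase_one_grouped_count(shards, keys):
    """Phase 1: COUNT(*) per combination of the grouping fields."""
    counts = defaultdict(int)
    for rows in shards:
        for row in rows:
            counts[tuple(row[k] for k in keys)] += 1
    return dict(counts)


def should_push_down(group_table, limit=MAX_BROADCAST_GROUPS):
    """Push the join down only if the group table is small enough to broadcast."""
    return len(group_table) <= limit


def node_local_join(rows, keys, group_table, out_field):
    """Runs on a data node: attach the aggregate via a lookup on the group key."""
    return [
        {**row, out_field: group_table[tuple(row[k] for k in keys)]}
        for row in rows
    ]


grouped_shards = [
    [{"env": "prod", "dc": "eu"}, {"env": "prod", "dc": "us"}],
    [{"env": "prod", "dc": "eu"}, {"env": "dev", "dc": "eu"}],
]
keys = ("env", "dc")
table = phase_one_grouped_count(grouped_shards, keys)

if should_push_down(table):
    # Each node joins its own rows against the small broadcast table.
    result = [node_local_join(rows, keys, table, "cnt") for rows in grouped_shards]
```

With many distinct key combinations the broadcast table grows, so a coordinator-side join may remain the better plan; the sketch models that choice with a simple size check.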

Let's attempt this optimization and add optimizer tests that document the correct behavior.
