Skip to content

Updating HealthService to use FetchHealthInfoCacheAction#4

Closed
masseyke wants to merge 16 commits intofeature/health-api-fetch-health-infofrom
feature/health-api-healthinfo
Closed

Updating HealthService to use FetchHealthInfoCacheAction#4
masseyke wants to merge 16 commits intofeature/health-api-fetch-health-infofrom
feature/health-api-healthinfo

Conversation

@masseyke
Copy link
Copy Markdown
Owner

This PR updates HealthService to fetch HealthInfo data using FetchHealthInfoCacheAction, and to pass it to the HealthIndicatorServices' calculate method.

@masseyke masseyke closed this Sep 8, 2022
masseyke pushed a commit that referenced this pull request Aug 22, 2023
Fixes elastic/elasticsearch-internal#497
Fixes ESQL-560

A query like `from test | sort data | limit 2 | project count` fails
because `LocalToGlobalLimitAndTopNExec` planning rule adds a collecting
`TopNExec` after last GATHER exchange, to perform last reduce, see

```
TopNExec[[Order[data{f}#6,ASC,LAST]],2[INTEGER]]
\_ExchangeExec[GATHER,SINGLE_DISTRIBUTION]
  \_ProjectExec[[count{f}#4]]      // <- `data` is projected away but still used by the TopN node above
    \_FieldExtractExec[count{f}#4]
      \_TopNExec[[Order[data{f}#6,ASC,LAST]],2[INTEGER]]
        \_FieldExtractExec[data{f}#6]
          \_ExchangeExec[REPARTITION,FIXED_ARBITRARY_DISTRIBUTION]
            \_EsQueryExec[test], query[][_doc_id{f}#9, _segment_id{f}#10, _shard_id{f}#11]
```

Unfortunately, at that stage the inputs needed by the TopNExec could
have been projected away by a ProjectExec, so they could be no longer
available.

This PR adapts the plan as follows:
- add all the projections used by the `TopNExec` to the existing
`ProjectExec`, so that they are available when needed
- add another ProjectExec on top of the plan, to project away the
originally removed projections and preserve the query semantics


This approach is a bit dangerous, because it bypasses the mechanism of
input/output resolution and validation that happens on the logical plan.
The alternative would be to do this manipulation on the logical plan,
but it's probably hard to do, because there is no concept of Exchange at
that level.
masseyke pushed a commit that referenced this pull request Jan 23, 2026
…tic#140027)

This PR fixes the issue where `INLINE STATS GROUP BY null` was being
incorrectly pruned by `PruneLeftJoinOnNullMatchingField`.

Fixes elastic#139887

## Problem For query:

```
FROM employees
| INLINE STATS c = COUNT(*) BY n = null
| KEEP c, n
| LIMIT 3
```

During `LogicalPlanOptimizer`:

```
Limit[3[INTEGER],false,false]
\_EsqlProject[[c{r}#2, n{r}#4]]
  \_InlineJoin[LEFT,[n{r}#4],[n{r}#4]]
    |_Eval[[null[NULL] AS n#4]]
    | \_EsRelation[employees][<no-fields>{r$}#7]
    \_Aggregate[[n{r}#4],[COUNT(*[KEYWORD],true[BOOLEAN],PT0S[TIME_DURATION]) AS c#2, n{r}#4]]
      \_StubRelation[[<no-fields>{r$}#7, n{r}#4]]
```

The following join node:

```
InlineJoin[LEFT,[n{r}#4],[n{r}#4]]
|_Eval[[null[NULL] AS n#4]]
| \_EsRelation[employees][<no-fields>{r$}#7]
\_Aggregate[[n{r}#4],[COUNT(*[KEYWORD],true[BOOLEAN],PT0S[TIME_DURATION]) AS c#2, n{r}#4]]
  \_StubRelation[[<no-fields>{r$}#7, n{r}#4]]
```

should NOT have `PruneLeftJoinOnNullMatchingField` applied, because the
right side is an `Aggregate` (originating from `INLINE STATS`). Since
`STATS` supports `GROUP BY null`, the join key being null is a valid use
case. Pruning this join would incorrectly eliminate the aggregation
results, changing the query semantics.

During `LocalLogicalPlanOptimizer`:

```
ProjectExec[[c{r}#2, n{r}#4]]
\_LimitExec[3[INTEGER],null]
  \_ExchangeExec[[c{r}#2, n{r}#4],false]
    \_FragmentExec[filter=null, estimatedRowSize=0, reducer=[], fragment=[<>
Project[[c{r}#2, n{r}#4]]
\_Limit[3[INTEGER],false,false]
  \_InlineJoin[LEFT,[n{r}#4],[n{r}#4]]
    |_Eval[[null[NULL] AS n#4]]
    | \_EsRelation[employees][<no-fields>{r$}#7]
    \_LocalRelation[[c{r}#2, n{r}#4],Page{blocks=[LongVectorBlock[vector=ConstantLongVector[positions=1, value=100]], ConstantNullBlock[positions=1]]}]<>]]
```

The following join node:

```
InlineJoin[LEFT,[n{r}#4],[n{r}#4]]
|_Eval[[null[NULL] AS n#4]]
| \_EsRelation[employees][<no-fields>{r$}#7]
\_LocalRelation[[c{r}#2, n{r}#4],Page{blocks=[LongVectorBlock[vector=ConstantLongVector[positions=1, value=100]], ConstantNullBlock[positions=1]]}]
```

should NOT have `PruneLeftJoinOnNullMatchingField` applied, because the
right side is a `LocalRelation` (the `Aggregate` was optimized into a
`LocalRelation` containing the pre-computed aggregation results).
Pruning this join when the join key is null would discard the valid
aggregation results stored in the `LocalRelation`, incorrectly producing
null values instead of the expected count.

## Solution The fix ensures that `PruneLeftJoinOnNullMatchingField` only
applies to `LOOKUP JOIN` nodes, where `join.right()` is an `EsRelation`.
For `INLINE STATS` joins, the right side can be:

 - `Aggregate` (before optimization), or
 - `LocalRelation` (after the aggregate is optimized)

By checking `join.right() instanceof EsRelation`, we correctly skip the
pruning optimization for `INLINE STATS` joins, preserving the expected
query results when grouping by null.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant