What is the bug?
Incorrect results when doing join with aggregation, while the join key is not the first group by key. And we also need a special project like group by keys + agg metrics before join to trigger this bug.
How can one reproduce the bug?
PUT /test_bool_join
{
"mappings": {
"properties": {
"category": { "type": "keyword" },
"has_flag": { "type": "boolean" },
"value": { "type": "integer" }
}
}
}
POST test_bool_join/_bulk
{"index":{"_id":"1"}}
{"category":"A","has_flag":true,"value":10}
{"index":{"_id":"2"}}
{"category":"B","has_flag":true,"value":20}
{"index":{"_id":"3"}}
{"category":"C","has_flag":true,"value":30}
{"index":{"_id":"4"}}
{"category":"D","has_flag":false,"value":40}
{"index":{"_id":"5"}}
{"category":"E","has_flag":false,"value":50}
{"index":{"_id":"6"}}
{"category":"F","has_flag":false,"value":60}
POST _plugins/_ppl/_explain
{
"query": """
source=test_bool_join
| stats COUNT() as cnt by category, has_flag
| fields category, has_flag, cnt
| join left=L right=R ON L.has_flag = R.has_flag
[source=test_bool_join | stats COUNT() as overall_cnt by has_flag]
"""
}
It get incorrect results of 3 rows while the correct results is 6 rows.
What is the expected behavior?
Get correct results with 6 rows.
What is your host/environment?
- OS: macos
- Version 3.3.0-snapshot
- Plugins
Do you have any screenshots?
If applicable, add screenshots to help explain your problem.
Do you have any additional context?
This query without the special project could produce correct results.
source=test_join
| stats COUNT() as cnt by category, has_flag
| join left=L right=R ON L.has_flag = R.has_flag
[source=test_join | stats COUNT() as overall_cnt by has_flag]
And single query having this special project but without join could produce correct results.
source=test_bool_join
| stats COUNT() as cnt by category, has_flag
| fields category, has_flag, cnt
| sort has_flag
What is the bug?
Incorrect results when doing join with aggregation, while the join key is not the first group by key. And we also need a special project like
group by keys + agg metricsbefore join to trigger this bug.How can one reproduce the bug?
It get incorrect results of 3 rows while the correct results is 6 rows.
What is the expected behavior?
Get correct results with 6 rows.
What is your host/environment?
Do you have any screenshots?
If applicable, add screenshots to help explain your problem.
Do you have any additional context?
This query without the special project could produce correct results.
And single query having this special project but without join could produce correct results.