[BUG] Push down redundant filter for time span #4811
Description
What is the bug?
Currently, a time span aggregation is always treated as bucket_nullable=false, so it does not show a null bucket; this behavior was introduced in PR #4327. We implement it by adding a filter operator on the time field before the aggregation.
Once the span is pushed down as a date_histogram (generated with missing_bucket=false), the aggregation already produces no null bucket. However, we still push a redundant filter down into the scan, e.g.
// PPL
source=events | stats count() by span(@timestamp, 1d)
// Final plan
CalciteEnumerableIndexScan(table=[[OpenSearch, events]], PushDownContext=[[PROJECT->[@timestamp], FILTER->IS NOT NULL($0), AGGREGATION->rel#630:LogicalAggregate.NONE.[](input=RelSubset#629,group={0},count()=COUNT()), PROJECT->[count(), span(@timestamp,1d)], LIMIT->10000], OpenSearchRequestBuilder(sourceBuilder={"from":0,"size":0,"timeout":"1m","query":{"exists":{"field":"@timestamp","boost":1.0}},"_source":{"includes":["@timestamp"],"excludes":[]},"aggregations":{"composite_buckets":{"composite":{"size":10000,"sources":[{"span(@timestamp,1d)":{"date_histogram":{"field":"@timestamp","missing_bucket":false,"order":"asc","fixed_interval":"1d"}}}]}}}}, requestedTotalSize=2147483647, pageSize=null, startFrom=0)])
It has FILTER->IS NOT NULL($0) in the PushDownContext and "query":{"exists":{"field":"@timestamp","boost":1.0}} in the DSL.
The performance degradation is worse if the bucket field is a derived field, because the redundant filter then generates a script in the DSL query.
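To make the redundancy concrete, here is a minimal Python sketch that inspects a request body like the one above and reports whether its `exists` query only repeats what the composite `date_histogram` source already guarantees. The helper names (`fields_skipped_by`, `exists_filter_is_redundant`) and the `ISSUE_BODY` constant are hypothetical and are not part of the plugin. It assumes the request has `size: 0`, that the `exists` clause is the entire query, and that a composite source with `missing_bucket=false` ignores documents without a value for its field.

```python
from typing import Any


def fields_skipped_by(agg: dict[str, Any]) -> set[str]:
    """Fields whose missing values are already ignored by a composite aggregation.

    Only date_histogram sources with missing_bucket=false are considered.
    """
    fields: set[str] = set()
    for source in agg.get("composite", {}).get("sources", []):
        for spec in source.values():
            hist = spec.get("date_histogram")
            if hist is not None and not hist.get("missing_bucket", False):
                fields.add(hist["field"])
    return fields


def exists_filter_is_redundant(body: dict[str, Any]) -> bool:
    """True if the top-level `exists` query cannot change the aggregation result."""
    exists = body.get("query", {}).get("exists")
    aggs = body.get("aggregations", {})
    # With size != 0 the query would still affect the returned hits.
    if exists is None or not aggs or body.get("size") != 0:
        return False
    return all(exists["field"] in fields_skipped_by(agg) for agg in aggs.values())


# Trimmed copy of the sourceBuilder from the final plan in this issue.
ISSUE_BODY = {
    "from": 0,
    "size": 0,
    "query": {"exists": {"field": "@timestamp", "boost": 1.0}},
    "aggregations": {
        "composite_buckets": {
            "composite": {
                "size": 10000,
                "sources": [
                    {
                        "span(@timestamp,1d)": {
                            "date_histogram": {
                                "field": "@timestamp",
                                "missing_bucket": False,
                                "order": "asc",
                                "fixed_interval": "1d",
                            }
                        }
                    }
                ],
            }
        }
    },
}

print(exists_filter_is_redundant(ISSUE_BODY))
```

The check is deliberately conservative: it returns `False` as soon as the query could influence anything other than the bucket contents.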
How can one reproduce the bug?
- Create an index with a time field, e.g.
PUT localhost:9200/events
{
"mappings": {
"properties": {
"@timestamp": {
"type": "date"
},
"host": {
"type": "text"
},
"cpu_usage": {
"type": "double"
},
"region": {
"type": "keyword"
}
}
}
}
- Run explain on a query with a time span, e.g.
source=events | stats count() by span(@timestamp, 1d)
What is the expected behavior?
The final plan shouldn't contain the filter derived from time span aggregation.
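As a rough way to check this expectation against explain output, the following Python sketch scans the plan text for the two symptoms described above. The function name `span_filter_leaks` and the regular expressions are assumptions made for illustration, not part of the plugin or its test suite. It flags any pushed-down `IS NOT NULL` filter, so it is only meaningful for queries such as the repro, where the user wrote no null check of their own.

```python
import re

# Pushed-down filter as it appears in PushDownContext, e.g. FILTER->IS NOT NULL($0)
_PUSHED_FILTER = re.compile(r"FILTER->IS NOT NULL\(\$\d+\)")
# Top-level exists query as it appears in the serialized sourceBuilder
_EXISTS_QUERY = re.compile(
    r'"query"\s*:\s*\{\s*"exists"\s*:\s*\{\s*"field"\s*:\s*"([^"]+)"'
)


def span_filter_leaks(plan_text: str) -> list[str]:
    """Return a description of each redundant-filter symptom found in the plan."""
    findings: list[str] = []
    pushed = _PUSHED_FILTER.search(plan_text)
    if pushed:
        findings.append(pushed.group(0))
    exists = _EXISTS_QUERY.search(plan_text)
    if exists:
        findings.append(f"exists query on {exists.group(1)}")
    return findings


# Excerpt of the final plan quoted in this issue.
current_plan = (
    "PushDownContext=[[PROJECT->[@timestamp], FILTER->IS NOT NULL($0), "
    'sourceBuilder={"from":0,"size":0,'
    '"query":{"exists":{"field":"@timestamp","boost":1.0}}}'
)
print(span_filter_leaks(current_plan))
```

For a plan that matches the expected behavior, the function should return an empty list.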
What is your host/environment?
- OS: [e.g. iOS]
- Version 3.4-SNAPSHOT
- Plugins