SQL: Generate relevant error message when grouping functions are not used in GROUP BY#38017
SQL: Generate relevant error message when grouping functions are not used in GROUP BY#38017astefan merged 10 commits intoelastic:masterfrom
Conversation
|
Pinging @elastic/es-search |
costin
left a comment
There was a problem hiding this comment.
The check needs to be extended so it traverses the tree in case the grouping function is embedded.
For example SELECT YEAR(HISTOGRAM(date, 1 YEAR)) FROM x GROUP BY HISTOGRAM(date, 1 YEAR).
Additionally (and this might be a separate PR), it's worth investigating how we handle functions over histograms FROM x GROUP BY YEAR(HISTOGRAM(date, 1 YEAR)).
matriv
left a comment
There was a problem hiding this comment.
Left a coupe of questions, otherwise LGTM
|
|
||
| proj.projections().forEach(e -> { | ||
| if (Functions.isGrouping(e) == false) { | ||
| e.collectFirstChildren(c -> { |
There was a problem hiding this comment.
collectFirstChildren uses a List to add the matches which we don't need in this impl.
Why don't you use forEachDown?
There was a problem hiding this comment.
Changed the approach and pushed a new commit.
|
|
||
| a.aggregates().forEach(as -> { | ||
| Expression exp = as instanceof Alias ? ((Alias) as).child() : as; | ||
| if (Functions.isGrouping(exp) && false == Expressions.anyMatch(a.groupings(), g -> exp.semanticEquals(g))) { |
There was a problem hiding this comment.
what if the histogram is wrapped with a Scalar function? is it caught somewhere else?
There was a problem hiding this comment.
This was already handled. See the test here.
properly in the Verifier due to functions equality not working correctly.
|
@costin I pushed a new commit. |
|
@elasticmachine run elasticsearch-ci/2 |
2 similar comments
|
@elasticmachine run elasticsearch-ci/2 |
|
@elasticmachine run elasticsearch-ci/2 |
costin
left a comment
There was a problem hiding this comment.
Left a few comments, I'm unclear on why extraChildren is needed...
| Project proj = (Project) p; | ||
| proj.projections().forEach(e -> e.forEachDown(f -> localFailures.add(fail(f, "[%s] needs to be part of the grouping", | ||
| Expressions.name(f))), GroupingFunction.class)); | ||
| } else if (p instanceof Aggregate) { |
There was a problem hiding this comment.
this block is a bit condensed and could use some whitespace and indentation to make it more readable.
For example put localFailures on a new line.
| // if it does have a GROUP BY, check if the groupings contain the grouping functions (Histograms) | ||
| Aggregate a = (Aggregate) p; | ||
| a.aggregates().forEach(agg -> agg.forEachDown(e -> { | ||
| if (Expressions.anyMatch(a.groupings(), g -> {return g instanceof Function && e.functionEquals((Function) g);}) == false |
There was a problem hiding this comment.
anyMatch(a.groupings(), g -> g instanceof Function && functionEquals((Function) g)
Avoids the use of {, return and ;
Also like above it will be readable if you put each statement on a new line
|
|
||
| protected GroupingFunction(Source source, Expression field) { | ||
| this(source, field, emptyList()); | ||
| protected GroupingFunction(Source source, Expression field, List<Expression> extraChildren) { |
There was a problem hiding this comment.
extraChildren is meaningless - use arguments instead (to preserve the terminology from the parent) or parameters instead.
|
|
||
| protected GroupingFunction(Source source, Expression field, List<Expression> parameters) { | ||
| super(source, CollectionUtils.combine(singletonList(field), parameters)); | ||
| protected GroupingFunction(Source source, Expression field, List<Expression> extraChildren, List<Expression> parameters) { |
There was a problem hiding this comment.
Why is there extraChildren and parameters?
| public Histogram(Source source, Expression field, Expression interval, ZoneId zoneId) { | ||
| super(source, field); | ||
| this.interval = (Literal) interval; | ||
| super(source, field, Arrays.asList(interval)); |
There was a problem hiding this comment.
Why not have this passed through parameters instead of extraChildren?
There was a problem hiding this comment.
Minor improvement, use: Collections.singletonList(interval)
|
@costin the idea is to make |
|
@costin addressed your comments. Initially I looked at |
| public Histogram(Source source, Expression field, Expression interval, ZoneId zoneId) { | ||
| super(source, field); | ||
| this.interval = (Literal) interval; | ||
| super(source, field, Arrays.asList(interval)); |
There was a problem hiding this comment.
Minor improvement, use: Collections.singletonList(interval)
| return resolution; | ||
| } | ||
|
|
||
There was a problem hiding this comment.
Really minor: remove the whitespaces.
…round-sync-6.x * elastic/6.x: Fix testRestoreIncreasesPrimaryTerms on 6.x (elastic#38314) SQL: Remove exceptions from Analyzer (elastic#38260) (elastic#38287) SQL: Move metrics tracking inside PlanExecutor (elastic#38259) (elastic#38288) Backport of elastic#38311: Move TokenService to seqno powered cas Handle scheduler exceptions (elastic#38183) Mute MlMigrationFullClusterRestartIT#testMigration (elastic#38316) 6.x Backport of elastic#38278: Move ML Optimistic Concurrency Control to Seq No Cleanup construction of interceptors (elastic#38296) Throw if two inner_hits have the same name (elastic#37645) (elastic#38194) AsyncTwoPhaseIndexerTests race condition fixed elastic#38195 Backport#37830 Enable SSL in reindex with security QA tests (elastic#38293) Ensure ILM policies run safely on leader indices (elastic#38140) Introduce ssl settings to reindex from remote (elastic#38292) Fix ordering problem in add or renew lease test (elastic#38281) Mute ReplicationTrackerRetentionLeaseTests#testAddOrRenewRetentionLease (elastic#38276) Fix NPE in Logfile Audit Filter (elastic#38120) (elastic#38271) Enable trace log in FollowerFailOverIT (elastic#38148) SQL: Generate relevant error message when grouping functions are not used in GROUP BY (elastic#38017)
* 6.6: (121 commits) [DOCS] Add warning about bypassing ML PUT APIs (elastic#38608) fix dissect doc "ip" --> "clientip" (elastic#38512) bad formatted JSON object (elastic#38515) SQL: Fix issue with IN not resolving to underlying keyword field (elastic#38440) Update ilm-api.asciidoc, point to REMOVE policy (elastic#38235) Backport changes to the release notes script. (elastic#38347) Change the milliseconds precision to 3 digits for intervals. (elastic#38297) SecuritySettingsSource license.self_generated: trial (elastic#38233) (elastic#38398) Fix IndexAuditTrail rolling upgrade on rollover edge 2 (elastic#38286) (elastic#38381) Cleanup construction of interceptors (elastic#38388) Skip unsupported languages for tests (elastic#38328) (elastic#38385) [ILM][TEST] increase assertBusy timeout (elastic#36864) (elastic#38354) Docs: Drop inline callout from scroll example (elastic#38340) (elastic#38365) Preserve ILM operation mode when creating new lifecycles (elastic#38134) (elastic#38230) [ML] Add explanation so far to file structure finder exceptions (elastic#38337) ML: Fix error race condition on stop _all datafeeds and close _all jobs (elastic#38113) (elastic#38211) (elastic#38222) SQL: Generate relevant error message when grouping functions are not used in GROUP BY (elastic#38017) Fix NPE in Logfile Audit Filter (elastic#38120) (elastic#38273) Enable trace log in FollowerFailOverIT (elastic#38148) Replace awaitBusy with assertBusy in atLeastDocsIndexed (elastic#38190) ...
Grouping functions (HISTOGRAM, so far) have a requirement of being used inside the GROUP BY. This PR adds checks to make sure this happens. Otherwise, an error message is generated.
Fixes #37952.