-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Open
Labels
enhancementNew feature or requestNew feature or request
Description
Is your feature request related to a problem or challenge?
Today DataFusion supports three aggregate functions that can be "order aware": ARRAY_AGG, FIRST_VALUE and LAST_VALUE. This means that you can supply a ORDER BY clause to their argument, for example FIRST_VALUE(x ORDER BY time).
Today, there be only one single order specified across ALL order aware aggregate functions
For example
❯ create table t(x int, y int) as values (1, 1), (1, 2), (1, 1), (1, 4), (2, 20), (2, 10);;
0 rows in set. Query took 0.003 seconds.
❯ select x, first_value(x ORDER BY y) from t GROUP BY x;
+---+------------------+
| x | FIRST_VALUE(t.x) |
+---+------------------+
| 2 | 2 |
| 1 | 1 |
+---+------------------+
2 rows in set. Query took 0.004 seconds.
❯ select x, first_value(x ORDER BY y), first_value(x ORDER BY y DESC) from t GROUP BY x;
+---+------------------+-----------------+
| x | FIRST_VALUE(t.x) | LAST_VALUE(t.x) |
+---+------------------+-----------------+
| 1 | 1 | 1 |
| 2 | 2 | 2 |
+---+------------------+-----------------+
2 rows in set. Query took 0.004 seconds.
❯ select x, first_value(x ORDER BY y), first_value(x ORDER BY y DESC NULLS LAST) from t GROUP BY x;
This feature is not implemented: Conflicting ordering requirements in aggregate functions is not supportedDescribe the solution you'd like
There are a few designs proposed here: #8558 (comment)
We are working on a more detailed proposal
Describe alternatives you've considered
No response
Additional context
No response
Metadata
Metadata
Assignees
Labels
enhancementNew feature or requestNew feature or request