feat(api) Add support for column list in aggregate functions#408
feat(api) Add support for column list in aggregate functions#408
Conversation
c2b75de to
4476cef
Compare
fpacifici
left a comment
There was a problem hiding this comment.
Just to confirm my understanding of this PR is right.
you want to ask Snuba
....[argMax, [(a,b,c), d]]
and, as of now, a,b,c columns would not be processed because they are one parameter of your function but the tuple does not get decomposed into columns.
Is that correct ?
snuba/util.py
Outdated
| return complex_column_expr(dataset, column_name, body) | ||
| elif isinstance(column_name, str) and QUOTED_LITERAL_RE.match(column_name): | ||
| return escape_literal(column_name[1:-1]) | ||
| elif isinstance(column_name, (list, tuple)): |
There was a problem hiding this comment.
I would move this case into complex_column_expr since this method is meant to deal with one individual column.
I want to be able to ask for |
|
Did you try this way ?
I suspect that should be trasformed into argaMax(('id', 'timestamp'), 'latest_event') |
|
I didn't try |
|
I see. groupby and selected columns are processed this way: While aggregations are processed differently: By expecting that an aggregate function will take one parameter only. Do you know if this is intentional ? |
I'm not sure to be honest — this (specifically the inconsistencies between |
|
So, while we figure this out, @markstory , could you please move the logic that manages this use case into complex_column_expr. We could at least try to be consistent with what column_expr does (manage one column) and what complex_column_expr does. |
|
@fpacifici I've had to retain the top-level condition in |
46d3e3e to
c227b48
Compare
| if is_function(column_name, 0): | ||
| return complex_column_expr(dataset, column_name, body) | ||
| elif isinstance(column_name, (list, tuple)) and aggregate: | ||
| return complex_column_expr(dataset, [aggregate, column_name, alias], body) |
There was a problem hiding this comment.
does it work if you just call this directly from api.py skipping column_expr ? Ideally this would work like for line 149 for functions, but for some reason we are already unpacking the expression before calling column_expr for aggregates so we cannot do that.
There was a problem hiding this comment.
Going directly to complex_column_expr from api.py ends up breaking aggregation handling for things like ['topK(4)', 'issue', 'aggregate'] as the topK(4) pattern isn't recognized as a function, until it gets into function_expr which isn't in use by complex_column_expr.
We have a use-case in sentry where we want to use `argMax()` which requires multiple columns to work. While this function is allowed in `selected_columns` it is actually an aggregate value and should probably be in `aggregations`. This expands the schema and query builder to allow lists of columns in aggregations functions. Refs getsentry/sentry#14321
c227b48 to
2c81329
Compare
We have a use-case in sentry where we want to use
argMax()which requires multiple columns to work. While this function is allowed inselected_columnsit is actually an aggregate value and should probably be inaggregations. This expands the schema and query builder to allow lists of columns in aggregations functions.This may all be a bad idea as I'm a snuba noob.
Refs getsentry/sentry#14321