Skip to content

ESQL: INLINESTATS docs#134480

Merged
elasticsearchmachine merged 12 commits intoelastic:mainfrom
astefan:inlinestats_docs
Sep 12, 2025
Merged

ESQL: INLINESTATS docs#134480
elasticsearchmachine merged 12 commits intoelastic:mainfrom
astefan:inlinestats_docs

Conversation

@astefan
Copy link
Copy Markdown
Contributor

@astefan astefan commented Sep 10, 2025

Fixes #124718

@astefan astefan added >docs General docs changes :Analytics/ES|QL AKA ESQL v9.2.0 labels Sep 10, 2025
@astefan astefan marked this pull request as ready for review September 10, 2025 18:24
@astefan astefan requested a review from bpintea September 10, 2025 18:24
@elasticsearchmachine elasticsearchmachine added Team:Docs Meta label for docs team Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) labels Sep 10, 2025
@elasticsearchmachine
Copy link
Copy Markdown
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@elasticsearchmachine
Copy link
Copy Markdown
Collaborator

Pinging @elastic/core-docs (Team:Docs)

@astefan astefan requested a review from leemthompo September 10, 2025 19:40
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Sep 11, 2025

@github-actions
Copy link
Copy Markdown
Contributor

ℹ️ Important: Docs version tagging

👋 Thanks for updating the docs! Just a friendly reminder that our docs are now cumulative. This means all 9.x versions are documented on the same page and published off of the main branch, instead of creating separate pages for each minor version.

We use applies_to tags to mark version-specific features and changes.

Expand for a quick overview

When to use applies_to tags:

✅ At the page level to indicate which products/deployments the content applies to (mandatory)
✅ When features change state (e.g. preview, ga) in a specific version
✅ When availability differs across deployments and environments

What NOT to do:

❌ Don't remove or replace information that applies to an older version
❌ Don't add new information that applies to a specific version without an applies_to tag
❌ Don't forget that applies_to tags can be used at the page, section, and inline level

🤔 Need help?

Copy link
Copy Markdown
Contributor

@bpintea bpintea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Andrei for taking on the docs!

Comment on lines +10 to +11
The command is identical to [`STATS`](/reference/query-languages/esql/commands/stats-by.md) except that it does not reduce
the number of columns in the output table.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternative / optional:

Suggested change
The command is identical to [`STATS`](/reference/query-languages/esql/commands/stats-by.md) except that it does not reduce
the number of columns in the output table.
The command is identical to [`STATS`](/reference/query-languages/esql/commands/stats-by.md) except that it preserves all the columns from the input table.

INLINE STATS [column1 =] expression1 [WHERE boolean_expression1][,
...,
[columnN =] expressionN [WHERE boolean_expressionN]]
[BY grouping_expression1[, ..., grouping_expressionN]]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is inherited, but [BY grouping_expression1[, ..., grouping_expressionN]] isn't as "detailed" as it could/should be:

[BY [grouping_name1 =] grouping_expression1[, 
    ..., 
    [grouping_nameN = ] grouping_expressionN]]

This is relevant below [§].

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, you are right. I've changed it.


`grouping_expressionX`
: An expression that outputs the values to group by.
If its name coincides with one of the computed columns, that column will be ignored.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[§] In this case, the "name collision" might not be clear, since, as given, that's en expression.

Also, for INLINE STATS we have the non-computed columns.

Suggested change
If its name coincides with one of the computed columns, that column will be ignored.
If its name coincides with one of the existing or computed columns, that column will be overridden by this one.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another option: If the name matches an existing or computed column, this new column will replace it.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you both. I've used @bpintea's suggestion.

:::{include} ../examples/inlinestats.csv-spec/avg-salaries-where.md
:::

Specifying the output column name is optional. If not specified, the new column
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional: I'd drop this line. It's specified already in the synopsis and unlike for STATS, we don't add an example here.


**Limitations**

- [`CATEGORIZE`](/reference/query-languages/esql/functions-operators/grouping-functions.md#esql-categorize) grouping function is not
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- [`CATEGORIZE`](/reference/query-languages/esql/functions-operators/grouping-functions.md#esql-categorize) grouping function is not
- The [`CATEGORIZE`](/reference/query-languages/esql/functions-operators/grouping-functions.md#esql-categorize) grouping function is not



The [`STATS`](/reference/query-languages/esql/commands/stats-by.md) command supports these aggregate functions:
The [`STATS`](/reference/query-languages/esql/commands/stats-by.md) and [`INLINE STATS`](/reference/query-languages/esql/commands/inlinestats-by.md) command supports these aggregate functions:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The [`STATS`](/reference/query-languages/esql/commands/stats-by.md) and [`INLINE STATS`](/reference/query-languages/esql/commands/inlinestats-by.md) command supports these aggregate functions:
The [`STATS`](/reference/query-languages/esql/commands/stats-by.md) and [`INLINE STATS`](/reference/query-languages/esql/commands/inlinestats-by.md) commands support these aggregate functions:

Copy link
Copy Markdown
Member

@leemthompo leemthompo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, few (mainly language clarification) suggestions and couple questions from me :)

Comment on lines +69 to +70
- to: 'reference/query-languages/esql/commands/inlinestats-by.md'
anchors: {'esql-inlinestats-by'}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't think you need this redirect, this was only required when we broke out the commands into standalone sub-pages, but this is a brand new page


`grouping_expressionX`
: An expression that outputs the values to group by.
If its name coincides with one of the computed columns, that column will be ignored.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another option: If the name matches an existing or computed column, this new column will replace it.

If its name coincides with one of the computed columns, that column will be ignored.

`boolean_expressionX`
: The condition that must be met for a row to be included in the evaluation of `expressionX`.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
: The condition that must be met for a row to be included in the evaluation of `expressionX`.
: The condition that determines which rows are included when evaluating `expressionX`.

Comment on lines +47 to +52
The `INLINE STATS` processing command groups rows according to a common value
(what comes after `BY`) and calculates one or more aggregated values over the
grouped rows. The output table contains the same number of rows as the input
table and the command just adds new columns or overrides any existent ones with
the same name to the result. The resulting calculated values are matched to the
input rows according to the common value(s) (also known as grouping key(s)).
Copy link
Copy Markdown
Member

@leemthompo leemthompo Sep 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The `INLINE STATS` processing command groups rows according to a common value
(what comes after `BY`) and calculates one or more aggregated values over the
grouped rows. The output table contains the same number of rows as the input
table and the command just adds new columns or overrides any existent ones with
the same name to the result. The resulting calculated values are matched to the
input rows according to the common value(s) (also known as grouping key(s)).
The `INLINE STATS` processing command groups rows according to a common value
(also known as the grouping key), specified after `BY`, and calculates one or more
aggregated values over the grouped rows. The output table contains the same
number of rows as the input table. The command only adds new columns or overrides existing columns with the same name as the result.

suggestion for concision

Comment on lines +54 to +57
In case there are overlapping column names between the newly added columns and the
existing ones, besides overriding the existing columns, there can be a change in
the column order. The new columns are added/moved so that they appear in the order
they are defined in the `INLINE STATS` command.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
In case there are overlapping column names between the newly added columns and the
existing ones, besides overriding the existing columns, there can be a change in
the column order. The new columns are added/moved so that they appear in the order
they are defined in the `INLINE STATS` command.
When using a `BY` clause, columns are reordered to match the structure of the
`INLINE STATS` command - calculated columns appear first, followed by grouping
columns.

I might (probably) have misunderstood, but are naming collisions + overrides separate to column reordering? If so my suggestion tries to clarify that, otherwise, just ignore :)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my mind, the original wording explains inline stats behavior better. But there is a small change I need to make to the original text still.

besides overriding the existing columns, there can be

to

besides overriding the existing columns values, there can be

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd simplify the first sentence anyway for readability, maybe something like:

- In case there are overlapping column names between the newly added columns and the existing ones, besides overriding the existing columns, there can be a change in the column order.
+ If column names overlap, existing column values may be overridden and column order may change.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, much better. Changed.

Comment on lines +75 to +78
Calculating a statistic and grouping by the values of another column; note also
that `languages` column is moved as the last column in the output since it is
used as grouping key (the `KEEP` command before `INLINE STATS` had `languages`
set as the second column):
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Calculating a statistic and grouping by the values of another column; note also
that `languages` column is moved as the last column in the output since it is
used as grouping key (the `KEEP` command before `INLINE STATS` had `languages`
set as the second column):
The following example shows how to calculate a statistic on one column and group
by the values of another column.
:::{note}
The `languages` column moves to the last position in the output table because it is
the grouping key.
:::

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

attempt to make the language flow clearer, take what you need if feel it isn't quite accurate!

Comment on lines +83 to +85
Omitting `BY` calculates the aggregation applied over the entire dataset, the
order of the existent columns is preserved and a new column with the calculated
maximum salary value is added as the last column:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Omitting `BY` calculates the aggregation applied over the entire dataset, the
order of the existent columns is preserved and a new column with the calculated
maximum salary value is added as the last column:
The following example shows how to calculate an aggregation over the entire dataset
by omitting `BY`. The order of the existing columns is preserved and a new column
with the calculated maximum salary value is added as the last column:

:::{include} ../examples/inlinestats.csv-spec/max-salary-without-by.md
:::

It’s possible to calculate multiple values in more complex queries:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
It’s possible to calculate multiple values in more complex queries:
The following example shows how to calculate multiple aggregations with multiple grouping keys:

:::{include} ../examples/inlinestats.csv-spec/multi-agg-multi-grouping.md
:::

To filter the rows that go into an aggregation, use the `WHERE` clause:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
To filter the rows that go into an aggregation, use the `WHERE` clause:
The following example shows how to filter which rows are used for each aggregation, using the `WHERE` clause:

{{esql}} only supports the UTC timezone.


## INLINE STATS limitations [esql-limitations-inlinestats]
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to duplicate limitations here? No strong opinions from me.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a preference for doing it like this, having some experience with users interaction for ES EQL and ES SQL projects where a separate Limitations page helped Support reason much better about some specific things not being available in those projects.

But, if there are other principles behind documentation for ES|QL, I can definitely follow them. Just le me know.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fair enough 👍

and calculates one or more aggregated values over the grouped rows. The results
are appended as new columns to the input rows.

The command is identical to [`STATS`](/reference/query-languages/esql/commands/stats-by.md) except that it does not reduce
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we mention (somewhere) that the STATS will break down MVs fields it groups by, by individual SVs, while INLINE STATS links back to the MV field?
I'm referring to the behaviour noted in this test (which, IMO is correct, answering to the question there).
Though maybe that's intuitive.

Copy link
Copy Markdown
Contributor Author

@astefan astefan Sep 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I left this with no MV specification because I regarded that test result something to discuss further.

I added a mention in the csv-spec file about docs update when/if we decide on the MV behavior.

@astefan astefan requested a review from leemthompo September 12, 2025 12:30
Copy link
Copy Markdown
Member

@leemthompo leemthompo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks @astefan!

@astefan astefan added the auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) label Sep 12, 2025
@elasticsearchmachine elasticsearchmachine merged commit f969349 into elastic:main Sep 12, 2025
34 checks passed
@astefan astefan deleted the inlinestats_docs branch September 12, 2025 14:19
gmjehovich pushed a commit to gmjehovich/elasticsearch that referenced this pull request Sep 18, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Analytics/ES|QL AKA ESQL auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) >docs General docs changes Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) Team:Docs Meta label for docs team v9.2.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ESQL: Ensure good docs coverage for INLINESTATS

4 participants