Skip to content

ESQL: Different approach to keeping track of fields coming from indices#143944

Merged
elasticsearchmachine merged 16 commits intoelastic:mainfrom
astefan:generativeIT_refine5
Mar 12, 2026
Merged

ESQL: Different approach to keeping track of fields coming from indices#143944
elasticsearchmachine merged 16 commits intoelastic:mainfrom
astefan:generativeIT_refine5

Conversation

@astefan
Copy link
Copy Markdown
Contributor

@astefan astefan commented Mar 10, 2026

This updates the Generative IT infra to keep track of fields that come from indices, helpful especially for full-text functions which, for now, can work only on fields that come from real indices.
Also, there is some cleanup of errors that shouldn't be valid anymore, as the corresponding issues have been closed.
AI-assisted PR.

@astefan astefan added >test Issues or PRs that are addressing/adding tests :Analytics/ES|QL AKA ESQL v9.4.0 labels Mar 10, 2026
@elasticsearchmachine elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Mar 10, 2026
@elasticsearchmachine
Copy link
Copy Markdown
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@astefan astefan requested a review from luigidellaquila March 10, 2026 13:46
Copy link
Copy Markdown
Member

@luigidellaquila luigidellaquila left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @astefan, I like the approach very much.
This is not an easy topic, probably as complex as field resolution.

I left a couple of comments regarding how we deal with existing columns, I think there is room for improvement, but this is already a step forward compared to the previous implementation.

}
default -> {
// For commands that don't create named columns (KEEP, DROP, SORT, LIMIT, WHERE, etc.),
// any column not in previous is from the command (e.g. ENRICH, LOOKUP_JOIN, CHANGE_POINT)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately this is only half of the story: a column could be in the previous output and still be overwritten by these commands.
To be safe, we should let single commands define which columns they "touched" and invalidate all of them; worst case, we can invalidate all the columns for commands we don't know. This will restrict the test surface a lot, but it will avoid false negatives.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for thinking more about this aspect. I appreciate it.

I think this all boils down to the strategy we want/each adopts with generative testing: I tend to start with fewer restrictions and adjust along the way. This surfaces issues for new things we add (like full-text functions support, like new commands being added). If the adjustments become a burden, then more like a shotgun approach is needed (like the one you mentioned - columns "touched" become invalid). The drawback is the constant monitoring of generative testing.
For the moment, I'll keep the code here as is and see what/if things become unmanageable.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes sense 👍

Set<String> createdColumns = new HashSet<>();

switch (commandName) {
case "eval" -> {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also had a special check for MV_EXPAND before.
Is an expanded column still valid for full text functions?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it is not. I rushed into removing the mv_expand check seeing that the issue is partially fixed, but that specific issue had a fix that forbade mv_expand with full-text functions. I added back the check.

Copy link
Copy Markdown
Member

@luigidellaquila luigidellaquila left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@astefan astefan added the auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) label Mar 11, 2026
@elasticsearchmachine elasticsearchmachine merged commit a4de207 into elastic:main Mar 12, 2026
36 checks passed
@astefan astefan deleted the generativeIT_refine5 branch March 12, 2026 10:03
jfreden pushed a commit to jfreden/elasticsearch that referenced this pull request Mar 12, 2026
…es (elastic#143944)

This updates the Generative IT infra to keep track of fields that come
from indices, helpful especially for full-text functions which, for now,
can work only on fields that come from real indices. Also, there is some
cleanup of errors that shouldn't be valid anymore, as the corresponding
issues have been closed. AI-assisted PR.
michalborek pushed a commit to michalborek/elasticsearch that referenced this pull request Mar 23, 2026
…es (elastic#143944)

This updates the Generative IT infra to keep track of fields that come
from indices, helpful especially for full-text functions which, for now,
can work only on fields that come from real indices. Also, there is some
cleanup of errors that shouldn't be valid anymore, as the corresponding
issues have been closed. AI-assisted PR.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Analytics/ES|QL AKA ESQL auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) >test Issues or PRs that are addressing/adding tests v9.4.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants