Skip to content

ESQL: Fetch min transport version we're going to target when resolving fields#135633

Merged
nik9000 merged 18 commits intoelastic:mainfrom
nik9000:esql_fetch_transportversions
Oct 7, 2025
Merged

ESQL: Fetch min transport version we're going to target when resolving fields#135633
nik9000 merged 18 commits intoelastic:mainfrom
nik9000:esql_fetch_transportversions

Conversation

@nik9000
Copy link
Copy Markdown
Member

@nik9000 nik9000 commented Sep 29, 2025

Relates #131108

Fetch min transport version we're going to target when resolving fields.

@elasticsearchmachine elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Sep 29, 2025
@elasticsearchmachine
Copy link
Copy Markdown
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

),
listener
);
executeRequest(task, request, listener);
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made @dnhatn add this the last time we changed this. I have copy-and-paste blindness and I said "please don't copy and paste this file, please make hooks, even if it's more work." Because I can never see the changes buried in copy-and-pasted code. And he, being a good sport, did.

Now I'm making a change and copy and pasting. I tried making hooks and though "this is unreadable garbage". It would have been so much worse. Not sure if I was wrong, but I'm pretty sure the hooks are just too hard to read once you make a change like this.

Now! The big copy-and-pasted bit is doExecuteForked - I could make a class DoExecuteForked with places we could override. That feels like it might be worth it. I'm not sure! I didn't do it so folks could look here and have opinions.

@nik9000 nik9000 requested a review from dnhatn September 29, 2025 15:13
)
)
listener.delegateFailureAndWrap((l, response) -> {
LOGGER.debug("minimum transport version {}", response.minTransportVersion());
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is all the PR does! It just plugs it in.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Er, well, it just leaves this here so we can plug it in after.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the idea adding the min version to the Configuration of the query? Some other mechanism to propagate it?
I'm thinking about how this will be implemented in the sum agg overflow case (Which happened to work based on the Configuration in the old PR)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Step 1 for me is to use it with fields right here. But step 2 is to drag it out, presumably via IndexResolution and then onwards however we decide. I'm not sure if it should be a thing that is serialized back down to the data nodes. I'd kind of like to keep it on the coordinating node so we're flexible about changing it later.

Copy link
Copy Markdown
Member

@dnhatn dnhatn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left one comment, but this looks good. Thank you for taking care of this, Nik!

if (response.minTransportVersion() == null) {
minTransportVersion.set(null);
} else {
minTransportVersion.accumulateAndGet(response.minTransportVersion(), TransportVersion::min);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may hit NullPointerException if we receive a response from an old cluster then another response from a new cluster. Also, can we handle both branches within a single accumulateAndGet call to avoid data races when processing two responses concurrently?

listener.onResponse(new FieldCapabilitiesResponse(Collections.emptyList(), Collections.emptyList()));
}
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am afraid of idea of copying that much if TransportFieldCapabilitiesAction especially while it is being actively changed for cps/flat resolution.

Have we considered collecting versions using

for each cluster alias:
    transportService.getRemoteClusterService().getConnection(clusterAlias).getTransportVersion()
and
    clusterService.state().getMinTransportVersion()

This should be possible after resolving all cluster aliases participating in the query.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can return an incorrect minTransportVersion if the remote cluster is a mixed-version cluster.

Copy link
Copy Markdown
Contributor

@idegtiarenko idegtiarenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not comfortable with copying such amount of code that is currently changing.
Lets try exploring alternative approach.

@dnhatn
Copy link
Copy Markdown
Member

dnhatn commented Sep 30, 2025

If you prefer to delay copying this code, we can add minTransportVersion to the existing field-caps response instead of the new resolve response.

@nik9000
Copy link
Copy Markdown
Member Author

nik9000 commented Oct 1, 2025

If you prefer to delay copying this code, we can add minTransportVersion to the existing field-caps response instead of the new resolve response.

We talked about that and it didn't feel worth it. Only ESQL wants it and we're worried folks will see it and think "oh, this is how I get that version!" But the request is expensive. I'll see if I can work through the copy and paste today.

)
)
listener.delegateFailureAndWrap((l, response) -> {
LOGGER.debug("minimum transport version {}", response.minTransportVersion());
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the idea adding the min version to the Configuration of the query? Some other mechanism to propagate it?
I'm thinking about how this will be implemented in the sum agg overflow case (Which happened to work based on the Configuration in the old PR)

import java.io.IOException;

public class EsqlResolveFieldsResponse extends ActionResponse {
private static final TransportVersion CREATED = TransportVersion.fromName("esql_resolve_fields_response_created");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: The field name feels quite generic. Even if it's only used locally, I would go with something like RESOLVE_FIELDS_RESPONSE_CREATED or RESOLVE_FIELDS_RESPONSE_CREATED_TV.
Even if the new TVs fields are safe as private fields, they're still quite important to end up mixing the names by some misunderstanding or typo.
Just my 2 cents!

concreteIndices = indexNameExpressionResolver.concreteIndexNames(projectState.metadata(), localIndices);
}

if (concreteIndices.length == 0 && remoteClusterIndices.isEmpty()) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're not requesting the remote minTV if the remote has no indices? Do we assume that "no indices" means "no remote" or something like that?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't be sending to that remote at all in that case. Checking.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line is for "we didn't match any local indices and didn't detect any remotes to check".

But, in general, yeah, if we reach out to a remote to run the thing we use it's version. Even if no indices match the expression. Just trying to use the remote opts you in to checking its version.

@nik9000 nik9000 requested a review from idegtiarenko October 3, 2025 12:29
@nik9000
Copy link
Copy Markdown
Member Author

nik9000 commented Oct 3, 2025

@idegtiarenko and I've talked over slack and zoom and stuff. He's on holiday today so I'll summarize the conversation here.

He's convinced we need something, that does this, though not convinced this is the right way to do it. I'm more convinced than he is, but that's how conversations go. Either way, we can use this for now, and either keep using it for a long time (my guess) or modify it later (@idegtiarenko's guess). But, for now, this gives us something to fix a lot of the problems we have. So I'll try to get this in today or Monday.

@nik9000 nik9000 added auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) v9.2.1 labels Oct 3, 2025
@nik9000 nik9000 added auto-backport Automatically create backport pull requests when merged and removed auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) labels Oct 7, 2025
@nik9000
Copy link
Copy Markdown
Member Author

nik9000 commented Oct 7, 2025

I have an approval from Nhat and a passing build. I'll get this in. I think Github is confused because it's making me "bypass rules". Or maybe our rule configuration is confused. I dunno, but I think merging is good here.

@nik9000 nik9000 merged commit 7720e63 into elastic:main Oct 7, 2025
34 checks passed
@elasticsearchmachine
Copy link
Copy Markdown
Collaborator

💔 Backport failed

Status Branch Result
9.2 Commit could not be cherrypicked due to conflicts

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 135633

nik9000 added a commit to nik9000/elasticsearch that referenced this pull request Oct 7, 2025
…g fields (elastic#135633)

Relates elastic#131108

Fetch min transport version we're going to target when resolving fields.
nik9000 added a commit that referenced this pull request Oct 15, 2025
…g fields (#135633) (#136132)

Relates #131108

Fetch min transport version we're going to target when resolving fields.
@idegtiarenko
Copy link
Copy Markdown
Contributor

Was this backported?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Analytics/ES|QL AKA ESQL auto-backport Automatically create backport pull requests when merged >non-issue Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v9.2.1 v9.3.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants