Skip to content

Disallow unmapped_fields="load" with full-text search/MATCH#143810

Merged
shmuelhanoch merged 4 commits intoelastic:mainfrom
shmuelhanoch:141927-esql-disallow-unmapped-load-with-match
Mar 17, 2026
Merged

Disallow unmapped_fields="load" with full-text search/MATCH#143810
shmuelhanoch merged 4 commits intoelastic:mainfrom
shmuelhanoch:141927-esql-disallow-unmapped-load-with-match

Conversation

@shmuelhanoch
Copy link
Copy Markdown
Contributor

  • Fail at analysis instead of returning empty results.

  • Add unit test testUnmappedFieldsLoadWithMatchOperatorFails in AnalyzerUnmappedTests.

Closes #141927

@shmuelhanoch shmuelhanoch force-pushed the 141927-esql-disallow-unmapped-load-with-match branch from 80a166b to 324bc55 Compare March 9, 2026 07:02
@shmuelhanoch shmuelhanoch marked this pull request as ready for review March 11, 2026 16:18
@elasticsearchmachine elasticsearchmachine added the needs:triage Requires assignment of a team area label label Mar 11, 2026
@shmuelhanoch shmuelhanoch requested a review from alex-spies March 11, 2026 16:18
@shmuelhanoch shmuelhanoch force-pushed the 141927-esql-disallow-unmapped-load-with-match branch from 324bc55 to db073ff Compare March 11, 2026 16:20
@elasticsearchmachine elasticsearchmachine added Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) and removed needs:triage Requires assignment of a team area label labels Mar 11, 2026
@elasticsearchmachine
Copy link
Copy Markdown
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@shmuelhanoch shmuelhanoch added >bug needs:triage Requires assignment of a team area label and removed Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) labels Mar 11, 2026
@elasticsearchmachine elasticsearchmachine added Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) and removed needs:triage Requires assignment of a team area label labels Mar 11, 2026
@elasticsearchmachine
Copy link
Copy Markdown
Collaborator

Hi @shmuelhanoch, I've created a changelog YAML for you.

@shmuelhanoch shmuelhanoch force-pushed the 141927-esql-disallow-unmapped-load-with-match branch from 7c13171 to 17c80e0 Compare March 11, 2026 16:47
* Reproducer for #141927: with unmapped_fields=load, full-text search (match operator) must fail at analysis
* instead of returning empty results.
*/
public void testUnmappedFieldsLoadWithMatchOperatorFails() {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we add few more tests here? Not necessarily more test functions, because we can verificationFailure(...) many times within the same function and pass many queries, I'd suspect there is a good number of interesting cases we want covered.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, I added one for each forbidden full-text function.

throw new VerificationException(
List.of(
Failure.fail(
fullTextFunctions.get(0),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like only the first function gets reported. But also, is finding a full-text function anywhere in query necessary and sufficient to reject it? Or do we want the reject only if they're applied to the unmapped field?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only rejecting if applied to an unmapped field would certainly be better, but I wouldn't go there for now.

There's FTFs like KQL which can refer to an unmapped field inside the query string, as in WHERE KQL("author: Faulkner") (taken straight from our docs). author might be unmapped and it's not fully clear how we should behave then.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change to report every full-text function in the plan.

Copy link
Copy Markdown
Contributor

@alex-spies alex-spies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @shmuelhanoch ! This gets the job done, but I'd like to increase test coverage and would also like to place the new verification into a slightly more appropriate place before merging. I added details below :)

BitSet partialMetrics = new BitSet(FeatureMetric.values().length);
return verify(execute(plan), gatherPreAnalysisMetrics(plan, partialMetrics));
LogicalPlan analyzed = execute(plan);
if (context().unmappedResolution() == UnmappedResolution.LOAD) {
Copy link
Copy Markdown
Contributor

@alex-spies alex-spies Mar 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not the best place to add an additional verification. Analyzer#analyze is a pretty high-level method which has a simple workflow:

  1. analyze
  2. verify and gather usage data
    We should keep it this way and not add low-level verifications to high-level execution logic.

We have 2 typical places for verifications that come to mind:

  • inside Verifier#verify (there's a section marked with Concrete verifications)
  • inside functions and commands themselves by implementing PostAnalysisPlanVerificationAware and PostAnalysisVerificationAware.

Now, neither of these places have access to the query settings. We could plug the query settings in there, and in fact it might become required anyway.

However, I think there's a simpler way.

We could add a verification step inside Verifier#verify which checks for the presence of full-text functions in case that the load setting had any effect on the query plan. More specifically, load will place FieldAttribute instances into the EsRelation which contain a PotentiallyUnmappedKeywordEsField. Such an attribute is thus marked as "will try to load from _source during execution". If we find any such field attribute in an EsRelation, we know that load was used, we do try to load some fields from _source, and we can thus fail the query if it contains disallowed functions or commands.

This also means that we should add positive tests for queries where load did not have any effect (because all fields in the query were, in fact, mapped), because full-text functions will not be forbidden in such queries.

Copy link
Copy Markdown
Contributor

@alex-spies alex-spies Mar 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update: #144115 adds a new section inside Verifier#verify where we disallow not-yet allowed commands. I think we can wait on #144115 and move this PR's new validation next to it.

This is a little simpler than my original suggestion in that it will keep this PR's logic: disallow full-text functions if the query uses SET unmapped_fields="load". No need to check EsRelations for the presence of PotentiallyUnmappedKeywordEsFields in FieldAttributes.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I moved the check to a Verifier#verify.

Comment on lines +3304 to +3305
* Reproducer for #141927: with unmapped_fields=load, full-text search (match operator) must fail at analysis
* instead of returning empty results.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Appreciate the comment pointing to the issue!

super nit: I opened #144121 to explain that we plan to increase support for full-text functions. We could link to it as well so that future-us knows that the test will need changes in the future.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Linked #144121 in the test Javadoc!

@shmuelhanoch shmuelhanoch force-pushed the 141927-esql-disallow-unmapped-load-with-match branch 2 times, most recently from c81800e to 2990547 Compare March 16, 2026 03:22
Copy link
Copy Markdown
Contributor

@alex-spies alex-spies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Almost there! I have only few final comments and, once addressed, this certainly LGTM!

}

if (unmappedResolution == UnmappedResolution.LOAD) {
checkLoadModeDisallowedCommands(plan, failures);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's the existing check by Andrei.

I suggest we add another helper method checkLoadModeDisallowedFunctions and call it here.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, extracted into checkLoadModeDisallowedFunctions here

Comment on lines +148 to +150
// Disallow full-text search when unmapped_fields=load. We do not restrict to "only when the FTF
// is applied to an unmapped field" because FTFs like KQL can reference unmapped fields inside the
// query string (e.g. KQL("author: Faulkner")), and the desired behavior there is unclear.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++, thanks for commenting with the reason for the code here. I think this will help future us (and others!) understand the choices we made.

List<FullTextFunction> fullTextFunctions = new ArrayList<>();
plan.forEachExpressionDown(FullTextFunction.class, fullTextFunctions::add);
if (fullTextFunctions.isEmpty() == false) {
String message = "unmapped_fields=\"load\" is not allowed with full-text search (MATCH, match operator `:`, "
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than mentioning MATCH specifically in the error message, why don't we tell the user which full-text function they used?

Ideally, our tests should also assert that the right full text function was mentioned in the error message.

When reporting user input in error messages, we typically enclose it in brackets, like so:

unmapped_fields="load" does not support full-text search function [KQL]; use "fail" or "nullify"

except that the KQL needs to be replaced by the actual function that was used.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I fixed this to show the actual function name and check it in the tests

@shmuelhanoch shmuelhanoch force-pushed the 141927-esql-disallow-unmapped-load-with-match branch 2 times, most recently from 74504ad to 19c271f Compare March 17, 2026 07:05
Copy link
Copy Markdown
Contributor

@alex-spies alex-spies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks @shmuelhanoch !

…lastic#141927)

Fail at analysis instead of returning empty results. Add unit test
testUnmappedFieldsLoadWithMatchOperatorFails in AnalyzerUnmappedTests.
- Move unmapped_fields=load + full-text verification from Analyzer to
  Verifier (new verify(plan, partialMetrics, unmappedResolution) overload).
- Report all full-text function sites in the plan, not only the first.
- Test each forbidden full-text function: match op, match, match_phrase,
  multi_match, qstr, kql, knn (capability-guarded where needed).
- Add verificationFailureWithMapping helper for KNN test; Javadoc elastic#144121.
- Extract full-text check into checkLoadModeDisallowedFunctions helper
- Error message shows actual function name in brackets (e.g. [KQL], [MATCH])
- Tests assert correct function name in each failure message
@shmuelhanoch shmuelhanoch force-pushed the 141927-esql-disallow-unmapped-load-with-match branch from 19c271f to 57aaf9b Compare March 17, 2026 17:39
@shmuelhanoch shmuelhanoch enabled auto-merge (squash) March 17, 2026 17:39
@shmuelhanoch shmuelhanoch merged commit b52f511 into elastic:main Mar 17, 2026
36 checks passed
michalborek pushed a commit to michalborek/elasticsearch that referenced this pull request Mar 23, 2026
…143810)

* ESQL: fail when unmapped_fields=load is used with full-text search (elastic#141927)

Fail at analysis instead of returning empty results. Add unit test
testUnmappedFieldsLoadWithMatchOperatorFails in AnalyzerUnmappedTests.

* Review fixes:

- Move unmapped_fields=load + full-text verification from Analyzer to
  Verifier (new verify(plan, partialMetrics, unmappedResolution) overload).
- Report all full-text function sites in the plan, not only the first.
- Test each forbidden full-text function: match op, match, match_phrase,
  multi_match, qstr, kql, knn (capability-guarded where needed).
- Add verificationFailureWithMapping helper for KNN test; Javadoc elastic#144121.

* Fix Verifier Javadoc and AnalyzerUnmappedTests for CI

* More review fixes:

- Extract full-text check into checkLoadModeDisallowedFunctions helper
- Error message shows actual function name in brackets (e.g. [KQL], [MATCH])
- Tests assert correct function name in each failure message
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Analytics/ES|QL AKA ESQL >non-issue Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v9.4.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ESQL: disallow unmapped_fields="load" with full-text search/MATCH

4 participants