DRAFT: LOAD/NULLIFY golden tests by GalLalouche · Pull Request #141821 · elastic/elasticsearch

GalLalouche · 2026-02-04T11:36:11Z

No description provided.

alex-spies · 2026-02-04T12:12:05Z

...xpack/esql/analysis/golden_tests/AnalyzerUnmappedGoldenTests/testFork/load/analysis.expected

+  |       \_EsRelation[employees][avg_worked_seconds{f}#25, birth_date{f}#26, emp_no{f}#27, first_name{f}#28, gender{f}#29, height{f}#30, height.float{f}#31, height.half_float{f}#32, height.scaled_float{f}#33, hire_date{f}#34, is_rehired{f}#35, job_positions{f}#36, languages{f}#37, languages.byte{f}#38, languages.long{f}#39, languages.short{f}#40, last_name{f}#41, salary{f}#42, salary_change{f}#43, salary_change.int{f}#44, salary_change.keyword{f}#45, salary_change.long{f}#46, still_hired{f}#47, does_not_exist{f}#48]
+  \_Limit[1000[INTEGER],false,false]
+    \_Project[[avg_worked_seconds{f(EsField)}#50, birth_date{f(DateEsField)}#51, emp_no{f(EsField)}#52, first_name{f(KeywordEsField)}#53, gender{f(KeywordEsField)}#54, height{f(EsField)}#55, height.float{f(EsField)}#56, height.half_float{f(EsField)}#57, height.scaled_float{f(EsField)}#58, hire_date{f(DateEsField)}#59, is_rehired{f(EsField)}#60, job_positions{f(KeywordEsField)}#61, languages{f(EsField)}#62, languages.byte{f(EsField)}#63, languages.long{f(EsField)}#64, languages.short{f(EsField)}#65, last_name{f(KeywordEsField)}#66, salary{f(EsField)}#67, salary_change{f(EsField)}#68, salary_change.int{f(EsField)}#69, salary_change.keyword{f(KeywordEsField)}#70, salary_change.long{f(EsField)}#71, still_hired{f(EsField)}#72, does_not_exist{r}#73, _fork{r}#49]]
+      \_Eval[[null[KEYWORD] AS does_not_exist#73]]


This is wrong. The field is not loaded in one of the fork branches because it wasn't mentioned inside the branch itself. But it's still needed to load the values for the final output - we cannot just fill with nulls.

The point here is that with FORK, the underlying index is the same in all branches. We mustn't assume that an unmapped field exists and try to load it in one branch but not in the other - either we assume the index has the field or we don't.

Subqueries are less obvious but have a similar problem:

SET unmapped_fields="load"; FROM (employees), (employees | KEEP *, unmapped_field)

It's more reasonable to say that unmapped_field should only be loaded in the second subquery as the subqueries should be independent.

However, in one subquery we say "employees actually has an unmapped field that we should load!", while in the other we ignore it. That's probably fine, except that the query plans for subqueries and FORK are related java classes, so we need to be careful during planning to distinguish between these cases.

alex-spies · 2026-02-04T12:39:51Z

...k/esql/analysis/golden_tests/AnalyzerUnmappedGoldenTests/testSubquery/load/analysis.expected

+\_Project[[emp_no{r}#0, language_code{r}#1]]
+  \_UnionAll[[avg_worked_seconds{r}#2, birth_date{r}#3, emp_no{r}#0, first_name{r}#4, gender{r}#5, height{r}#6, height.float{r}#7, height.half_float{r}#8, height.scaled_float{r}#9, hire_date{r}#10, is_rehired{r}#11, job_positions{r}#12, languages{r}#13, languages.byte{r}#14, languages.long{r}#15, languages.short{r}#16, last_name{r}#17, salary{r}#18, salary_change{r}#19, salary_change.int{r}#20, salary_change.keyword{r}#21, salary_change.long{r}#22, still_hired{r}#23, language_code{r}#1, language_name{r}#24, does_not_exist{r}#25]]
+    |_Project[[avg_worked_seconds{f}#26, birth_date{f}#27, emp_no{f}#28, first_name{f}#29, gender{f}#30, height{f}#31, height.float{f}#32, height.half_float{f}#33, height.scaled_float{f}#34, hire_date{f}#35, is_rehired{f}#36, job_positions{f}#37, languages{f}#38, languages.byte{f}#39, languages.long{f}#40, languages.short{f}#41, last_name{f}#42, salary{f}#43, salary_change{f}#44, salary_change.int{f}#45, salary_change.keyword{f}#46, salary_change.long{f}#47, still_hired{f}#48, language_code{r}#49, language_name{r}#50, does_not_exist{r}#51]]
+    | \_Eval[[null[INTEGER] AS language_code#49, null[KEYWORD] AS language_name#50, null[KEYWORD] AS does_not_exist#51]]


2 Problems!

Eval[[null[INTEGER] AS language_code#49

IMHO this one should actually be loaded from source, but we need to verify that this the expected behavior. This is more general and pertains to the case when a field is mapped as non-KEYWORD in one index and missing in another

~~Even if we load from _source in this case, we still need to autocast as well ( to int in this case).~~

Update: This is #141912

We should fail queries like this for now, because loading unmapped fields and the alignment logic of UnionAll that inserts null Eval for missing attributes don't play well together. There are probably bugs, and we saw cases where the the load logic implied that a field missing in one subquery is of KEYWORD type, even though it was properly mapped in another subquery, causing mayhem during type resolution downstream.

In fact, we likely need to fail subqueries altogether until we have this figured out, otherwise we might allow queries that will later require a breaking change.

Here's a reproducer where the union all resolution for subqueries and loading unmapped fields works against each other - does_not_exist is both marked for loading AND it's overridden as NULL by the union all alignment logic (see analyzed plan):

FROM employees, (FROM languages | KEEP language_code) | KEEP emp_no, language_code, does_not_exist

Update to 2: rather than failing the whole query if a field is unmapped in one index and mapped as non-keyword in another but not explicitly cast, we probably should be consistent with union types which treat this as an unsupported field and fill it with nulls - rather than failing the whole query.

alex-spies

Oops, probably should've waited until this gets un-drafted. Only had a quick look and realized this is still WIP. Will continue reviewing when we're ready :)

alex-spies · 2026-02-12T10:35:45Z

.idea/inspectionProfiles/Project_Default.xml

Unrelated change; should this be committed?

alex-spies · 2026-02-12T10:37:38Z

...ugin/esql/qa/server/src/main/java/org/elasticsearch/xpack/esql/qa/rest/EsqlSpecTestCase.java

+        if (query.toUpperCase().startsWith("SET") == false) {
+            query = "SET unmapped_fields=\"load\"; " + query;
+        }


Leftover to remove before this can be merged.

alex-spies · 2026-02-12T10:39:59Z

x-pack/plugin/esql/qa/testFixtures/src/main/resources/data/all-types.csv

This also looks like an ad-hoc change; this csv was used for old tests and should probably be reverted.

alex-spies · 2026-02-12T10:41:34Z

x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/action/EsqlCapabilities.java

+        /**
+         * Support for field aliases in mappings.
+         */
+        FIELD_ALIAS_SUPPORT,


Are we adding support for this in this PR? That should be an existing ability, no?

Not in CsvTests :)

alex-spies

Woops, I had uncommitted review comments. Committing now; they'll apply to whichever PR supersedes this, though.

alex-spies · 2026-02-27T10:23:28Z

...k/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/analysis/AnalyzerUnmappedTests.java

This file still has multiple // TODO: golden testing, where the test case is missing any actual assertions.

We should replace these test stubs by actual golden tests.

All but one of those have subqueries/forks/joins. Do we want to cement its behavior in golden tests?

Ah! Good catch. No, we probably don't, given that branching queries will be disabled for load for now.

But maybe we should mark them with @AwaitsFix and point to an issue? The tests run and "succeed" right now, giving a false sense of coverage, no?

alex-spies · 2026-03-02T14:58:40Z

x-pack/plugin/esql/qa/testFixtures/src/main/resources/unmapped-load.csv-spec

Drive-by, but this draft is mentioned in #141911 under "spec tests for -> TS", but I don't see any queries with TS. Let's add some?

GalLalouche · 2026-03-04T11:58:53Z

Moved to #143522 since the branch name was extremely misleading, and it made it hard on me to keep track.

) This PR increases the test coverage of unmapped fields and gildens pseudo-golden tests. * Moves pseudo-golden tests from AnalyzerUnmappedTests to AnalyzerUnmappedGoldenTests. * Add more golden tests. * Add more Spec/CSV tests. Continued from #141821.

…tic#143522) This PR increases the test coverage of unmapped fields and gildens pseudo-golden tests. * Moves pseudo-golden tests from AnalyzerUnmappedTests to AnalyzerUnmappedGoldenTests. * Add more golden tests. * Add more Spec/CSV tests. Continued from elastic#141821.

GalLalouche added 23 commits January 21, 2026 23:04

Late materialization golden tests

5036b99

Remove old test file

84976f5

Fix failing test

3410841

Small changes

4066eb4

Merge branch 'main' into tests/late_golden

54390d1

CR: Add minimum random transport version

5635f49

CR: Add ESQL query to the output

84a15de

Remove local node reduction tests in some cases

06d4661

Merge branch 'main' into tests/late_golden

2d87993

More CR Fixes

f82a95a

Merge branch 'main' into tests/late_golden

c4cec77

Fix bug with synthetic names

a0eace3

Add name IDs

5fec441

Synthetic names

a976a88

Refactor

6a98ff7

Golden test start

dd3ef6b

Fix set issue

e0b20db

TEMP

6aeccac

Fix query.esql

d47e18b

TEMP

465826c

Added more tests

4c8ae63

Add multi index tests

eefa9cf

TEMP

7f69234

GalLalouche requested a review from alex-spies February 4, 2026 11:36

elasticsearchmachine added the v9.4.0 label Feb 4, 2026

Push after potentially unmapped field hack

ac35237

alex-spies reviewed Feb 4, 2026

View reviewed changes

GalLalouche added 2 commits February 5, 2026 14:45

Add field aliases test

66cb1ed

Stuffs

221c4fa

GalLalouche mentioned this pull request Feb 5, 2026

ESQL: add tests and identify not-yet working cases for unmapped_fields="load" #141911

Closed

25 tasks

alex-spies mentioned this pull request Feb 6, 2026

ESQL: unmapped_fields="load" needs consistent behavior for branching commands (FORK, subqueries, views) #142033

Open

GalLalouche and others added 2 commits February 6, 2026 20:13

Added some nullify tests, fix CsvTests bug

9aaca40

[CI] Auto commit changes from spotless

7a3b810

alex-spies reviewed Feb 12, 2026

View reviewed changes

This was referenced Feb 27, 2026

ESQL: support for mapping-unavailable fields meta-issue #138888

Closed

ESQL: Prevent pushdown of unmapped fields in filters and sorts #143460

Merged

ES|QL: Fix KQL/QSTR with unmapped fields in NULLIFY mode #143399

Merged

GalLalouche closed this Mar 3, 2026

GalLalouche deleted the tests/golden_name_ids branch March 3, 2026 19:41

alex-spies reviewed Mar 4, 2026

View reviewed changes

GalLalouche mentioned this pull request Mar 4, 2026

ESQL: Increases unmapped fields test coverage (golden and spec) #143522

Merged

Conversation

GalLalouche commented Feb 4, 2026

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

alex-spies Feb 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

alex-spies left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

alex-spies left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

GalLalouche commented Mar 4, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

alex-spies Feb 4, 2026 •

edited

Loading