Handle missing values in painless#30975
Conversation
4f83586 to
2ee048c
Compare
|
@elasticmachine run gradle build tests |
|
Pinging @elastic/es-core-infra |
|
@elasticmachine run gradle build tests |
Throw an exception for `doc['field'].value` if this document is missing a value for the `field`. For 7.0: This is the default behaviour from 7.0 For 6.x: To enable this behavior from 6.x, a user can set a jvm.option: `-Des.script.exception_for_missing_value=true` on a node. If a user does not enable this behavior, a deprecation warning is logged on start up. Closes elastic#29286
2ee048c to
ca9f0a5
Compare
rjernst
left a comment
There was a problem hiding this comment.
This is much simpler! I have a few comments.
modules/lang-painless/build.gradle
Outdated
| } | ||
|
|
||
| Task additionalClusterTest = tasks.create( | ||
| name: "additionalClusterTest", type: RestIntegTestTask) |
There was a problem hiding this comment.
Can we use a less generic name? Something specific to testing doc values missing values?
modules/lang-painless/build.gradle
Outdated
| distribution = 'integ-test-zip' | ||
| module project // add the lang-painless module itself | ||
| module project.project(':modules:mapper-extras') | ||
| } |
There was a problem hiding this comment.
What tests is this actually running? It seems like this is running all the same rest tests. I would expect a test with in this PR which checks a deprecation message is emitted when the sysprop is false.
| bar: | ||
| script: | ||
| source: "(doc['missing'].value?.length() ?: 0) + params.x;" | ||
| source: "(doc['missing'].size() == 0 ? 0 : doc['missing'].value?.length()) + params.x;" |
There was a problem hiding this comment.
I think the value?.length() can just be value.length() since it can't be null?
modules/mapper-extras/build.gradle
Outdated
| classname 'org.elasticsearch.index.mapper.MapperExtrasPlugin' | ||
| } | ||
| test.configure { | ||
| systemProperty 'es.script.exception_for_missing_value', 'true' |
There was a problem hiding this comment.
Maybe it would be better to set this for all tests (eg in BuildPlugin), and override it for one test to check the bwc behavior?
|
@rjernst Thanks for your feedback. Can you please continue to review when you have time. I have tried to address your comments. Instead of integration tests with different system settings, I have decided to add unit tests. |
rjernst
left a comment
There was a problem hiding this comment.
This looks good (just one minor comment). However, I would also like to ask that the system property be changed to use the prefix es.scripting, instead of es.script. This is to match that I used in #31441, so that we have consistency for scripting related bwc sysprops.
| - `null` if a `field` has a geo datatype | ||
| - `""` if a `field` has a binary datatype | ||
|
|
||
| IMPORTANT: Starting in 7.0, `doc['field'].value` throws an exception if |
There was a problem hiding this comment.
It would be good to note somewhere here that the user can call doc['field'].size() == 0 to check if values are missing.
|
@elasticmachine run gradle build tests |
|
This PR had a broken link that I've fixed in de27365 |
Throw an exception for `doc['field'].value` if this document is missing a value for the `field`. For 7.0: This is the default behaviour from 7.0 For 6.x: To enable this behavior from 6.x, a user can set a jvm.option: `-Des.script.exception_for_missing_value=true` on a node. If a user does not enable this behavior, a deprecation warning is logged on start up. Closes elastic#29286
* master: [TEST] Mute SlackMessageTests.testTemplateRender Docs: Explain closing the high level client [ML] Re-enable memory limit integration tests (#31328) [test] disable packaging tests for suse boxes Add nio transport to security plugin (#31942) XContentTests : Insert random fields at random positions (#30867) Force execution of fetch tasks (#31974) Fix unreachable error condition in AmazonS3Fixture (#32005) Tests: Fix SearchFieldsIT.testDocValueFields (#31995) Add Expected Reciprocal Rank metric (#31891) [ML] Get ForecastRequestStats doc in RestoreModelSnapshotIT (#31973) SQL: Add support for single parameter text manipulating functions (#31874) [ML] Ensure immutability of MlMetadata (#31957) Tests: Mute SearchFieldsIT.testDocValueFields() muted tests due to #31940 Work around reported problem in eclipse (#31960) Move build integration tests out of :buildSrc project (#31961) Tests: Remove use of joda time in some tests (#31922) [Test] Reactive 3rd party tests on CI (#31919) SQL: Support for escape sequences (#31884) SQL: HAVING clause should accept only aggregates (#31872) Docs: fix typo in datehistogram (#31972) Switch url repository rest tests to new style requests (#31944) Switch reindex tests to new style requests (#31941) Docs: Added note about cloud service to installation and getting started [DOCS] Removes alternative docker pull example (#31934) Add Snapshots Status API to High Level Rest Client (#31515) ingest: date_index_name processor template resolution (#31841) Test: fix null failure in watcher test (#31968) Switch test framework to new style requests (#31939) Switch low level rest tests to new style Requests (#31938) Switch high level rest tests to new style requests (#31937) [ML] Mute test failing due to Java 11 date time format parsing bug (#31899) [TEST] Mute SlackMessageTests.testTemplateRender Fix assertIngestDocument wrongfully passing (#31913) Remove unused reference to filePermissionsCache (#31923) rolling upgrade should use a replica to prevent relocations while running a scroll HLREST: Bundle the x-pack protocol project (#31904) Increase logging level for testStressMaybeFlush Added lenient flag for synonym token filter (#31484) [X-Pack] Beats centralized management: security role + licensing (#30520) HLRest: Move xPackInfo() to xPack().info() (#31905) Docs: add security delete role to api call table (#31907) [test] port archive distribution packaging tests (#31314) Watcher: Slack message empty text (#31596) [ML] Mute failing DetectionRulesIT.testCondition() test Fix broken NaN check in MovingFunctions#stdDev() (#31888) Date: Add DateFormatters class that uses java.time (#31856) [ML] Switch native QA tests to a 3 node cluster (#31757) Change trappy float comparison (#31889) Fix building AD URL from domain name (#31849) Add opaque_id to audit logging (#31878) re-enable backcompat tests add support for is_write_index in put-alias body parsing (#31674) Improve release notes script (#31833) [DOCS] Fix broken link in painless example Handle missing values in painless (#30975) Remove the ability to index or query context suggestions without context (#31007) Ingest: Enable Templated Fieldnames in Rename (#31690) [Docs] Fix typo in the Rollup API Quick Reference (#31855) Ingest: Add ignore_missing option to RemoveProc (#31693) Add template config for Beat state to X-Pack Monitoring (#31809) Watcher: Add ssl.trust email account setting (#31684) Remove link to oss-MSI (#31844) Painless: Restructure Definition/Whitelist (#31879) HLREST: Add x-pack-info API (#31870)
Throw an exception for `doc['field'].value` if this document is missing a value for the `field`. For 7.0: This is the default behaviour from 7.0 For 6.x: To enable this behavior from 6.x, a user can set a jvm.option: `-Des.script.exception_for_missing_value=true` on a node. If a user does not enable this behavior, a deprecation warning is logged on start up. Closes #29286
* 6.x: Fix rollup on date fields that don't support epoch_millis (#31890) Revert "Introduce a Hashing Processor (#31087)" (#32179) [test] use randomized runner in packaging tests (#32109) Painless: Fix caching bug and clean up addPainlessClass. (#32142) Fix BwC Tests looking for UUID Pre 6.4 (#32158) (#32169) Call setReferences() on custom referring tokenfilters in _analyze (#32157) Add more contexts to painless execute api (#30511) Add EC2 credential test for repository-s3 (#31918) Fix CP for namingConventions when gradle home has spaces (#31914) Convert Version to Java - clusterformation part1 (#32009) Fix Java 11 javadoc compile problem Improve docs for search preferences (#32098) Configurable password hashing algorithm/cost(#31234) (#32092) [DOCS] Update TLS on Docker for 6.3 ESIndexLevelReplicationTestCase doesn't support replicated failures but it's good to know what they are Switch distribution to new style Requests (#30595) Build: Skip jar tests if jar disabled Build: Move shadow customizations into common code (#32014) Painless: Add PainlessClassBuilder (#32141) Fix accidental duplication of bwc test for script behavior Handle missing values in painless (#30975) (#31903) Build: Make additional test deps of check (#32015) Painless: Fix Bug with Duplicate PainlessClasses (#32110) Adjust translog after versionType removed in 7.0 (#32020) Disable C2 from using AVX-512 on JDK 10 (#32138) [Rollup] Add new capabilities endpoint for concrete rollup indices (#32111) Mute :qa:mixed-cluster indices.stats/10_index/Index - all’ [ML] Wait for aliases in multi-node tests (#32086) Ensure to release translog snapshot in primary-replica resync (#32045) Docs: Fix missing example script quote (#32010) Add Index UUID to `/_stats` Response (#31871) (#32113) [ML] Move analyzer dependencies out of categorization config (#32123) [ML][DOCS] Add missing 6.3.0 release notes (#32099) Updates the build to gradle 4.9 (#32087) Update monitoring template version to 6040099 (#32088) Fix put mappings java API documentation (#31955) Add exclusion option to `keep_types` token filter (#32012)
Throw an exception for
doc['field'].valueif this document is missing a value for the
field.For 7.0:
This is the default behaviour from 7.0
For 6.x:
To enable this behavior from 6.x, a user can set a jvm.option:
-Des.script.exception_for_missing_value=trueon a node.If a user does not enable this behavior, a deprecation warning is logged on start up.
Motivation:
The reason for this change is to correct the current situation where missing values are arbitrary handled in scripts: some types were returning
null, some types were returningfalse,0or `epoch. We wanted to standardize it.Another reason was that to make it compatible how Lucene was handling missing values in DocValues. Lucene used to have a contract that NumericDocValuesshould return 0 when the document doesn’t have a value. But later a bitset of documents that had a value was added and now we have iterators of only docs with a value.
Closes #29286