Track source for arrays of objects#108417
Conversation
|
Hi @kkrik-es, I've created a changelog YAML for you. |
… fix/synthetic-source/array
# Conflicts: # server/src/main/java/org/elasticsearch/index/mapper/ObjectMapper.java
| XContentParserConfiguration configuration, | ||
| XContentBuilder builder | ||
| ) throws IOException { | ||
| DocumentParserContext subcontext = context.switchParser( |
There was a problem hiding this comment.
This is a new usage of the switchParser() method. However it is currently deprecated. I think if we like to use then we should un-deprecate this method. I think this new usage is a valid use case. cc: @javanna
There was a problem hiding this comment.
Looking at the concerns we had with switchParser, i think your assessment is right. Also, its javadocs says
we are actively deprecating and removing the ability to pass complex objects to multifields, so try and avoid using this method
Given that this was committed 4 years ago, the "actively" part no longer holds true. I reviewed the current usages of switchParser and they seem ok to me. I have a slight concern that we are now using the method in a general utility method, that may spread its usage instead of limiting it to the only places where it's strictly needed. I am not sure what to do about that though, and I am ok with the current plan. Thanks for raising this.
… fix/synthetic-source/array
server/src/main/java/org/elasticsearch/index/mapper/DocumentParserContext.java
Outdated
Show resolved
Hide resolved
| hasValue = false; | ||
| if (ignoredValues != null) { | ||
| for (IgnoredSourceFieldMapper.NameValue ignored : ignoredValues) { | ||
| b.field(ignored.getFieldName()); |
There was a problem hiding this comment.
I didn't think about this before - doesn't this break the alphabetic sorting of fields in the source? We say in documentation that it is sorted.
There was a problem hiding this comment.
This is a good point. I'll work on a follow-up PR to restore ordering.
There was a problem hiding this comment.
Well, I gave it a try here. While it's easy to order fields at the object level, we don't do so for the contents of the object array that get stored and retrieved verbatim. I added tests for this, not sure if there's an easy way to fix this..
There was a problem hiding this comment.
I think that's exactly what we need, we only specify ordering of fields.
I was thinking of keeping it experimental initially, until Observability provides feedback on its use. @felixbarny fyi. |
martijnvg
left a comment
There was a problem hiding this comment.
Left a few more questions and comments.
server/src/main/java/org/elasticsearch/index/mapper/XContentDataHelper.java
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/mapper/DocumentParserContext.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/mapper/DocumentParserContext.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/mapper/XContentDataHelper.java
Show resolved
Hide resolved
|
The |
This PR uses option
store_array_sourcefor objects to track the source of (sub)object arrays in synthetic mode. This allows preserving the original source, as synthetic mode merges, deduplicates and sorts array entries by default. The downside is that the whole object array (including children objects and fields) gets stored twice, while there's additional overhead while synthesizing source at query time.This functionality will be experimental initially, to get some experience before documenting and opening up its use.
Fixes #90708