Store source for nested objects#108818
Conversation
|
Hi @kkrik-es, I've created a changelog YAML for you. |
…ct' into fix/synthetic-source/nested-object
|
Pinging @elastic/es-storage-engine (Team:StorageEngine) |
martijnvg
left a comment
There was a problem hiding this comment.
Looks good, I left two comments.
| } | ||
|
|
||
| if (context.parent().isNested()) { | ||
| if (context.mappingLookup().isSourceSynthetic() && context.getClonedSource() == false) { |
There was a problem hiding this comment.
Question about the added code here. Is the reason that this is needed in case there is an object field that is mapped as nested field? The array case is then covered by parseNonDynamicArray(...)?
There was a problem hiding this comment.
Right, just in case there's an entry not using an array. Should be rare but it can happen.
There was a problem hiding this comment.
In that case is the || mapper instanceof NestedObjectMapper addition at line 612 required?
There was a problem hiding this comment.
It is, that section covers the usual case where the nested object contains an array.
There was a problem hiding this comment.
If a nested object has already been added to ignored source, then I think we don't need to do anything for that json object contains an array? Since it has already been added to ignored source as part of the object?
There was a problem hiding this comment.
I believe we either get an array or an object, not an object wrapping an array. I verified this in the unit test, isn't that the valid syntax.
There was a problem hiding this comment.
I see without || mapper instanceof NestedObjectMapper we only store the first element of an array. So subsequent objects in an array would not get recorded.
There was a problem hiding this comment.
Maybe add this is as a comment?
| boolean isRoot = context.parent() instanceof RootObjectMapper; | ||
| context.addIgnoredField( | ||
| new IgnoredSourceFieldMapper.NameValue( | ||
| isRoot ? arrayFieldName : context.parent().name() + "." + arrayFieldName, | ||
| isRoot ? 0 : context.parent().fullPath().length() + 1, | ||
| XContentDataHelper.encodeXContentBuilder(tuple.v2()) | ||
| ) | ||
| ); |
There was a problem hiding this comment.
Maybe turn this logic into a helper method? I think this appears now several times in this file?
| token = context.parser().currentToken(); | ||
| parser = context.parser(); | ||
| } | ||
| context = context.createNestedContext((NestedObjectMapper) context.parent()); |
There was a problem hiding this comment.
Should the DocumentParserContext constructor that is used eventually from createNestedContext(...) method also copy clonedSource field from provided context?
There was a problem hiding this comment.
Good catch, this needs to be propagated.
| } | ||
|
|
||
| if (context.parent().isNested()) { | ||
| if (context.mappingLookup().isSourceSynthetic() && context.getClonedSource() == false) { |
There was a problem hiding this comment.
If a nested object has already been added to ignored source, then I think we don't need to do anything for that json object contains an array? Since it has already been added to ignored source as part of the object?
|
|
||
| boolean booleanValue = randomBoolean(); | ||
| int intValue = randomInt(); | ||
| var syntheticSource = syntheticSource(documentMapper, b -> { |
There was a problem hiding this comment.
Can we also assert the Lucene document that get created here? I think it can be useful to assert the actual IndexableField instances for _ignored_source field. Maybe the syntheticSource(...) method can optionally accept a Consumer<LuceneDocument> parameter, that allows to the test to assert the field instances?
There was a problem hiding this comment.
Hm not sure what would be the benefit, the contents are encoded. The decoded version should show up in the printed source that we already assert on.
There was a problem hiding this comment.
That is true, even if for some reason we end up with duplicate ignored stored fields. (duplicate array elements would be printed)
Covers nested objects with fields or arrays of objects. Adds support for tracking the source of arrays in disabled objects, too.
Related to #106825, #108417