You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
This meta issue tracks some of the improvements and cleanups that we identified while working on the runtime fields project (#59332):
Remove the last usage of SearchContext#mapperService: it is used for nested objects
Remove QueryShardContext#getObjectMapper : it is mostly used for nested objects and it indirectly exposes retrieving MappedFieldType going through FieldMappers, which are not aware of runtime fields
Clean up SearchLookup handling in SearchExecutionContext: a separate lookup object needs to be created for the fetch phase which is passed through to MappedFieldFype#valueFetcher, yet the (wrong) lookup can also be retrieved directly from the SearchExecutionContext
Revise NestedScope on SearchExecutionContext
Clean up special handling for allowUnmappedFields and mapUnmappedAsText in SearchExecutionContext, they can be changed from consumers which seems dangerous and are only needed for the percolator.
Review null checks on what SearchExecutionContext#getFieldType returns. They most likely need to be converted to use QueryShardCOntext#isFieldMapped instead which does not take the allowUnmappedField and mapUnmappedAsString flags into account
Review calls to FieldsVisitor#postProcess and make sure that they are all necessary, in some cases we may be able to skip the call as it does nothing depending on which fields we are processing
Currently, when an index is created without providing mappings it will have a null document mapper until mappings are provided for it, or a document has indexed in it. MapperService exposes the DocumentMapper through which the mappings can be accessed. The fact that DocumentMapper can be null leads to null checks in many places and potential NPEs in callers. We'd like to eagerly create a document mapper for a new index, as soon as it gets created. This has the implication that we would lose the semantics around the fact that when a document mapper is null for an index, it means the index is empty.
That would also allow us to apply index sorting when the index is created instead of delaying it to the document mapper creation.
We have a special case to handle the case of an empty doc being indexed, which triggers a dynamic mapping update although the mappings are empty, to ensure that whenever a doc is registered in an index, it has mappings and its corresponding document mapper will not be null then. This special case can be removed if we make sure that every index always has mappings
Mapping is visible from outside of its package but all of its methods are called from within the same package. Review usages and possibly adjust methods visibility accordingly. Also, its instance fields are all package private and accessed directly without going through getters, yet the root object mapper has also a public getter that is only used within the same package. (Remove redundant methods from DocumentMapper #69803)
Make ContentPath less confusing
Rework ObjectMapper and RootObjectMapper to not rely on clone
Make MappedFieldType responsible for building their own field caps objects, and add in parent/nested information
Reduce the number of merge methods on MapperService, and clarify when they should be called
Merging does not require the entire MapperService: ideally all of the merge methods exposed by MapperService can be factored out to a lightweight MappingMerger component (merge itself was already removed from DocumentMapper and replaced by calling Mapping#merge directly). This means that it will no longer be required to build a full MapperService in the master node to execute merges when the indices are not allocated to it.
Can we remove getMapper and fieldMappers from MappingLookup? Why would one have to go through mappers by name? Can we consolidate how to access fields in the mappings? What needs access to mappers and objectmappers? Can we have a unified lookup structure?
Do we even need a DocumentMapper given that MappingLookup became almost a copy of it? We should probably use MappingLookup whenever possible and re-evaluate what we need DocumentMapper for.\
Rewrite tests (e.g. MapperServiceTestCase) to rely less, or not at all, on DocumentMapper. Many tests build a MapperService only to use a small portion of it, or merge mappings, or retrieve document mapper from it.
Caching cleanups
Remove MapperService from SearchExecutionContext entirely. We'd prefer there not be an easy way for the context to get a new snapshot of the mapping. OTOH we don't really want to merge all of the immutable mapping stuff into MappingLookup. We have to decide how to remove it.
Remove SearchContext's reference to MapperService. This includes exposing NestedDocuments in SearchExecutionContext so that DefaultSearchContext takes it from there. And cleaning up access to indexService.mapperService() in DefaultSearchContext
Replace or remove the parsing function passed to MappingLookup, which hides a reference to DocumentMapper.
Pipe all volatile read methods on MapperService through #mappingLookup() to make sure we perform a single volatile read. Or we could remove them entirely in favor the caller going through mappingLookup(). If the caller keeps the mappingLookup for the duration of the operation then the caller will get a consistent snapshot of the mapping during the entire process. This can potentially become a big change but it can be done in steps, for instance we can start from duplicated methods that are exposed in MapperService and DocumentMapper (e.g. simpleMatchToFullName), that could be rather accessed through retrieving MappingLookup.
MapperService#getObjectMapper is only used in tests, it can probably be replaced by MappingLookup#getObjectMapper
This meta issue tracks some of the improvements and cleanups that we identified while working on the runtime fields project (#59332):
SearchContext#mapperService: it is used for nested objectsQueryShardContext#getObjectMapper: it is mostly used for nested objects and it indirectly exposes retrievingMappedFieldTypegoing throughFieldMappers, which are not aware of runtime fieldsSearchLookuphandling inSearchExecutionContext: a separate lookup object needs to be created for the fetch phase which is passed through toMappedFieldFype#valueFetcher, yet the (wrong) lookup can also be retrieved directly from theSearchExecutionContextNestedScopeonSearchExecutionContextallowUnmappedFieldsandmapUnmappedAsTextinSearchExecutionContext, they can be changed from consumers which seems dangerous and are only needed for the percolator.SearchExecutionContext#getFieldTypereturns. They most likely need to be converted to useQueryShardCOntext#isFieldMappedinstead which does not take theallowUnmappedFieldandmapUnmappedAsStringflags into accountFieldsVisitor#postProcessand make sure that they are all necessary, in some cases we may be able to skip the call as it does nothing depending on which fields we are processingMapperServiceexposes theDocumentMapperthrough which the mappings can be accessed. The fact thatDocumentMappercan be null leads to null checks in many places and potential NPEs in callers. We'd like to eagerly create a document mapper for a new index, as soon as it gets created. This has the implication that we would lose the semantics around the fact that when a document mapper is null for an index, it means the index is empty.That would also allow us to apply index sorting when the index is created instead of delaying it to the document mapper creation.
MapperRegistrytoindex.mapperpackage? (Move MapperRegistry to index.mapper package #69805)Mappingis visible from outside of its package but all of its methods are called from within the same package. Review usages and possibly adjust methods visibility accordingly. Also, its instance fields are all package private and accessed directly without going through getters, yet the root object mapper has also a public getter that is only used within the same package. (Remove redundant methods from DocumentMapper #69803)ContentPathless confusingObjectMapperandRootObjectMapperto not rely onclonemergemethods onMapperService, and clarify when they should be calledMapperService: ideally all of the merge methods exposed byMapperServicecan be factored out to a lightweightMappingMergercomponent (merge itself was already removed fromDocumentMapperand replaced by callingMapping#mergedirectly). This means that it will no longer be required to build a fullMapperServicein the master node to execute merges when the indices are not allocated to it.NestedDocumentsto depend onMappingLookupUse a mapping snapshot for fetching nested docs #66877 (@nik9000)Caching cleanups
MapperServicefromSearchExecutionContextentirely. We'd prefer there not be an easy way for the context to get a new snapshot of the mapping. OTOH we don't really want to merge all of the immutable mapping stuff intoMappingLookup. We have to decide how to remove it.SearchContext's reference toMapperService. This includes exposingNestedDocumentsinSearchExecutionContextso thatDefaultSearchContexttakes it from there. And cleaning up access toindexService.mapperService()inDefaultSearchContextMappingLookup, which hides a reference toDocumentMapper.MapperServicethrough#mappingLookup()to make sure we perform a single volatile read. Or we could remove them entirely in favor the caller going throughmappingLookup(). If the caller keeps themappingLookupfor the duration of the operation then the caller will get a consistent snapshot of the mapping during the entire process. This can potentially become a big change but it can be done in steps, for instance we can start from duplicated methods that are exposed inMapperServiceandDocumentMapper(e.g.simpleMatchToFullName), that could be rather accessed through retrievingMappingLookup.MapperService#getObjectMapperis only used in tests, it can probably be replaced byMappingLookup#getObjectMapperFieldMappers"obviously" immutable.