Add preflight check to dynamic mapping updates #48817
Merged
DaveCTurner merged 8 commits into elastic:master on Nov 5, 2019
Today, if the primary discovers that an indexing request needs a mapping update, it sends the update to the master for validation and processing. If the put-mapping request is invalid, the master still processes it as a (no-op) cluster state update. A large number of indexing operations that result in invalid mapping updates can therefore overwhelm the master.

However, the primary already has a reasonably up-to-date mapping against which it can check the (approximate) validity of the put-mapping request before sending it to the master. For instance, a mapping update cannot remove fields, so if the primary detects that a mapping update would exceed the fields limit, it can reject the update itself and avoid bothering the master.

This commit adds a pre-flight check to the mapping update path so that the primary can discard obviously invalid put-mapping requests itself.

Fixes elastic#35564
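The reasoning in the description can be illustrated with a small, self-contained Java sketch. It is a simplified model rather than the Elasticsearch implementation: `PreflightSketch`, `knownFields`, `fieldLimit`, `preflight`, and `shouldSendToMaster` are all hypothetical names, and a mapping is reduced to a set of field names. The point it shows is that the check is a dry-run merge against the primary's local view, and that because fields can only be added, an update that exceeds the limit locally would also exceed it on the master.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical, simplified model of a primary-side pre-flight check.
// None of these names are Elasticsearch APIs.
final class PreflightSketch {
    // The primary's local (possibly slightly stale) view of the mapped fields.
    private final Set<String> knownFields;
    // Stand-in for the index's limit on the total number of fields.
    private final int fieldLimit;

    PreflightSketch(Set<String> knownFields, int fieldLimit) {
        this.knownFields = new HashSet<>(knownFields);
        this.fieldLimit = fieldLimit;
    }

    // Dry-run merge: computes the merged field set on a copy, so local state
    // is never changed, and throws if the result would exceed the limit.
    void preflight(Set<String> requiredFields) {
        Set<String> merged = new HashSet<>(knownFields);
        merged.addAll(requiredFields);
        if (merged.size() > fieldLimit) {
            throw new IllegalArgumentException(
                "limit of total fields [" + fieldLimit + "] would be exceeded");
        }
    }

    // Returns true if the update passed the local check and is worth
    // forwarding to the master; false if the primary can reject it itself.
    boolean shouldSendToMaster(Set<String> requiredFields) {
        try {
            preflight(requiredFields);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    int knownFieldCount() {
        return knownFields.size();
    }
}
```

A passing pre-flight check does not guarantee that the master will accept the update, since the primary's view may be stale; it only filters out requests that are certain to fail.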
Collaborator
Pinging @elastic/es-distributed (:Distributed/CRUD)
original-brownbear (Contributor) left a comment:
One question on the implementation, but looks good in general :)
server/src/main/java/org/elasticsearch/action/bulk/MappingUpdatePerformer.java
Outdated
original-brownbear approved these changes on Nov 3, 2019
original-brownbear (Contributor) left a comment:
LGTM :) Thanks David!
ywelsch approved these changes on Nov 4, 2019
        new CompressedXContent(result.getRequiredMappingUpdate(), XContentType.JSON, ToXContent.EMPTY_PARAMS),
        MapperService.MergeReason.MAPPING_UPDATE_PREFLIGHT);
} catch (Exception e) {
    logger.info("required mapping update failed during pre-flight check", e);
Contributor
Can you log the index name here as well?
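One way to act on this suggestion is to put the index name at the start of the log message. The sketch below is only an illustration under stated assumptions: it uses `java.util.logging` so that it runs without extra dependencies, whereas the snippet under review uses a different `logger`, and `PreflightLogging`, `failureMessage`, and `logPreflightFailure` are hypothetical names that do not appear in the pull request.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper; java.util.logging is used here only as a stand-in
// for the logger in the reviewed code.
final class PreflightLogging {
    private static final Logger LOGGER = Logger.getLogger(PreflightLogging.class.getName());

    // Builds the message with the index name up front so that failures
    // can be attributed to a specific index when reading the logs.
    static String failureMessage(String indexName) {
        return "[" + indexName + "] required mapping update failed during pre-flight check";
    }

    // Logs the failure at INFO level together with the causing exception.
    static void logPreflightFailure(String indexName, Exception e) {
        LOGGER.log(Level.INFO, failureMessage(indexName), e);
    }
}
```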
try {
    primary.mapperService().merge("_doc",
        new CompressedXContent(result.getRequiredMappingUpdate(), XContentType.JSON, ToXContent.EMPTY_PARAMS),
Contributor
It's unfortunate that we have to compress it here, just to uncompress and parse it again in the next step.
Member, Author
Yes, the interface to the MapperService is not what I expected.
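The redundant work discussed in this thread can be pictured with a small round trip. This is a sketch under assumptions, not the `CompressedXContent` implementation: it uses `java.util.zip` deflate streams as a stand-in for whatever compression the real class applies, and `CompressRoundTripSketch`, `compress`, and `uncompress` are hypothetical names. It shows that the caller compresses the mapping source only for the callee to restore the same bytes immediately before parsing.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical illustration of a compress-then-uncompress round trip.
final class CompressRoundTripSketch {
    // Caller side: compress the mapping source before handing it over.
    static byte[] compress(String json) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (DeflaterOutputStream deflater = new DeflaterOutputStream(out)) {
            deflater.write(json.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Callee side: restore the original source so that it can be parsed.
    static String uncompress(byte[] compressed) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (InputStream in = new InflaterInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buffer = new byte[4096];
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}
```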
DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this pull request on Nov 5, 2019
DaveCTurner added a commit that referenced this pull request on Nov 5, 2019
Fixes #35564
Backport of #48817
This was referenced Feb 3, 2020