[ML] fix random sampling background query consistency#83676
Merged
benwtrent merged 1 commit intoelastic:masterfrom Feb 8, 2022
Merged
[ML] fix random sampling background query consistency#83676benwtrent merged 1 commit intoelastic:masterfrom
benwtrent merged 1 commit intoelastic:masterfrom
Conversation
Collaborator
|
Pinging @elastic/ml-core (Team:ML) |
Collaborator
|
Pinging @elastic/es-analytics-geo (Team:Analytics) |
tveasey
approved these changes
Feb 8, 2022
Contributor
tveasey
left a comment
There was a problem hiding this comment.
LGTM (the main thing is we still ensure independent samples per shard which seeding with hash arranges for us).
weizijun
added a commit
to weizijun/elasticsearch
that referenced
this pull request
Feb 9, 2022
* upstream/master: (166 commits) Bind host all instead of just _site_ when needed (elastic#83145) [DOCS] Fix min/max agg snippets for histograms (elastic#83695) [DOCS] Add deprecation notice for system indices (elastic#83688) Cache ILM policy name on IndexMetadata (elastic#83603) [DOCS] Fix 8.0 breaking changes sort order (elastic#83685) [ML] fix random sampling background query consistency (elastic#83676) Move internal APIs into their own namespace '_internal' Runtime fields core-with-mapped tests support tsdb (elastic#83577) Optimize calculating the presence of a quorum (elastic#83638) Use switch expressions in EnableAllocationDecider and NodeShutdownAllocationDecider (elastic#83641) Note libffi error message in tmpdir docs (elastic#83662) Fix TransportDesiredNodesActionsIT batch tests (elastic#83659) [DOCS] Remove unused upgrade doc files (elastic#83617) [ML] Wait for model process to stop in stop deployment (elastic#83644) [ML] Fix submit after shutdown in process worker service (elastic#83645) Remove req/resp classes associated with HLRC (elastic#83599) Introduce index.version.compatibility setting (elastic#83264) Rename InternalTestCluster#getMasterNodeInstance (elastic#83407) Mute TimeSeriesIndexSearcherTests testCollectInOrderAcrossSegments (elastic#83648) Add rollover add max_primary_shard_docs condition (elastic#80981) ... # Conflicts: # x-pack/plugin/rollup/build.gradle # x-pack/plugin/rollup/src/test/java/org/elasticsearch/xpack/rollup/v2/RollupActionSingleNodeTests.java
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
There was a consistency bug where the documents returned by the created scorer could change while looking at the same shard. This can occur if multiple weights are created from the same query.
For scenarios like Significant Terms/Text, we need a consistent view of each shard when using the same probability and seed.
This commit ensures this by creating a new random value supplier seeded by the shard hash & seed.