Stop and relocate sliced reindex on shutdown #143183
Pull request overview
This PR extends reindex relocation-on-shutdown support to sliced reindex by having slice workers stop with per-slice resume info, then letting the leader relocate/resume the whole task.
Changes:
- Add leader-state logic to aggregate per-slice `ResumeInfo` and trigger relocation when any slice is resumable.
- Propagate relocation eligibility to sliced `ReindexRequest` instances and wire `TaskManager` into `Reindexer` to let slice workers consult the leader's relocation node choice.
- Expand/adjust unit + integration tests to cover local sliced/unsliced and remote unsliced relocation flows.
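As a rough illustration of the first bullet, the sketch below shows one way a leader could collect per-slice resume info and decide to relocate when at least one slice is resumable. It is not the code from this PR: `SlicedRelocationSketch`, `SliceResult`, `aggregateResumeInfo`, and `shouldRelocate` are hypothetical names, and the resume info is modelled as a plain `String` rather than the real `ResumeInfo` type.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

public class SlicedRelocationSketch {

    /** Hypothetical stand-in for a slice outcome: finished, or stopped with resume info. */
    record SliceResult(int sliceId, Optional<String> resumeInfo) {
        static SliceResult completed(int sliceId) {
            return new SliceResult(sliceId, Optional.empty());
        }

        static SliceResult stopped(int sliceId, String resumeInfo) {
            return new SliceResult(sliceId, Optional.of(resumeInfo));
        }
    }

    /** Collects the resume info of every slice that stopped early, keyed by slice id. */
    static Map<Integer, String> aggregateResumeInfo(List<SliceResult> results) {
        Map<Integer, String> bySlice = new LinkedHashMap<>();
        for (SliceResult result : results) {
            result.resumeInfo().ifPresent(info -> bySlice.put(result.sliceId(), info));
        }
        return bySlice;
    }

    /** The leader relocates as soon as any slice reports resume info (assumed rule). */
    static boolean shouldRelocate(List<SliceResult> results) {
        return aggregateResumeInfo(results).isEmpty() == false;
    }

    public static void main(String[] args) {
        List<SliceResult> results = List.of(
            SliceResult.completed(0),
            SliceResult.stopped(1, "cursor-for-slice-1"),
            SliceResult.stopped(2, "cursor-for-slice-2")
        );
        System.out.println("relocate: " + shouldRelocate(results));
        System.out.println("resume info: " + aggregateResumeInfo(results));
    }
}
```

Slices that already completed contribute nothing to the aggregated map in this sketch, so a resumed task would only need to restart the slices that were stopped.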
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| server/src/test/java/org/elasticsearch/index/reindex/LeaderBulkByScrollTaskStateTests.java | Adds unit tests validating aggregated per-slice resume info behavior for leader relocation. |
| server/src/main/java/org/elasticsearch/index/reindex/ReindexRequest.java | Ensures relocation eligibility is preserved when creating per-slice requests. |
| server/src/main/java/org/elasticsearch/index/reindex/LeaderBulkByScrollTaskState.java | Adds relocation supplier plumbing + aggregation of slice resume info into a leader-level carrier response. |
| modules/reindex/src/test/java/org/elasticsearch/reindex/ReindexerTests.java | Updates test wiring for new TaskManager dependency. |
| modules/reindex/src/main/java/org/elasticsearch/reindex/TransportReindexAction.java | Passes TaskManager into Reindexer. |
| modules/reindex/src/main/java/org/elasticsearch/reindex/Reindexer.java | Implements leader/worker relocation supplier setup for sliced tasks; leader relocation handling now supported. |
| modules/reindex/src/main/java/org/elasticsearch/reindex/AbstractAsyncBulkByScrollAction.java | Allows relocation stop/resume-info emission for sliced workers and ensures response status is accurate for leader aggregation. |
| modules/reindex-management/src/internalClusterTest/java/org/elasticsearch/reindex/management/ReindexRelocationIT.java | Refactors/extends IT coverage to include sliced local relocation and clarifies expectations for leader slice status reporting. |
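The `ReindexRequest.java` row in the table above mentions preserving relocation eligibility when per-slice requests are created. A minimal sketch of that idea follows, assuming a simplified request type; `SliceRequestSketch`, `Request`, `forSlices`, and the `eligibleForRelocation` field are hypothetical and do not mirror the actual `ReindexRequest` API.

```java
import java.util.ArrayList;
import java.util.List;

public class SliceRequestSketch {

    /** Hypothetical, simplified request; the real ReindexRequest carries far more state. */
    record Request(String sourceIndex, int sliceId, int totalSlices, boolean eligibleForRelocation) {}

    /** Builds one request per slice, copying the parent's relocation eligibility into each. */
    static List<Request> forSlices(Request parent, int totalSlices) {
        List<Request> slices = new ArrayList<>(totalSlices);
        for (int id = 0; id < totalSlices; id++) {
            slices.add(new Request(parent.sourceIndex(), id, totalSlices, parent.eligibleForRelocation()));
        }
        return slices;
    }

    public static void main(String[] args) {
        Request parent = new Request("source-index", 0, 1, true);
        for (Request slice : forSlices(parent, 3)) {
            System.out.println(slice);
        }
    }
}
```

If the flag were dropped during slicing, workers would have no way to know they are allowed to stop with resume info, which is presumably why the PR makes the propagation explicit.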
modules/reindex/src/main/java/org/elasticsearch/reindex/Reindexer.java
Outdated
modules/reindex/src/main/java/org/elasticsearch/reindex/Reindexer.java
Outdated
server/src/main/java/org/elasticsearch/index/reindex/LeaderBulkByScrollTaskState.java
Outdated
server/src/main/java/org/elasticsearch/index/reindex/LeaderBulkByScrollTaskState.java
Outdated
…cations * upstream/main: (35 commits) Create ARM bulk sqrI8 implementation (elastic#142461) Rework get-snapshots predicates (elastic#143161) Refactor downsampling fetchers and producers (elastic#140357) ESQL: Unmute test and add extra logging to generative test validation (elastic#143168) Fix metadata fields being nullified/loaded by unmapped_fields setting (elastic#143155) Determine remote cluster version (elastic#142494) Populate failure message for aborted clones (elastic#143206) Allow kibana_system role to read and manage logs streams (elastic#143053) Mute org.elasticsearch.xpack.esql.CsvIT test {csv-spec:eval.DocsLength} elastic#143224 Mute org.elasticsearch.xpack.esql.CsvIT test {csv-spec:eval.DocsByteLength} elastic#143223 Mute org.elasticsearch.xpack.esql.CsvIT test {csv-spec:docs.DocsBitLength} elastic#143222 Fix FloatVectorScorerSupplier bulkScore bug (elastic#143211) ESQL: Add data node execution for external sources (elastic#143209) [ESQL] Cleanup commands docs (elastic#143058) [ML]Fix latest transforms disregarding updates when sort and sync fields are non-monotonic (elastic#142856) Mute org.elasticsearch.index.mapper.IpFieldMapperTests testSyntheticSourceInObject elastic#143212 Tests: Fix StoreDirectoryMetricsIT (elastic#143084) ESQL: Add distribution strategy for external sources (elastic#143194) CSV IT spec (elastic#142585) Fix VectorScorerOSQBenchmark.score to read corrections properly (elastic#143137) ...
…cations * upstream/main: Warn on API key version mismatch (elastic#143127) Fixed wrong malformed value ordering in synthetic source tests (elastic#143187) [ML] Fix: required_native_memory_bytes Calculated with Wrong Allocation Count (elastic#143077) Add configureBenchmarkLogging calls across the various benchmarks (elastic#143185) Mute org.elasticsearch.xpack.esql.CsvIT test {csv-spec:k8s-timeseries-avg-over-time.Avg_over_time_aggregate_metric_double_implicit_casting} elastic#143292 Give system role permission to invoke shard refresh (elastic#143190) Mute testSyntheticSourceWithTranslogSnapshot (elastic#143260) Adds ResumeInfo Tests (elastic#142769) Use a static method to configure benchmark logging (elastic#143056) add connectors release notes (elastic#142884) Add CI triage guidance for AI agents (elastic#142994) ESQL: Data sources: ZSTD, BZIP2 (elastic#143228) [ES|QL] Channels issue when an agg is called with the same field (elastic#142180) (elastic#142269) Add support for project routing in reindex requests (elastic#142240)
…cations * upstream/main: (60 commits) Use batches for other bulk vector benchmarks (elastic#143167) Mute org.elasticsearch.xpack.esql.qa.mixed.MixedClusterEsqlSpecIT test {csv-spec:lookup-join.MvJoinKeyOnTheLookupIndexAfterStats} elastic#143388 Mute org.elasticsearch.snapshots.ConcurrentSnapshotsIT testBackToBackQueuedDeletes elastic#143387 [Inference API] Parse endpoint metadata from persisted endpoints (elastic#143081) Add cluster formation doc to DistributedArchitectureGuide (elastic#143318) Fix flattened root block loader null expectation (elastic#143238) Unmute ValueSourceReaderTypeConversionTests testLoadAll (elastic#143189) ESQL: Add split coalescing for many small files (elastic#143335) Unmute mixed-cluster spatial parse warning test (elastic#143186) Fix zero-size estimate in BytesRefBlock null test (elastic#143258) Make DataType and DataFormat top-level enums (elastic#143312) Add support for steps to change the target index name for later steps (elastic#142955) Set mayContainDuplicates flag to test deduplication (elastic#143375) ESQL: Fix Driver search load millis as nanos bug (elastic#143267) Mute org.elasticsearch.xpack.esql.qa.mixed.MixedClusterEsqlSpecIT test {csv-spec:lookup-join.LookupJoinWithMixPushableAndUnpushableFilters} elastic#143378 ESQL: Forbid MV_EXPAND before full text functions (elastic#143249) ESQL: Fix unresolved name pattern (elastic#143210) Implement boxplot queryDSL aggregation for exponential_histograms (elastic#143026) Add prefetching to x64 bulk vector implementations (elastic#142387) Make large segment vector tests resilient to memory constraints (elastic#143366) ...
...t/src/internalClusterTest/java/org/elasticsearch/reindex/management/ReindexRelocationIT.java
Pinging @elastic/es-distributed (Team:Distributed)
Pull request overview
Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.
server/src/main/java/org/elasticsearch/index/reindex/LeaderBulkByScrollTaskState.java
server/src/main/java/org/elasticsearch/index/reindex/WorkerBulkByScrollTaskState.java
samxbr left a comment
Mostly looks good, I just have some small-ish comments.
modules/reindex/src/main/java/org/elasticsearch/reindex/Reindexer.java
Outdated
server/src/main/java/org/elasticsearch/index/reindex/LeaderBulkByScrollTaskState.java
Outdated
server/src/main/java/org/elasticsearch/index/reindex/LeaderBulkByScrollTaskState.java
...t/src/internalClusterTest/java/org/elasticsearch/reindex/management/ReindexRelocationIT.java
...t/src/internalClusterTest/java/org/elasticsearch/reindex/management/ReindexRelocationIT.java
Outdated
modules/reindex/src/main/java/org/elasticsearch/reindex/TransportReindexAction.java
Outdated
...t/src/internalClusterTest/java/org/elasticsearch/reindex/management/ReindexRelocationIT.java
Outdated
...t/src/internalClusterTest/java/org/elasticsearch/reindex/management/ReindexRelocationIT.java
…cations * upstream/main: (56 commits) Mute org.elasticsearch.compute.lucene.read.ValueSourceReaderTypeConversionTests testLoadAll elastic#143471 [DOCS] Fix ES|QL function and commands lists versioning metadata (elastic#143402) Fix MMROperatorTests (elastic#143453) Fix CSV-escaped quotes in generated docs examples (elastic#143449) Fix SQL client parsing of array header values (elastic#143408) ESQL: Add extended distribution tests and fault injection for external sources (elastic#143420) ESQL: Fix datasource test failures on Windows and FIPS (elastic#143417) Add circuit breaker for query construction to prevent OOM from automaton-based queries (elastic#142150) Cleanup SpecIT logging configuration (elastic#143365) ESQL: Prune unused regex extract nodes in optimizer (elastic#140982) Ensure supported locale outside of Entitlements check (elastic#143405) feat(es|ql): add dense_vector support in coalesce (elastic#142974) [Test] Unmute SnapshotStressTestsIT (elastic#143359) Mute org.elasticsearch.xpack.esql.CsvIT test {csv-spec:lookup-join.LookupJoinWithCoalesceFilterOnRight} elastic#143443 Mute org.elasticsearch.xpack.esql.CsvIT test {csv-spec:lookup-join.MvJoinKeyOnTheLookupIndex} elastic#143442 ESQL: Fix CCS exchange sink cleanup (elastic#143325) Mute org.elasticsearch.xpack.esql.CsvIT test {csv-spec:lookup-join.MvJoinKeyOnTheLookupIndexAfterStats} elastic#143434 Mute org.elasticsearch.xpack.esql.CsvIT test {csv-spec:lookup-join.MvJoinKeyFromRow} elastic#143432 Mute org.elasticsearch.xpack.esql.qa.mixed.MixedClusterEsqlSpecIT test {csv-spec:k8s-timeseries.Datenanos_derivative_compared_to_rate} elastic#143431 Mute org.elasticsearch.multiproject.test.CoreWithMultipleProjectsClientYamlTestSuiteIT test {yaml=search.retrievers/result-diversification/10_mmr_result_diversification_retriever/Test MMR result diversification single index float type} elastic#143430 ...
…locations * upstream/main: (51 commits) ESQL: Remaining serialization tests (elastic#143470) Eagerly release resources in `TransportAwaitClusterStateVersionAppliedAction` (elastic#143477) Stop and relocate sliced reindex on shutdown (elastic#143183) Documentation for query_vector base64 parameter (elastic#142675) ES|QL: Fix LIMIT after all columns are dropped (elastic#143463) Update docs-build.yml (elastic#142958) Fix KnnIndexTester to work with byte vectors (elastic#143493) Fix IndexInputUtils.withSlice to produce native-safe MemorySegments on Java 21 (elastic#143479) CPS fix: include only relevant projects in the search response metadata (elastic#143367) apm-data: explicit map of timestamp.us to long (elastic#143173) [Inference API] Add custom headers for Azure OpenAI Service (elastic#142969) ESQL: Add name IDs to golden tests and fix synthetic names (elastic#143450) Add getUnavailableShards to BaseBroadcastResponse (elastic#143406) Add description to reindex API without sensitive info (elastic#143112) SQL: fix CLI tests (elastic#143451) ES|QL: Add note of future removal of FORK implicit LIMIT (elastic#143457) [Test] Randomly disable doc values skippers in time-series indices (elastic#143389) Improve pattern text downgrade license test (elastic#143102) [Transform] Stop transforms at the end of tests (elastic#139783) Mute org.elasticsearch.compute.lucene.read.ValueSourceReaderTypeConversionTests testLoadAll elastic#143471 ...
Closes https://github.com/elastic/elasticsearch-team/issues/2292