
Stop and relocate sliced reindex on shutdown #143183

Merged
szybia merged 11 commits into elastic:main from szybia:sliced-reindex-relocations
Mar 3, 2026

@szybia (Contributor) commented Feb 26, 2026

  • Relocate sliced reindex tasks on shutdown (currently behind a feature flag)
  • Add an integration test (IT) and refactor ReindexRelocationIT to avoid duplicated code

Closes https://github.com/elastic/elasticsearch-team/issues/2292

Copilot AI (Contributor) left a comment

Pull request overview

This PR extends reindex relocation-on-shutdown support to sliced reindex by having slice workers stop with per-slice resume info, then letting the leader relocate/resume the whole task.

Changes:

  • Add leader-state logic to aggregate per-slice ResumeInfo and trigger relocation when any slice is resumable.
  • Propagate relocation eligibility to sliced ReindexRequest instances and wire TaskManager into Reindexer to let slice workers consult the leader’s relocation node choice.
  • Expand/adjust unit + integration tests to cover local sliced/unsliced and remote unsliced relocation flows.
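
The aggregation rule in the first bullet can be sketched roughly as follows. This is a minimal illustration only, not the code from this PR: `SlicedRelocationSketch`, `SliceResult`, `LeaderResumeInfo` and `aggregate` are hypothetical names, and the resume info is modelled as an opaque string rather than the real ResumeInfo type.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** Illustrative only: none of these types are the real Elasticsearch classes. */
final class SlicedRelocationSketch {

    /** Hypothetical stand-in for what one slice worker reports when it stops. */
    record SliceResult(int sliceId, Optional<String> resumeInfo) {}

    /** Hypothetical leader-level carrier listing the slices that can be resumed. */
    record LeaderResumeInfo(List<SliceResult> resumableSlices) {}

    /**
     * Collects the slices that stopped with resume info. Returns a leader-level
     * carrier if at least one slice is resumable, otherwise an empty Optional
     * (every slice finished normally, so there is nothing to relocate).
     */
    static Optional<LeaderResumeInfo> aggregate(List<SliceResult> results) {
        List<SliceResult> resumable = new ArrayList<>();
        for (SliceResult result : results) {
            if (result.resumeInfo().isPresent()) {
                resumable.add(result);
            }
        }
        if (resumable.isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(new LeaderResumeInfo(List.copyOf(resumable)));
    }
}
```

In this sketch a caller would relocate the whole task whenever `aggregate(...)` returns a non-empty value. How the real leader state handles slices that fail rather than stop is not covered here.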

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.

Summary per file

| File | Description |
| --- | --- |
| server/src/test/java/org/elasticsearch/index/reindex/LeaderBulkByScrollTaskStateTests.java | Adds unit tests validating aggregated per-slice resume info behavior for leader relocation. |
| server/src/main/java/org/elasticsearch/index/reindex/ReindexRequest.java | Ensures relocation eligibility is preserved when creating per-slice requests. |
| server/src/main/java/org/elasticsearch/index/reindex/LeaderBulkByScrollTaskState.java | Adds relocation supplier plumbing + aggregation of slice resume info into a leader-level carrier response. |
| modules/reindex/src/test/java/org/elasticsearch/reindex/ReindexerTests.java | Updates test wiring for new TaskManager dependency. |
| modules/reindex/src/main/java/org/elasticsearch/reindex/TransportReindexAction.java | Passes TaskManager into Reindexer. |
| modules/reindex/src/main/java/org/elasticsearch/reindex/Reindexer.java | Implements leader/worker relocation supplier setup for sliced tasks; leader relocation handling now supported. |
| modules/reindex/src/main/java/org/elasticsearch/reindex/AbstractAsyncBulkByScrollAction.java | Allows relocation stop/resume-info emission for sliced workers and ensures response status is accurate for leader aggregation. |
| modules/reindex-management/src/internalClusterTest/java/org/elasticsearch/reindex/management/ReindexRelocationIT.java | Refactors/extends IT coverage to include sliced local relocation and clarifies expectations for leader slice status reporting. |
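
The ReindexRequest change summarized above (keeping relocation eligibility when per-slice requests are created) could look something like the sketch below. It is a simplified illustration under assumptions: `SliceRequestSketch`, `ParentRequest`, `SliceRequest` and `forSlices` are hypothetical names, not the real ReindexRequest API.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: a heavily simplified model, not the real ReindexRequest. */
final class SliceRequestSketch {

    /** Hypothetical parent request carrying a relocation-eligibility flag. */
    record ParentRequest(String description, boolean eligibleForRelocation) {}

    /** Hypothetical per-slice request derived from the parent. */
    record SliceRequest(String description, int sliceId, int totalSlices, boolean eligibleForRelocation) {}

    /** Builds one request per slice, copying the parent's eligibility flag onto each. */
    static List<SliceRequest> forSlices(ParentRequest parent, int totalSlices) {
        if (totalSlices < 1) {
            throw new IllegalArgumentException("totalSlices must be at least 1");
        }
        List<SliceRequest> slices = new ArrayList<>(totalSlices);
        for (int sliceId = 0; sliceId < totalSlices; sliceId++) {
            slices.add(new SliceRequest(parent.description(), sliceId, totalSlices, parent.eligibleForRelocation()));
        }
        return slices;
    }
}
```

The point of the sketch is only that the flag is copied explicitly. If it were left at a default value on the derived requests, slice workers would not know they are allowed to stop with resume info.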


…cations

* upstream/main: (35 commits)
  Create ARM bulk sqrI8 implementation (elastic#142461)
  Rework get-snapshots predicates (elastic#143161)
  Refactor downsampling fetchers and producers (elastic#140357)
  ESQL: Unmute test and add extra logging to generative test validation (elastic#143168)
  Fix metadata fields being nullified/loaded by unmapped_fields setting (elastic#143155)
  Determine remote cluster version (elastic#142494)
  Populate failure message for aborted clones (elastic#143206)
  Allow kibana_system role to read and manage logs streams (elastic#143053)
  Mute org.elasticsearch.xpack.esql.CsvIT test {csv-spec:eval.DocsLength} elastic#143224
  Mute org.elasticsearch.xpack.esql.CsvIT test {csv-spec:eval.DocsByteLength} elastic#143223
  Mute org.elasticsearch.xpack.esql.CsvIT test {csv-spec:docs.DocsBitLength} elastic#143222
  Fix FloatVectorScorerSupplier bulkScore bug (elastic#143211)
  ESQL: Add data node execution for external sources (elastic#143209)
  [ESQL] Cleanup commands docs (elastic#143058)
  [ML]Fix latest transforms disregarding updates when sort and sync fields are non-monotonic (elastic#142856)
  Mute org.elasticsearch.index.mapper.IpFieldMapperTests testSyntheticSourceInObject elastic#143212
  Tests: Fix StoreDirectoryMetricsIT (elastic#143084)
  ESQL: Add distribution strategy for external sources (elastic#143194)
  CSV IT spec (elastic#142585)
  Fix VectorScorerOSQBenchmark.score to read corrections properly (elastic#143137)
  ...
…cations

* upstream/main:
  Warn on API key version mismatch (elastic#143127)
  Fixed wrong malformed value ordering in synthetic source tests (elastic#143187)
  [ML] Fix: required_native_memory_bytes Calculated with Wrong Allocation Count (elastic#143077)
  Add configureBenchmarkLogging calls across the various benchmarks (elastic#143185)
  Mute org.elasticsearch.xpack.esql.CsvIT test {csv-spec:k8s-timeseries-avg-over-time.Avg_over_time_aggregate_metric_double_implicit_casting} elastic#143292
  Give system role permission to invoke shard refresh (elastic#143190)
  Mute testSyntheticSourceWithTranslogSnapshot (elastic#143260)
  Adds ResumeInfo Tests (elastic#142769)
  Use a static method to configure benchmark logging (elastic#143056)
  add connectors release notes (elastic#142884)
  Add CI triage guidance for AI agents (elastic#142994)
  ESQL: Data sources: ZSTD, BZIP2 (elastic#143228)
  [ES|QL] Channels issue when an agg is called with the same field (elastic#142180) (elastic#142269)
  Add support for project routing in reindex requests (elastic#142240)
…cations

* upstream/main: (60 commits)
  Use batches for other bulk vector benchmarks (elastic#143167)
  Mute org.elasticsearch.xpack.esql.qa.mixed.MixedClusterEsqlSpecIT test {csv-spec:lookup-join.MvJoinKeyOnTheLookupIndexAfterStats} elastic#143388
  Mute org.elasticsearch.snapshots.ConcurrentSnapshotsIT testBackToBackQueuedDeletes elastic#143387
  [Inference API] Parse endpoint metadata from persisted endpoints (elastic#143081)
  Add cluster formation doc to DistributedArchitectureGuide (elastic#143318)
  Fix flattened root block loader null expectation (elastic#143238)
  Unmute ValueSourceReaderTypeConversionTests testLoadAll (elastic#143189)
  ESQL: Add split coalescing for many small files (elastic#143335)
  Unmute mixed-cluster spatial parse warning test (elastic#143186)
  Fix zero-size estimate in BytesRefBlock null test (elastic#143258)
  Make DataType and DataFormat top-level enums (elastic#143312)
  Add support for steps to change the target index name for later steps (elastic#142955)
  Set mayContainDuplicates flag to test deduplication (elastic#143375)
  ESQL: Fix Driver search load millis as nanos bug (elastic#143267)
  Mute org.elasticsearch.xpack.esql.qa.mixed.MixedClusterEsqlSpecIT test {csv-spec:lookup-join.LookupJoinWithMixPushableAndUnpushableFilters} elastic#143378
  ESQL: Forbid MV_EXPAND before full text functions (elastic#143249)
  ESQL: Fix unresolved name pattern (elastic#143210)
  Implement boxplot queryDSL aggregation for exponential_histograms (elastic#143026)
  Add prefetching to x64 bulk vector implementations (elastic#142387)
  Make large segment vector tests resilient to memory constraints (elastic#143366)
  ...
@szybia added the >non-issue and :Distributed/Reindex labels Mar 2, 2026
@szybia marked this pull request as ready for review March 2, 2026 19:17
@elasticsearchmachine (Collaborator) commented

Pinging @elastic/es-distributed (Team:Distributed)

@elasticsearchmachine added the Team:Distributed label Mar 2, 2026
@szybia requested a review from Copilot March 2, 2026 19:18
Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.



@samxbr (Contributor) left a comment

Mostly looks good, I just have some small-ish comments.

szybia added 4 commits March 3, 2026 13:32
…cations

* upstream/main: (56 commits)
  Mute org.elasticsearch.compute.lucene.read.ValueSourceReaderTypeConversionTests testLoadAll elastic#143471
  [DOCS] Fix ES|QL function and commands lists versioning metadata (elastic#143402)
  Fix MMROperatorTests (elastic#143453)
  Fix CSV-escaped quotes in generated docs examples (elastic#143449)
  Fix SQL client parsing of array header values (elastic#143408)
  ESQL: Add extended distribution tests and fault injection for external sources (elastic#143420)
  ESQL: Fix datasource test failures on Windows and FIPS (elastic#143417)
  Add circuit breaker for query construction to prevent OOM from automaton-based queries (elastic#142150)
  Cleanup SpecIT logging configuration (elastic#143365)
  ESQL: Prune unused regex extract nodes in optimizer (elastic#140982)
  Ensure supported locale outside of Entitlements check (elastic#143405)
  feat(es|ql): add dense_vector support in coalesce (elastic#142974)
  [Test] Unmute SnapshotStressTestsIT (elastic#143359)
  Mute org.elasticsearch.xpack.esql.CsvIT test {csv-spec:lookup-join.LookupJoinWithCoalesceFilterOnRight} elastic#143443
  Mute org.elasticsearch.xpack.esql.CsvIT test {csv-spec:lookup-join.MvJoinKeyOnTheLookupIndex} elastic#143442
  ESQL: Fix CCS exchange sink cleanup (elastic#143325)
  Mute org.elasticsearch.xpack.esql.CsvIT test {csv-spec:lookup-join.MvJoinKeyOnTheLookupIndexAfterStats} elastic#143434
  Mute org.elasticsearch.xpack.esql.CsvIT test {csv-spec:lookup-join.MvJoinKeyFromRow} elastic#143432
  Mute org.elasticsearch.xpack.esql.qa.mixed.MixedClusterEsqlSpecIT test {csv-spec:k8s-timeseries.Datenanos_derivative_compared_to_rate} elastic#143431
  Mute org.elasticsearch.multiproject.test.CoreWithMultipleProjectsClientYamlTestSuiteIT test {yaml=search.retrievers/result-diversification/10_mmr_result_diversification_retriever/Test MMR result diversification single index float type} elastic#143430
  ...
@szybia requested a review from samxbr March 3, 2026 14:16
@samxbr (Contributor) left a comment

LGTM

@szybia merged commit e4e76ef into elastic:main Mar 3, 2026
35 checks passed
@szybia deleted the sliced-reindex-relocations branch March 3, 2026 18:13
GalLalouche pushed a commit to GalLalouche/elasticsearch that referenced this pull request Mar 3, 2026
szybia added a commit to szybia/elasticsearch that referenced this pull request Mar 3, 2026
…locations

* upstream/main: (51 commits)
  ESQL: Remaining serialization tests (elastic#143470)
  Eagerly release resources in `TransportAwaitClusterStateVersionAppliedAction` (elastic#143477)
  Stop and relocate sliced reindex on shutdown (elastic#143183)
  Documentation for query_vector base64 parameter (elastic#142675)
  ES|QL: Fix LIMIT after all columns are dropped (elastic#143463)
  Update docs-build.yml (elastic#142958)
  Fix KnnIndexTester to work with byte vectors (elastic#143493)
  Fix IndexInputUtils.withSlice to produce native-safe MemorySegments on Java 21 (elastic#143479)
  CPS fix: include only relevant projects in the search response metadata (elastic#143367)
  apm-data: explicit map of timestamp.us to long (elastic#143173)
  [Inference API] Add custom headers for Azure OpenAI Service (elastic#142969)
  ESQL: Add name IDs to golden tests and fix synthetic names (elastic#143450)
  Add getUnavailableShards to BaseBroadcastResponse (elastic#143406)
  Add description to reindex API without sensitive info (elastic#143112)
  SQL: fix CLI tests (elastic#143451)
  ES|QL: Add note of future removal of FORK implicit LIMIT (elastic#143457)
  [Test] Randomly disable doc values skippers in time-series indices (elastic#143389)
  Improve pattern text downgrade license test (elastic#143102)
  [Transform] Stop transforms at the end of tests (elastic#139783)
  Mute org.elasticsearch.compute.lucene.read.ValueSourceReaderTypeConversionTests testLoadAll elastic#143471
  ...
shmuelhanoch pushed a commit to shmuelhanoch/elasticsearch that referenced this pull request Mar 4, 2026

Labels

  • :Distributed/Reindex (Issues relating to reindex that are not caused by issues further down)
  • >non-issue
  • Team:Distributed (Meta label for distributed team.)
  • v9.4.0


4 participants