bug fix for Restore with Index Sort (#20073)#20284
bug fix for Restore with Index Sort (#20073)#20284andrross merged 5 commits intoopensearch-project:mainfrom
Conversation
WalkthroughPruning APIs were updated to accept an explicit boolean to enable a Lucene parent field. Changes
Sequence DiagramsequenceDiagram
participant Caller
participant Engine
participant Store
participant Lucene
Caller->>Engine: request cleanUpUnreferencedFiles(...)
Engine->>Store: shouldSetParentField()
Store-->>Engine: isParentFieldEnabled
Engine->>Engine: build IndexWriterConfig (iwc)
alt isParentFieldEnabled == true
Engine->>Engine: iwc.setParentField(PARENT_FIELD)
end
Engine->>Lucene: pruneUnreferencedFiles(segmentsFile, dir, isParentFieldEnabled)
Lucene->>Lucene: create IndexWriter with provided iwc
Lucene->>Lucene: remove unreferenced files / write segments
Lucene-->>Engine: return SegmentInfos
Engine->>Engine: close IndexWriter / finalize cleanup
Engine-->>Caller: cleanup complete (or logged error)
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~30 minutes
Suggested labels
Suggested reviewers
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: defaults Review profile: CHILL Plan: Pro 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Signed-off-by: Divyansh Sharma <vishdivs@amazon.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
server/src/main/java/org/opensearch/index/engine/Engine.java (1)
1363-1377: IndexWriterConfig incleanUpUnreferencedFilesis aligned with new parent-field behaviorUsing an explicit
IndexWriterConfigand gatingiwc.setParentFieldonstore.shouldSetParentField()keeps this cleanup path consistent with the rest of the index-writer usage while still being best-effort (exceptions only logged). The implementation looks correct.server/src/main/java/org/opensearch/index/store/Store.java (1)
231-237: Unify parent-field gating logic to use cached flagsThe gating itself looks correct (require index version ≥ 3.2.0 and index sort enabled), and
newIndexWriterConfigandshouldSetParentField()are logically consistent. To avoid duplicated version checks and keep behavior from drifting, you could simplify:public boolean shouldSetParentField() { return isParentFieldEnabledVersion && isIndexSortEnabled; }and keep using the same flags in
newIndexWriterConfig.Also applies to: 256-258, 1978-1988
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
server/src/main/java/org/opensearch/common/lucene/Lucene.java(2 hunks)server/src/main/java/org/opensearch/index/engine/Engine.java(1 hunks)server/src/main/java/org/opensearch/index/store/Store.java(1 hunks)server/src/main/java/org/opensearch/repositories/blobstore/FileRestoreContext.java(1 hunks)server/src/test/java/org/opensearch/common/lucene/LuceneTests.java(3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
server/src/main/java/org/opensearch/repositories/blobstore/FileRestoreContext.java (1)
server/src/main/java/org/opensearch/common/lucene/Lucene.java (1)
Lucene(114-992)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: gradle-check
- GitHub Check: copy-issue-labels
- GitHub Check: assemble (25, ubuntu-latest)
- GitHub Check: assemble (25, windows-latest)
- GitHub Check: assemble (21, ubuntu-24.04-arm)
- GitHub Check: assemble (25, ubuntu-24.04-arm)
- GitHub Check: assemble (21, windows-latest)
- GitHub Check: assemble (21, ubuntu-latest)
- GitHub Check: Analyze (java)
🔇 Additional comments (2)
server/src/main/java/org/opensearch/repositories/blobstore/FileRestoreContext.java (1)
201-209: Propagating parent-field configuration into restore pruning looks correctWiring
store.shouldSetParentField()intoLucene.pruneUnreferencedFileskeeps the restore cleanup path consistent with the store’s index writer configuration, without changing error handling or control flow. No issues spotted here.server/src/main/java/org/opensearch/common/lucene/Lucene.java (1)
200-235: Boolean parameterization ofpruneUnreferencedFilesis consistent and backward-safeAdding
isParentFieldEnabledand only callingiwc.setParentFieldwhen true preserves the previous behavior when false, while enabling the new parent-field-aware path for callers that opt in viaStore.shouldSetParentField(). Locking and pruning semantics remain unchanged.
server/src/test/java/org/opensearch/common/lucene/LuceneTests.java
Outdated
Show resolved
Hide resolved
|
❌ Gradle check result for a9784dd: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
|
See my comment here: #20073 (comment) |
Signed-off-by: Divyansh Sharma <vishdivs@amazon.com>
There was a problem hiding this comment.
Actionable comments posted: 1
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
server/src/test/java/org/opensearch/common/lucene/LuceneTests.java(4 hunks)server/src/test/java/org/opensearch/index/engine/InternalEngineTests.java(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
server/src/test/java/org/opensearch/common/lucene/LuceneTests.java (1)
server/src/main/java/org/opensearch/common/lucene/Lucene.java (1)
Lucene(114-992)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
- GitHub Check: precommit (21, ubuntu-latest)
- GitHub Check: precommit (25, ubuntu-24.04-arm)
- GitHub Check: precommit (25, macos-15-intel)
- GitHub Check: precommit (25, macos-15)
- GitHub Check: precommit (21, windows-2025, true)
- GitHub Check: precommit (25, ubuntu-latest)
- GitHub Check: precommit (25, windows-latest)
- GitHub Check: precommit (21, macos-15-intel)
- GitHub Check: precommit (21, macos-15)
- GitHub Check: precommit (21, ubuntu-24.04-arm)
- GitHub Check: precommit (21, windows-latest)
- GitHub Check: assemble (25, ubuntu-latest)
- GitHub Check: assemble (25, windows-latest)
- GitHub Check: detect-breaking-change
- GitHub Check: assemble (21, ubuntu-24.04-arm)
- GitHub Check: assemble (21, ubuntu-latest)
- GitHub Check: assemble (25, ubuntu-24.04-arm)
- GitHub Check: assemble (21, windows-latest)
- GitHub Check: Analyze (java)
- GitHub Check: check-files
🔇 Additional comments (5)
server/src/test/java/org/opensearch/common/lucene/LuceneTests.java (5)
63-63: LGTM!The
Sortimport is required for configuring index sort in the test methods.
160-169: LGTM!The test correctly sets up an index with parent field and index sort enabled, then exercises the pruning path with
isParentFieldEnabled=true. This properly validates the index-sort scenario that was affected by the bug in PR #18536.
171-178: LGTM! Previous review concern addressed.This test correctly exercises the
isParentFieldEnabled=falsepath by omitting both parent field and index sort configuration. This resolves the concern raised in the previous review that both test variants were exercising the same code path.
180-180: LGTM!The helper method signature correctly accepts the
isParentFieldEnabledparameter and properly threads it through to theLucene.pruneUnreferencedFilescall. This enables both test variants to exercise their respective code paths.Also applies to: 211-211
232-263: LGTM! Good defensive test coverage.This test verifies that attempting to prune an index created with parent field enabled while passing
isParentFieldEnabled=falsecorrectly throws an exception. This validates proper error handling for configuration mismatches during restore operations.Minor observation: The test uses
TextField.Store.YESconsistently, while other test methods userandom().nextBoolean() ? Field.Store.YES : Field.Store.NO. This difference is acceptable since the focus here is on testing the exception path rather than general pruning behavior.
server/src/test/java/org/opensearch/index/engine/InternalEngineTests.java
Show resolved
Hide resolved
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #20284 +/- ##
=========================================
Coverage 73.20% 73.21%
- Complexity 71745 71749 +4
=========================================
Files 5795 5795
Lines 328304 328302 -2
Branches 47283 47281 -2
=========================================
+ Hits 240334 240360 +26
- Misses 68663 68687 +24
+ Partials 19307 19255 -52 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
server/src/test/java/org/opensearch/index/engine/InternalEngineTests.java
Outdated
Show resolved
Hide resolved
|
❌ Gradle check result for a4fe783: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (3)
server/src/test/java/org/opensearch/repositories/fs/FsRepositoryTests.java (3)
224-351: Consider refactoring to reduce code duplication.The test
testSnapshotAndRestoreWithIndexSort()is ~85% duplicate oftestSnapshotAndRestore()(lines 99-222), with only these differences:
- Parent field helpers are used (
indexDocsWithParentField,deleteRandomDocWithParentField)- Index sort field configuration is added
Consider extracting the common snapshot/restore logic into a parameterized helper method to improve maintainability and reduce the ~128 lines of duplication.
💡 Refactoring approach
Extract a shared helper method that accepts parameters for whether to use parent field and index sort configuration:
private void testSnapshotAndRestoreInternal( boolean withParentField, Settings additionalIndexSettings ) throws IOException, InterruptedException { // Common snapshot/restore logic // Use withParentField to choose between indexDocs/indexDocsWithParentField // Merge additionalIndexSettings into index settings } public void testSnapshotAndRestore() throws IOException, InterruptedException { testSnapshotAndRestoreInternal(false, Settings.EMPTY); } public void testSnapshotAndRestoreWithIndexSort() throws IOException, InterruptedException { Settings indexSortSettings = Settings.builder() .put("index.sort.field", "foo1") .build(); testSnapshotAndRestoreInternal(true, indexSortSettings); }
411-431: LGTM with optional refactoring suggestion.The method correctly configures the parent field (line 423) for deleting documents in index-sort scenarios. However, this is ~95% duplicate of
deleteRandomDoc()(lines 390-409).💡 Optional refactoring to reduce duplication
Consider consolidating both methods:
private void deleteRandomDoc(Directory directory) throws IOException { deleteRandomDocInternal(directory, null); } private void deleteRandomDocWithParentField(Directory directory) throws IOException { deleteRandomDocInternal(directory, Lucene.PARENT_FIELD); } private void deleteRandomDocInternal(Directory directory, String parentField) throws IOException { IndexWriterConfig config = newIndexWriterConfig(random(), new MockAnalyzer(random())) .setCodec(TestUtil.getDefaultCodec()) .setMergePolicy(new FilterMergePolicy(NoMergePolicy.INSTANCE) { @Override public boolean keepFullyDeletedSegment(IOSupplier<CodecReader> readerIOSupplier) { return true; } }); if (parentField != null) { config.setParentField(parentField); } try (IndexWriter writer = new IndexWriter(directory, config)) { final int numDocs = writer.getDocStats().numDocs; writer.deleteDocuments(new Term("id", "" + randomIntBetween(0, numDocs - 1))); writer.commit(); assertEquals(writer.getDocStats().numDocs, numDocs - 1); } }
459-484: LGTM with optional refactoring suggestion.The method correctly configures the parent field (line 464) for indexing documents in index-sort scenarios. The use of "foo1" as the field name (line 473) aligns with the sort field configured in the test.
💡 Optional refactoring to reduce duplication
Similar to
deleteRandomDocWithParentField, consider consolidating withindexDocs():private int indexDocs(Directory directory) throws IOException { return indexDocsInternal(directory, null, "body"); } private int indexDocsWithParentField(Directory directory) throws IOException { return indexDocsInternal(directory, Lucene.PARENT_FIELD, "foo1"); } private int indexDocsInternal(Directory directory, String parentField, String textFieldName) throws IOException { IndexWriterConfig config = newIndexWriterConfig(random(), new MockAnalyzer(random())) .setCodec(TestUtil.getDefaultCodec()); if (parentField != null) { config.setParentField(parentField); } try (IndexWriter writer = new IndexWriter(directory, config)) { int docs = 1 + random().nextInt(100); for (int i = 0; i < docs; i++) { Document doc = new Document(); doc.add(new StringField("id", "" + i, random().nextBoolean() ? Field.Store.YES : Field.Store.NO)); doc.add( new TextField( textFieldName, TestUtil.randomRealisticUnicodeString(random()), random().nextBoolean() ? Field.Store.YES : Field.Store.NO ) ); doc.add(new SortedDocValuesField("dv", new BytesRef(TestUtil.randomRealisticUnicodeString(random())))); writer.addDocument(doc); } writer.commit(); return docs; } }
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
CHANGELOG.md(1 hunks)server/src/test/java/org/opensearch/index/engine/InternalEngineTests.java(1 hunks)server/src/test/java/org/opensearch/repositories/fs/FsRepositoryTests.java(3 hunks)
✅ Files skipped from review due to trivial changes (1)
- CHANGELOG.md
🚧 Files skipped from review as they are similar to previous changes (1)
- server/src/test/java/org/opensearch/index/engine/InternalEngineTests.java
🧰 Additional context used
🧬 Code graph analysis (1)
server/src/test/java/org/opensearch/repositories/fs/FsRepositoryTests.java (1)
server/src/main/java/org/opensearch/common/lucene/Lucene.java (1)
Lucene(114-992)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
- GitHub Check: gradle-check
- GitHub Check: precommit (21, windows-2025, true)
- GitHub Check: precommit (25, macos-15-intel)
- GitHub Check: precommit (21, macos-15-intel)
- GitHub Check: precommit (25, ubuntu-latest)
- GitHub Check: precommit (25, macos-15)
- GitHub Check: precommit (21, ubuntu-24.04-arm)
- GitHub Check: precommit (25, windows-latest)
- GitHub Check: precommit (21, ubuntu-latest)
- GitHub Check: precommit (25, ubuntu-24.04-arm)
- GitHub Check: precommit (21, macos-15)
- GitHub Check: precommit (21, windows-latest)
- GitHub Check: Analyze (java)
- GitHub Check: assemble (21, ubuntu-latest)
- GitHub Check: assemble (25, ubuntu-latest)
- GitHub Check: assemble (21, windows-latest)
- GitHub Check: assemble (21, ubuntu-24.04-arm)
- GitHub Check: assemble (25, windows-latest)
- GitHub Check: assemble (25, ubuntu-24.04-arm)
- GitHub Check: detect-breaking-change
server/src/test/java/org/opensearch/repositories/fs/FsRepositoryTests.java
Outdated
Show resolved
Hide resolved
|
❌ Gradle check result for 27ee583: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Divyansh Sharma <vishdivs@amazon.com>
|
❌ Gradle check result for 51bb712: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Andrew Ross <andrross@amazon.com>
…earch-project#20284) Signed-off-by: Divyansh Sharma <vishdivs@amazon.com> Signed-off-by: Andrew Ross <andrross@amazon.com> Co-authored-by: Divyansh Sharma <vishdivs@amazon.com> Co-authored-by: Andrew Ross <andrross@amazon.com>
…earch-project#20284) Signed-off-by: Divyansh Sharma <vishdivs@amazon.com> Signed-off-by: Andrew Ross <andrross@amazon.com> Co-authored-by: Divyansh Sharma <vishdivs@amazon.com> Co-authored-by: Andrew Ross <andrross@amazon.com>
Description
This change is to fix the bug introduced by #18536, which breaks the restore flow for Index Sort enabled indexes, as it by defaults add the Parent Fields in the index fields.
Related Issues
Resolves #[Issue number to be closed when this PR is merged]
#18536
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.
Summary by CodeRabbit
New Features
Bug Fixes
Tests
Documentation
✏️ Tip: You can customize this high-level summary in your review settings.