
Add adaptive merge policy to reduce benchmark variance#19352

Closed
rgsriram wants to merge 5 commits into opensearch-project:main from rgsriram:issue-11163

Conversation


@rgsriram rgsriram commented Sep 21, 2025

Description

Introduced an adaptive merge policy system in OpenSearch that dynamically tunes Lucene segment merge settings based on shard size. The solution replaces static defaults with context-aware configurations, improving benchmark consistency and search performance across varying index sizes. Key additions include an AdaptiveTieredMergePolicyProvider for shard-specific optimisations, a SegmentTopologyAnalyzer for segment distribution analysis, and a REST API for real-time monitoring. This change delivers more predictable performance and optimised search behaviour for users.

Related Issues

Resolves #11163

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Summary by CodeRabbit

  • New Features

    • Adaptive merge policy that dynamically tailors merge settings by shard size and provides automated recommendations.
    • Segment topology analyzer and calculator to compute segment metrics and produce merge recommendations.
  • Documentation

    • Changelog entry added and package-level admin REST documentation included.
  • Tests

    • New unit tests, test utilities, and benchmarks validating topology analysis and adaptive policy recommendations.


@github-actions

❌ Gradle check result for 2b398ff: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@github-actions

❌ Gradle check result for abe1e49: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@github-actions

❌ Gradle check result for 5076ee5: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (3)
server/src/main/java/org/opensearch/index/AdaptiveTieredMergePolicyProvider.java (2)

26-29: Documentation describes implementation details not present in this class.

The javadoc mentions "smooth interpolation (log/linear across size decades)" (lines 27-29), but this logic doesn't exist in AdaptiveTieredMergePolicyProvider. The class simply initializes a wrapped AdaptiveOpenSearchTieredMergePolicy with static defaults. If the interpolation logic resides in the wrapped policy, update the documentation to clarify that this provider delegates adaptive behavior to AdaptiveOpenSearchTieredMergePolicy, rather than implying this class performs the computation.

📝 Suggested documentation clarification
  * Implementation notes:
- * - Uses smooth interpolation (log/linear across size decades) instead of hard
- * categories
- * to avoid abrupt parameter jumps as shards grow.
+ * - Delegates to AdaptiveOpenSearchTieredMergePolicy which uses smooth 
+ * interpolation (log/linear across size decades) to avoid abrupt parameter 
+ * jumps as shards grow.
  * - Caps the max merged segment size at 5GB to align with Lucene defaults.

59-77: Consider acquiring settingsLock when mutating policy parameters.

applyDefaultSettings() modifies tieredMergePolicy state (lines 65–76) without acquiring settingsLock. While this is currently safe because the method is only called from the constructor (before the object is published), future refactorings or concurrent access patterns could introduce races. For consistency and defensive coding, consider wrapping the setter calls in a write lock:

🔒 Proposed lock usage
 private void applyDefaultSettings() {
+    settingsLock.writeLock().lock();
+    try {
         // Fallback to the original default settings, ensuring parity with
         // TieredMergePolicyProvider
         // We use explicit values here, but they match the constants in
         // TieredMergePolicyProvider:
         // DEFAULT_MAX_MERGED_SEGMENT = 5GB
         tieredMergePolicy.setMaxMergedSegmentMB(5 * 1024);
         // DEFAULT_FLOOR_SEGMENT = 16MB
         tieredMergePolicy.setFloorSegmentMB(16);
         // DEFAULT_SEGMENTS_PER_TIER = 10.0
         tieredMergePolicy.setSegmentsPerTier(10.0);
         // DEFAULT_MAX_MERGE_AT_ONCE = 30
         tieredMergePolicy.setMaxMergeAtOnce(30);
         // DEFAULT_EXPUNGE_DELETES_ALLOWED = 10.0
         tieredMergePolicy.setForceMergeDeletesPctAllowed(10.0);
         // DEFAULT_DELETES_PCT_ALLOWED = 20.0
         tieredMergePolicy.setDeletesPctAllowed(20.0);
         tieredMergePolicy.setNoCFSRatio(TieredMergePolicy.DEFAULT_NO_CFS_RATIO);
+    } finally {
+        settingsLock.writeLock().unlock();
+    }
 }
server/src/test/java/org/opensearch/index/SegmentTopologyBenchmarkTests.java (1)

246-262: Consider tracking segments by identity rather than size value.

Line 253 uses trackedSizes.remove(Long.valueOf(info.sizeInBytes())), which removes by object equality. If multiple segments have identical sizes, this will only remove the first matching value, potentially causing tracking drift. While collisions may be rare due to the random noise added at line 221, using a different data structure (e.g., parallel tracking with indices or a Map) would be more robust.

🔎 Alternative approach using indexed tracking

Replace trackedSizes List with a Map that tracks segment name to size:

-    List<Long> trackedSegmentSizes = new ArrayList<>();
+    Map<String, Long> trackedSegmentSizes = new HashMap<>();
     // ...
     for (int i = 0; i < 5; i++) {
         long size = 50L * 1024 * 1024;
         String name = "_init_" + i;
         segmentInfos.add(createSegmentCommitInfo(name, size));
-        trackedSegmentSizes.add(size);
+        trackedSegmentSizes.put(name, size);
     }
     // ...
     // When adding new segment:
-    trackedSegmentSizes.add(newSegmentSize);
+    trackedSegmentSizes.put(newSegmentName, newSegmentSize);
     
     // In applyMerges:
     for (SegmentCommitInfo info : merge.segments) {
         newSize += info.sizeInBytes();
         infos.remove(info);
-        trackedSizes.remove(Long.valueOf(info.sizeInBytes()));
+        trackedSizes.remove(info.info.name);
     }
-    trackedSizes.add(newSize);
+    trackedSizes.put(mergedName, newSize);
     
     // When calculating metrics:
-    SegmentTopologyMetrics metrics = calculateTopologyMetrics(trackedSegmentSizes);
+    SegmentTopologyMetrics metrics = calculateTopologyMetrics(new ArrayList<>(trackedSegmentSizes.values()));
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a38b165 and e295923.

📒 Files selected for processing (2)
  • server/src/main/java/org/opensearch/index/AdaptiveTieredMergePolicyProvider.java
  • server/src/test/java/org/opensearch/index/SegmentTopologyBenchmarkTests.java
🧰 Additional context used
🧬 Code graph analysis (1)
server/src/test/java/org/opensearch/index/SegmentTopologyBenchmarkTests.java (1)
server/src/test/java/org/opensearch/index/analysis/SegmentTopologyTestUtils.java (1)
  • SegmentTopologyTestUtils (19-53)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
  • GitHub Check: gradle-check
  • GitHub Check: detect-breaking-change
  • GitHub Check: assemble (25, ubuntu-24.04-arm)
  • GitHub Check: assemble (21, ubuntu-24.04-arm)
  • GitHub Check: assemble (21, ubuntu-latest)
  • GitHub Check: assemble (25, windows-latest)
  • GitHub Check: assemble (25, ubuntu-latest)
  • GitHub Check: assemble (21, windows-latest)
  • GitHub Check: Analyze (java)
  • GitHub Check: precommit (25, macos-15-intel)
  • GitHub Check: precommit (21, macos-15)
  • GitHub Check: precommit (25, ubuntu-latest)
  • GitHub Check: precommit (21, macos-15-intel)
  • GitHub Check: precommit (25, ubuntu-24.04-arm)
  • GitHub Check: precommit (21, ubuntu-latest)
  • GitHub Check: precommit (21, windows-2025, true)
  • GitHub Check: precommit (25, macos-15)
  • GitHub Check: precommit (21, windows-latest)
  • GitHub Check: precommit (21, ubuntu-24.04-arm)
  • GitHub Check: precommit (25, windows-latest)
🔇 Additional comments (17)
server/src/main/java/org/opensearch/index/AdaptiveTieredMergePolicyProvider.java (3)

34-39: LGTM!

Field declarations are correct. All fields are properly marked final, and the settingsLock provides thread-safety for concurrent merge policy modifications as intended.


41-57: LGTM!

Constructor correctly initializes the adaptive policy with thread-safe lock, reads the merge-enabled setting, logs appropriate warnings, and applies defaults.


79-82: LGTM!

The getMergePolicy() method correctly returns the adaptive policy when merges are enabled, or NoMergePolicy otherwise. The conditional logic is clear and the use of fully qualified names improves readability.

server/src/test/java/org/opensearch/index/SegmentTopologyBenchmarkTests.java (14)

1-44: LGTM! Clean test class structure.

The imports, class declaration, and javadoc are well-organized. The past issue regarding the @opensearch.api tag has been properly addressed.


50-56: LGTM! Good coverage of shard size scenarios.

The test appropriately validates recommendations across a representative range of shard sizes from 50MB to 50GB.


61-69: LGTM! Clear validation of topology quality metrics.

The test effectively demonstrates that optimal topologies produce lower variance and better balance scores compared to pathological cases.


126-136: LGTM! Thorough boundary condition testing.

The test properly validates all category boundaries, ensuring the categorization logic works correctly at the threshold values.


141-161: LGTM! Comprehensive regression protection.

The test validates all merge policy parameters with exact expected values, providing strong regression protection. The previous issue regarding max segment validation has been properly addressed.


163-179: LGTM! Solid sanity checks for recommendations.

The validation method appropriately verifies that recommendations are positive, within expected bounds, and maintain proper relationships (e.g., floor < max).


181-198: LGTM! Effective topology simulation for benchmarking.

The bad and optimal topology scenarios clearly demonstrate the impact of segment distribution on variance and balance metrics, providing good test coverage.


264-291: LGTM! Proper mock setup for segment simulation.

The method correctly creates mock segments with realistic file structure and size reporting, suitable for testing merge policy behavior.


293-303: LGTM! Clear categorization logic.

The shard size categorization is straightforward and aligns with the boundaries documented in the javadoc (lines 121-124).


305-336: LGTM! Comprehensive validation with regression protection.

The method properly validates all merge policy parameters with both sanity checks and exact value matching. The previous issue regarding missing max segment validation has been resolved.


338-352: LGTM! Numerically sound variance calculation.

The method correctly calculates variance using double precision to avoid overflow, addressing the previous review concern.


354-362: LGTM! Intuitive balance score calculation.

The balance score formula (1 - range/max) provides an intuitive metric where 1.0 represents perfect balance and lower values indicate increasing imbalance.
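To make the metric described above concrete, here is a minimal standalone sketch of a 1 − range/max balance score. The class and method names are illustrative, not the PR's actual test code:

```java
import java.util.List;

// Standalone sketch of the balance score described above: 1 - (max - min) / max.
// A value of 1.0 means all segments are the same size; values near 0 indicate
// one segment dwarfs the smallest.
public class BalanceScoreSketch {
    public static double balanceScore(List<Long> segmentSizes) {
        if (segmentSizes.isEmpty()) {
            return 1.0; // treat an empty topology as trivially balanced
        }
        long min = Long.MAX_VALUE;
        long max = Long.MIN_VALUE;
        for (long size : segmentSizes) {
            min = Math.min(min, size);
            max = Math.max(max, size);
        }
        if (max == 0) {
            return 1.0; // all-empty segments: no imbalance to measure
        }
        return 1.0 - (double) (max - min) / max;
    }

    public static void main(String[] args) {
        // Identical sizes: perfect balance.
        System.out.println(balanceScore(List.of(100L, 100L, 100L))); // 1.0
        // One segment twice the size of another.
        System.out.println(balanceScore(List.of(50L, 100L))); // 0.5
    }
}
```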


364-395: LGTM! Clean inner class definitions.

The inner classes are well-structured data holders appropriate for test metrics. The SegmentTopologyMetrics class correctly uses double for variance as addressed in previous reviews.


83-116: The adaptive policy is properly configured through segment topology at merge time.

The AdaptiveOpenSearchTieredMergePolicy is designed to dynamically compute shard size from segment metadata in the findMerges() method rather than requiring explicit configuration at construction. The test properly exercises this behavior: simulateMergePerformance() creates realistic SegmentInfos and invokes findMerges(), which triggers the adaptive logic to recalculate merge settings based on the current segment topology. No additional shard size configuration is needed or expected.

Likely an incorrect or invalid review comment.

@github-actions

github-actions bot commented Jan 1, 2026

❌ Gradle check result for e295923: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Sriram Ganesh <srignsh22@gmail.com>

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (2)
server/src/test/java/org/opensearch/index/SegmentTopologyBenchmarkTests.java (2)

221-227: Minor typo in comment.

Line 222: "Flashing" should be "Flushing".

🔎 Proposed fix
         for (int i = 0; i < iterations; i++) {
-            // 1. Simulate Flashing a new segment
+            // 1. Simulate Flushing a new segment
             long newSegmentSize = docsPerFlush * baseDocSize + randomIntBetween(0, 1024 * 1024); // Add some noise

340-354: Consider adding empty list guard for consistency.

calculateBalanceScore handles the empty list case (line 357), but calculateTopologyMetrics would throw ArithmeticException on line 342 if passed an empty list. While current callers always pass non-empty lists, adding a guard would be more defensive.

🔎 Proposed fix
     private SegmentTopologyMetrics calculateTopologyMetrics(List<Long> segmentSizes) {
+        if (segmentSizes.isEmpty()) {
+            return new SegmentTopologyMetrics(0.0, 0.0);
+        }
         long totalSize = segmentSizes.stream().mapToLong(Long::longValue).sum();
         long meanSize = totalSize / segmentSizes.size();
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e295923 and e40dce4.

📒 Files selected for processing (2)
  • server/src/main/java/org/opensearch/index/AdaptiveTieredMergePolicyProvider.java
  • server/src/test/java/org/opensearch/index/SegmentTopologyBenchmarkTests.java
🧰 Additional context used
🧬 Code graph analysis (1)
server/src/test/java/org/opensearch/index/SegmentTopologyBenchmarkTests.java (1)
server/src/test/java/org/opensearch/index/analysis/SegmentTopologyTestUtils.java (1)
  • SegmentTopologyTestUtils (19-53)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
  • GitHub Check: gradle-check
  • GitHub Check: detect-breaking-change
  • GitHub Check: assemble (21, windows-latest)
  • GitHub Check: assemble (21, ubuntu-latest)
  • GitHub Check: Analyze (java)
  • GitHub Check: assemble (25, ubuntu-latest)
  • GitHub Check: assemble (21, ubuntu-24.04-arm)
  • GitHub Check: assemble (25, windows-latest)
  • GitHub Check: assemble (25, ubuntu-24.04-arm)
  • GitHub Check: precommit (25, ubuntu-latest)
  • GitHub Check: precommit (21, macos-15)
  • GitHub Check: precommit (21, ubuntu-24.04-arm)
  • GitHub Check: precommit (21, ubuntu-latest)
  • GitHub Check: precommit (21, macos-15-intel)
  • GitHub Check: precommit (25, macos-15)
  • GitHub Check: precommit (25, windows-latest)
  • GitHub Check: precommit (21, windows-latest)
  • GitHub Check: precommit (25, macos-15-intel)
  • GitHub Check: precommit (25, ubuntu-24.04-arm)
  • GitHub Check: precommit (21, windows-2025, true)
🔇 Additional comments (3)
server/src/main/java/org/opensearch/index/AdaptiveTieredMergePolicyProvider.java (1)

1-88: LGTM! Well-structured adaptive merge policy provider.

The implementation correctly addresses the concerns from previous reviews:

  • Uses INDEX_MERGE_ENABLED constant instead of hardcoded string
  • Removed dead code (categorization methods, unused settings update method)
  • Proper thread-safety with ReentrantReadWriteLock passed to the underlying policy
  • Default values align with TieredMergePolicyProvider (5GB max segment, 16MB floor, etc.)

The lock pattern in applyDefaultSettings() with try-finally is correct, and returning the policy reference from getMergePolicy() without locking is safe since both mergesEnabled and tieredMergePolicy are final fields set once in the constructor.

server/src/test/java/org/opensearch/index/SegmentTopologyBenchmarkTests.java (2)

46-163: Well-structured benchmark tests with good regression coverage.

The test file properly addresses the previous review feedback:

  • Removed the @opensearch.api tag from the test class
  • Variance calculation now uses double to prevent overflow (lines 344-349)
  • Added expectedMaxSegment parameter to validateMergeSettings for complete regression protection (lines 307-338)

The tests provide good coverage for validating the adaptive merge policy across different shard sizes and comparing performance characteristics between default and adaptive policies.


83-85: Remove the unnecessary @SuppressForbidden annotation.

The test method does not call any forbidden APIs. The @Seed annotation is from the randomizedtesting library and is not subject to forbidden-apis restrictions. The @SuppressForbidden annotation serves no purpose here and should be removed.

    @Seed("DEADBEEF")
    public void testPerformanceComparison() throws IOException {

Likely an incorrect or invalid review comment.

@github-actions

github-actions bot commented Jan 1, 2026

❌ Gradle check result for e40dce4: TIMEOUT

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Sriram Ganesh <srignsh22@gmail.com>

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e40dce4 and da9f024.

📒 Files selected for processing (1)
  • server/src/test/java/org/opensearch/index/SegmentTopologyBenchmarkTests.java
🧰 Additional context used
🧬 Code graph analysis (1)
server/src/test/java/org/opensearch/index/SegmentTopologyBenchmarkTests.java (1)
server/src/test/java/org/opensearch/index/analysis/SegmentTopologyTestUtils.java (1)
  • SegmentTopologyTestUtils (19-53)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
  • GitHub Check: gradle-check
  • GitHub Check: assemble (21, windows-latest)
  • GitHub Check: assemble (25, ubuntu-latest)
  • GitHub Check: assemble (25, ubuntu-24.04-arm)
  • GitHub Check: assemble (25, windows-latest)
  • GitHub Check: assemble (21, ubuntu-latest)
  • GitHub Check: assemble (21, ubuntu-24.04-arm)
  • GitHub Check: precommit (25, windows-latest)
  • GitHub Check: precommit (25, macos-15)
  • GitHub Check: precommit (21, macos-15)
  • GitHub Check: precommit (21, macos-15-intel)
  • GitHub Check: precommit (21, windows-2025, true)
  • GitHub Check: precommit (25, ubuntu-latest)
  • GitHub Check: precommit (21, ubuntu-24.04-arm)
  • GitHub Check: precommit (25, ubuntu-24.04-arm)
  • GitHub Check: Analyze (java)
  • GitHub Check: precommit (21, windows-latest)
  • GitHub Check: precommit (21, ubuntu-latest)
  • GitHub Check: precommit (25, macos-15-intel)
  • GitHub Check: detect-breaking-change
🔇 Additional comments (6)
server/src/test/java/org/opensearch/index/SegmentTopologyBenchmarkTests.java (6)

1-46: LGTM!

License header, imports, and class declaration are well-structured. The @opensearch.api tag removal from the previous review has been addressed.


52-71: LGTM!

Test coverage for recommendations validation and topology variance impact is well-structured with clear assertions.


131-166: LGTM!

The categorization boundary tests and merge settings validation provide solid regression protection. The previous review feedback about adding expectedMaxSegment has been addressed with exact values now being validated for all three parameters.


269-296: LGTM!

The createSegmentCommitInfo helper correctly sets up the mock Directory to handle the .cfe file extension, and setFiles() is consistent with this mock configuration. The hardcoded zero values for delete counts are appropriate for this simulation context.


343-370: LGTM!

The variance overflow issue from the previous review has been addressed by using double arithmetic. The balance score calculation and edge case handling are correct.


372-403: LGTM!

The inner data classes are well-structured for test-only use. The SegmentTopologyMetrics class now correctly uses double for variance, addressing the previous review feedback.

Comment on lines +83 to +85
@Seed("DEADBEEF")
@SuppressForbidden(reason = "Benchmark requires deterministic seed")
public void testPerformanceComparison() throws IOException {

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Read the complete test method to check for forbidden APIs
cat -n server/src/test/java/org/opensearch/index/SegmentTopologyBenchmarkTests.java | head -150

Repository: opensearch-project/OpenSearch

Length of output: 8329


🏁 Script executed:

#!/bin/bash
# Search for forbidden-apis configuration in OpenSearch
fd -t f "forbidden" -E ".git" -E "target" | head -20

Repository: opensearch-project/OpenSearch

Length of output: 879


🏁 Script executed:

#!/bin/bash
# Find examples of @SuppressForbidden usage to understand what triggers it
rg "@SuppressForbidden" --type java -B2 -A5 | head -100

Repository: opensearch-project/OpenSearch

Length of output: 12759


🏁 Script executed:

#!/bin/bash
# Check if this class uses any System.* or other commonly forbidden APIs
rg "System\.(getenv|exit|setSecurityManager|setOut|setErr|setIn)" server/src/test/java/org/opensearch/index/SegmentTopologyBenchmarkTests.java

Repository: opensearch-project/OpenSearch

Length of output: 55


Remove @SuppressForbidden annotation—no forbidden APIs are used.

The method contains no forbidden API calls (no System.*, Thread.suspend/resume, HttpServer, etc.). The @SuppressForbidden annotation is designed to suppress legitimate forbidden API usage; applying it with a reason about deterministic seeds is incorrect. The @Seed annotation from randomizedtesting is not a forbidden API and requires no suppression.

🤖 Prompt for AI Agents
In server/src/test/java/org/opensearch/index/SegmentTopologyBenchmarkTests.java
around lines 83 to 85, remove the unnecessary @SuppressForbidden annotation on
the testPerformanceComparison method; the method does not use any forbidden APIs
and the @Seed annotation is sufficient for deterministic seeding. Delete the
@SuppressForbidden line and its reason text so only @Seed("DEADBEEF") remains
above the method signature.

@github-actions

github-actions bot commented Jan 2, 2026

❌ Gradle check result for da9f024: null

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Sriram Ganesh <srignsh22@gmail.com>
@github-actions

github-actions bot commented Jan 2, 2026

✅ Gradle check result for b1d79c9: SUCCESS

@rgsriram

@msfroh - Could you please review?

@atris

atris commented Jan 12, 2026

Not a fan of this approach.

Hardcoding 5GB/10GB thresholds in the merge policy feels like a heuristic that won't age well, especially with mixed workloads (high deletes, small hot indices). It introduces "sloshing" behavior where a shard can flip-flop configs just by crossing an arbitrary byte boundary.

Also, MergePolicy is supposed to be a stateless factory. Mutating the config inside findMerges is an anti-pattern; it makes debugging concurrency issues a nightmare.

If we need adaptive merging, it belongs in ISM where we see the whole cluster, not hacked into the low-level Lucene merge logic. I'm -1 on merging this.
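The "sloshing" concern above can be illustrated with a hedged sketch. The 5GB threshold and category names are assumptions for illustration, not the PR's exact values:

```java
// Illustrates flip-flop behavior at a hard byte-size boundary: a shard whose
// size oscillates around the threshold (e.g. as deletes are merged away and
// new docs arrive) switches merge configurations with no hysteresis.
public class ThresholdSloshingSketch {
    public static final long FIVE_GB = 5L * 1024 * 1024 * 1024;

    // Category is decided purely by a byte comparison; crossing the boundary
    // by a single byte flips the returned configuration.
    public static String category(long shardSizeBytes) {
        return shardSizeBytes < FIVE_GB ? "medium" : "large";
    }

    public static void main(String[] args) {
        long justUnder = FIVE_GB - 1024;
        long justOver = FIVE_GB + 1024;
        // A delete-expunging merge can drop the shard back under the line,
        // flipping merge settings again on the next evaluation.
        System.out.println(category(justUnder)); // medium
        System.out.println(category(justOver));  // large
    }
}
```

A common mitigation is hysteresis (e.g. only reclassify after the size moves some percentage past the boundary), which is one reason reviewers ask for empirical justification of exact thresholds.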

double logSize = Math.log10(Math.max(1L, shardSizeBytes));
// Thresholds are based on decimal powers: log10 < 8.0 ≈ < 100MB, < 9.0 ≈ < 1GB, etc.
// Returned sizes use binary units (MiB/GiB via 1024²)
if (logSize < 8.0) { // < 10^8 bytes (≈ 95.4 MiB)

Hardcoding static byte-size thresholds (50MB, 1GB, 5GB) is brittle. A 5GB index on NVMe behaves differently than on spinning disk; using a fixed "one size fits all" curve here ignores hardware capabilities and I/O pressure.

// Returned sizes use binary units (MiB/GiB via 1024²)
if (logSize < 8.0) { // < 10^8 bytes (≈ 95.4 MiB)
return 50L * 1024 * 1024; // 50 MiB
} else if (logSize < 9.0) { // 10^8 - 10^9 bytes (≈ 95.4 MiB - 953.7 MiB)

Please link the specific OSB results that empirically validate 5GB as the optimal inflection point for this curve. Without data showing a statistically significant regression at 4.9GB vs 5.1GB, these magic numbers are risky.


// Synchronize settings updates to prevent race conditions with concurrent
// setter calls
settingsLock.writeLock().lock();

findMerges runs under IndexWriter's concurrency model. Introducing this lock asks for deadlocks if
IndexWriter holds its own monitor. Have you verified the lock ordering here?

// setter calls
settingsLock.writeLock().lock();
try {
regularMergePolicy.setMaxMergedSegmentMB(maxMergedSegmentMB);

IndexSettings instantiates this policy once per index (via the Provider). This means all shards share this single mutable instance. A large shard will flip these settings for a small neighbor concurrently, causing non-deterministic behavior.
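A hedged sketch of the shared-mutable-instance hazard described above. `SharedPolicy` is a stand-in class, not the PR's actual type:

```java
// Two shards of the same index tuning one shared policy instance: the last
// writer wins for both shards, so effective merge settings become
// non-deterministic and depend on thread scheduling.
public class SharedPolicyRaceSketch {
    public static class SharedPolicy {
        // volatile gives atomic visibility but does nothing to serialize
        // competing configuration decisions from different shards.
        public volatile double maxMergedSegmentMB;

        public void setMaxMergedSegmentMB(double mb) {
            maxMergedSegmentMB = mb;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        SharedPolicy policy = new SharedPolicy(); // one instance per index, shared by all shards
        Thread largeShard = new Thread(() -> policy.setMaxMergedSegmentMB(5 * 1024)); // wants 5 GiB
        Thread smallShard = new Thread(() -> policy.setMaxMergedSegmentMB(512));      // wants 512 MiB
        largeShard.start();
        smallShard.start();
        largeShard.join();
        smallShard.join();
        // Both shards now merge with whichever value happened to be written last.
        System.out.println(policy.maxMergedSegmentMB);
    }
}
```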

for (Segment segment : segments) {
sizes.add(segment.sizeBytes);
}
Collections.sort(sizes);

Allocating and sorting this on the heap per-request is a clear DoS vector on nodes with 50k+ segments. We need a streaming collector here, not a full sort.
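A streaming collector of the kind suggested above could look like this minimal sketch: a single pass over segments in O(1) memory, using Welford's algorithm for a numerically stable variance instead of materializing and sorting every size. Names are illustrative:

```java
// One-pass segment-size statistics: count, min, max, mean, and variance,
// without allocating or sorting a per-request list of all segment sizes.
public class StreamingSegmentStats {
    private long count;
    private long min = Long.MAX_VALUE;
    private long max = Long.MIN_VALUE;
    private double mean;
    private double m2; // running sum of squared deviations from the mean (Welford)

    public void accept(long sizeBytes) {
        count++;
        min = Math.min(min, sizeBytes);
        max = Math.max(max, sizeBytes);
        double delta = sizeBytes - mean;
        mean += delta / count;
        m2 += delta * (sizeBytes - mean);
    }

    public double mean() { return mean; }

    public double populationVariance() { return count == 0 ? 0.0 : m2 / count; }

    public long min() { return min; }

    public long max() { return max; }

    public static void main(String[] args) {
        StreamingSegmentStats stats = new StreamingSegmentStats();
        for (long size : new long[] {2, 4, 4, 4, 5, 5, 7, 9}) {
            stats.accept(size);
        }
        System.out.println(stats.mean());               // ~5.0
        System.out.println(stats.populationVariance()); // ~4.0
    }
}
```

Quantiles (e.g. median segment size) cannot be computed exactly in one pass, but sketch structures like t-digest give bounded-error estimates without holding all sizes on the heap.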

@atris

atris commented Jan 12, 2026

The PR description mentions adding a REST API for real-time monitoring, but the PR doesn't seem to include one. Is that a feature planned for the future?

@rgsriram

Thanks @atris, for the detailed review. Closing this PR to rethink the architecture and explore an ISM-based implementation as suggested.


Labels

enhancement Enhancement or improvement to existing feature or request Search:Performance

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Optimize segment size and merge settings

3 participants