
Add new index and cluster level settings to limit the total primary shards per node and per index#17295

Merged
linuxpi merged 16 commits into opensearch-project:main from pandeydivyansh1803:main
Feb 25, 2025

@pandeydivyansh1803
Contributor

@pandeydivyansh1803 pandeydivyansh1803 commented Feb 7, 2025

Description

In a remote-store-backed cluster, segment replication is used as the replication strategy. With segment replication, segments are created only on the primary shard and are then copied to the replica shards. Because segment creation is CPU intensive, we have observed CPU skew between nodes of the same cluster when primary shards are not evenly balanced.

Earlier attempts to rebalance primary shards across nodes (#6422, #12250) help reduce the skew, but they work on a best-effort basis and do not enforce any hard constraint.

This PR adds two new settings to OpenSearch:
index.routing.allocation.total_primary_shards_per_node: An index-level setting that limits the number of primary shards of a specific index that can be allocated to a single node. The limit (indexTotalPrimaryShardsPerNodeLimit) is stored in index metadata, similar to indexTotalShardsPerNodeLimit.
cluster.routing.allocation.total_primary_shards_per_node: A cluster-level setting that limits the total number of primary shards, across all indices, on a single node.
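
As a rough illustration of how these settings might be applied, the sketch below builds request bodies for the standard `PUT /<index>/_settings` and `PUT /_cluster/settings` REST endpoints. The two setting names come from this description. The helper function names, the index name `my-index`, the limit values, and the use of a `persistent` cluster setting are assumptions made for illustration. Whether each setting can be updated on a live index or cluster should be checked against the merged code.

```python
import json

# Setting names as given in the PR description.
INDEX_SETTING = "index.routing.allocation.total_primary_shards_per_node"
CLUSTER_SETTING = "cluster.routing.allocation.total_primary_shards_per_node"


def index_settings_body(limit: int) -> dict:
    """Build a body for PUT /<index>/_settings (hypothetical helper)."""
    return {INDEX_SETTING: limit}


def cluster_settings_body(limit: int) -> dict:
    """Build a body for PUT /_cluster/settings (hypothetical helper).

    Using the "persistent" section is an assumption; "transient" is the
    other section the cluster settings API accepts.
    """
    return {"persistent": {CLUSTER_SETTING: limit}}


# Example values only: at most 2 primaries of my-index per node,
# and at most 10 primaries in total per node.
print("PUT /my-index/_settings")
print(json.dumps(index_settings_body(2), indent=2))
print("PUT /_cluster/settings")
print(json.dumps(cluster_settings_body(10), indent=2))
```

The printed JSON can be sent with any HTTP client. Per the commit notes in this PR, both settings are expected to be rejected on clusters that are not remote store enabled.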

These settings give operators more control over primary shard distribution, which helps with cluster balance and performance management.
The existing ShardsLimitAllocationDecider class already has the infrastructure needed to evaluate shard allocation constraints: it has access to the current cluster state and routing information, and it has methods to check shard counts per node. We therefore propose implementing the two new primary shard limit settings within this class. Extending ShardsLimitAllocationDecider reuses the current decision-making framework, keeps the new checks consistent with the existing allocation rules, and minimizes code duplication.
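
The kind of check described above can be modelled in a few lines. This is a simplified Python sketch, not the Java code in ShardsLimitAllocationDecider. The `Shard` type, the function name, and the convention that a limit of zero or less means "no limit" are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Shard:
    """Illustrative stand-in for a shard copy assigned to a node."""
    index: str
    primary: bool


def can_allocate_primary(
    node_shards: Sequence[Shard],
    index: str,
    index_primary_limit: int = -1,
    cluster_primary_limit: int = -1,
) -> bool:
    """Decide whether one more primary of `index` may go on this node.

    Assumption: a limit <= 0 disables the corresponding check.
    """
    # Total primaries already on the node, across all indices.
    total_primaries = sum(1 for s in node_shards if s.primary)
    if cluster_primary_limit > 0 and total_primaries >= cluster_primary_limit:
        return False

    # Primaries of this particular index already on the node.
    index_primaries = sum(
        1 for s in node_shards if s.primary and s.index == index
    )
    if index_primary_limit > 0 and index_primaries >= index_primary_limit:
        return False

    return True


# A node holding one primary of "a", one replica of "a", one primary of "b".
node = [Shard("a", True), Shard("a", False), Shard("b", True)]
print(can_allocate_primary(node, "a", index_primary_limit=1))
print(can_allocate_primary(node, "c", index_primary_limit=1))
print(can_allocate_primary(node, "c", cluster_primary_limit=2))
```

Replica shards are ignored by both checks in this model, which reflects the stated goal of constraining only primaries.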

Related Issues

Resolves #17293

Check List

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@github-actions
Contributor

github-actions bot commented Feb 7, 2025

❌ Gradle check result for 721865e: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@github-actions
Contributor

github-actions bot commented Feb 7, 2025

❌ Gradle check result for 920f71a: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@github-actions
Contributor

github-actions bot commented Feb 9, 2025

✅ Gradle check result for ebb6a2b: SUCCESS

Signed-off-by: Divyansh Pandey <98746046+pandeydivyansh1803@users.noreply.github.com>
@github-actions
Contributor

✅ Gradle check result for 1eb6f20: SUCCESS

Divyansh Pandey added 2 commits February 24, 2025 14:36
Signed-off-by: Divyansh Pandey <dpaandey@amazon.com>
…chFork

Merge main to sync changelog updates with local changes.
@github-actions
Contributor

❕ Gradle check result for 28def93: UNSTABLE

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

@linuxpi
Contributor

linuxpi commented Feb 24, 2025

@pandeydivyansh1803 Changes LGTM. Please add some tests to cover the case where a user is not able to set the new index/cluster settings on a cluster that is not remote store enabled.
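
A minimal sketch of the behaviour such tests would exercise, written as a small Python model rather than the actual OpenSearch validator. The exception type, the function name, and the rule that the mere presence of either setting is rejected on a cluster that is not remote store enabled are assumptions for illustration.

```python
# Setting names as given in the PR description.
PRIMARY_LIMIT_SETTINGS = (
    "index.routing.allocation.total_primary_shards_per_node",
    "cluster.routing.allocation.total_primary_shards_per_node",
)


class SettingsValidationError(ValueError):
    """Hypothetical error type for a rejected settings update."""


def validate_primary_shard_limits(settings: dict, remote_store_enabled: bool) -> None:
    """Reject the primary shard limit settings on non remote store clusters.

    Assumption: any request that contains one of the settings is rejected
    when the cluster is not remote store enabled.
    """
    if remote_store_enabled:
        return
    for key in PRIMARY_LIMIT_SETTINGS:
        if key in settings:
            raise SettingsValidationError(
                f"{key} can only be set on a remote store enabled cluster"
            )


# Accepted: remote store enabled cluster.
validate_primary_shard_limits({PRIMARY_LIMIT_SETTINGS[0]: 2}, remote_store_enabled=True)

# Rejected: same request on a cluster without remote store.
try:
    validate_primary_shard_limits({PRIMARY_LIMIT_SETTINGS[0]: 2}, remote_store_enabled=False)
except SettingsValidationError as err:
    print(f"rejected: {err}")
```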

…et for cluster which is not remote store enabled

Signed-off-by: Divyansh Pandey <dpaandey@amazon.com>
@github-actions
Contributor

❌ Gradle check result for 36d29c8: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@github-actions
Contributor

✅ Gradle check result for 36d29c8: SUCCESS

@linuxpi linuxpi merged commit bc209ee into opensearch-project:main Feb 25, 2025
30 checks passed
@github-project-automation github-project-automation bot moved this from 👀 In review to ✅ Done in Storage Project Board Feb 25, 2025
@linuxpi linuxpi added the backport 2.x Backport to 2.x branch label Feb 26, 2025
@opensearch-trigger-bot
Contributor

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/OpenSearch/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/OpenSearch/backport-2.x
# Create a new branch
git switch --create backport/backport-17295-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 bc209ee6bacbb1027dcd7ba28d56b6ceb96f4fe0
# Push it to GitHub
git push --set-upstream origin backport/backport-17295-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/OpenSearch/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-17295-to-2.x.

@linuxpi
Contributor

linuxpi commented Feb 26, 2025

@pandeydivyansh1803 can you manually raise the backport PR?

pandeydivyansh1803 pushed a commit to pandeydivyansh1803/OpenSearchFork that referenced this pull request Feb 27, 2025
Signed-off-by: Divyansh Pandey <dpaandey@amazon.com>
vinaykpud pushed a commit to vinaykpud/OpenSearch that referenced this pull request Mar 18, 2025
…hards per node and per index (opensearch-project#17295)

* Added a new index level setting to limit the total primary shards per index per node. Added relevant files for unit test and integration test.

Signed-off-by: Divyansh Pandey <dpaandey@amazon.com>

* update files for code quality

Signed-off-by: Divyansh Pandey <dpaandey@amazon.com>

* moved primary shard count function to RoutingNode.java

Signed-off-by: Divyansh Pandey <dpaandey@amazon.com>

* removed unwanted files

Signed-off-by: Divyansh Pandey <dpaandey@amazon.com>

* added cluster level setting to limit total primary shards per node

Signed-off-by: Divyansh Pandey <dpaandey@amazon.com>

* allow the index level settings to be applied to both DOCUMENT and SEGMENT replication indices

Signed-off-by: Divyansh Pandey <dpaandey@amazon.com>

* Added necessary validator to restrict the index and cluster level primary shards per node settings only for remote store enabled cluster. Added relevant unit and integration tests.

Signed-off-by: Divyansh Pandey <dpaandey@amazon.com>

* refactoring changes

Signed-off-by: Divyansh Pandey <dpaandey@amazon.com>

* refactoring changes

Signed-off-by: Divyansh Pandey <dpaandey@amazon.com>

* Empty commit to rerun gradle test

Signed-off-by: Divyansh Pandey <dpaandey@amazon.com>

* optimised the calculation of total primary shards on a node

Signed-off-by: Divyansh Pandey <dpaandey@amazon.com>

* Refactoring changes

Signed-off-by: Divyansh Pandey <dpaandey@amazon.com>

* refactoring changes, added TODO to MetadataCreateIndexService

Signed-off-by: Divyansh Pandey <dpaandey@amazon.com>

* Added integration test for scenario where primary shards setting is set for cluster which is not remote store enabled

Signed-off-by: Divyansh Pandey <dpaandey@amazon.com>

---------

Signed-off-by: Divyansh Pandey <dpaandey@amazon.com>
Signed-off-by: Divyansh Pandey <98746046+pandeydivyansh1803@users.noreply.github.com>
Co-authored-by: Divyansh Pandey <dpaandey@amazon.com>
Signed-off-by: Vinay Krishna Pudyodu <vinkrish.neo@gmail.com>

Labels

backport 2.x: Backport to 2.x branch
backport-failed
enhancement: Enhancement or improvement to existing feature or request

Projects

Status: ✅ Done

Development

Successfully merging this pull request may close these issues.

[Feature Request] Primary Shard Count Constraint

6 participants