Skip to content

[DNM] mgr/balancer: set upmap_max_deviation to 1#57216

Closed
JoshuaGabriel wants to merge 1 commit intoceph:mainfrom
JoshuaGabriel:josh_65748
Closed

[DNM] mgr/balancer: set upmap_max_deviation to 1#57216
JoshuaGabriel wants to merge 1 commit intoceph:mainfrom
JoshuaGabriel:josh_65748

Conversation

@JoshuaGabriel
Copy link
Contributor

Field experience shows that previous value 5 leads to skewed OSD utilization. Set it to 1 and update the relevant test.

Fixes: https://tracker.ceph.com/issues/65748

Field experience shows that previous value 5 leads to skewed OSD
utilization. Set it to 1 and update the relevant test.

Fixes: https://tracker.ceph.com/issues/65748
Signed-off-by: Joshua Blanch <joshua.blanch@clyso.com>
@JoshuaGabriel JoshuaGabriel requested a review from a team as a code owner May 1, 2024 21:36
@dvanders
Copy link
Contributor

dvanders commented May 1, 2024

Hi @ljflores here's a simple balancer qol improvement.

Copy link
Member

@ljflores ljflores left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I have made this adjustment often in testing.

@yuriw
Copy link
Contributor

yuriw commented May 3, 2024

This PR is under test in https://tracker.ceph.com/issues/65796.

Copy link
Member

@ljflores ljflores left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In talking to @neha-ojha, she mentioned that there may be certain CRUSH rules where deviation 1 doesn't result in good mappings. Putting "Request Changes" and DNM for now so we can look into it.

@ljflores ljflores added the DNM label May 3, 2024
@github-actions
Copy link

github-actions bot commented Jul 2, 2024

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
If you are a maintainer or core committer, please follow-up on this pull request to identify what steps should be taken by the author to move this proposed change forward.
If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.

@github-actions github-actions bot added the stale label Jul 2, 2024
@kamoltat kamoltat changed the title mgr/balancer: set upmap_max_deviation to 1 [DNM] mgr/balancer: set upmap_max_deviation to 1 Jul 18, 2024
@kamoltat
Copy link
Member

kamoltat commented Jul 18, 2024

FYI This passed RADOS test https://tracker.ceph.com/projects/rados/wiki/MAIN#httpstrackercephcomissues65796

However, we are not merging due to the comments above.

@kamoltat kamoltat removed the stale label Jul 18, 2024
@dvanders
Copy link
Contributor

In talking to @neha-ojha, she mentioned that there may be certain CRUSH rules where deviation 1 doesn't result in good mappings. Putting "Request Changes" and DNM for now so we can look into it.

@neha-ojha do you have more info?

@yuriw
Copy link
Contributor

yuriw commented Jul 18, 2024

@ljflores
Copy link
Member

@neha-ojha

@dvanders This PR will need further testing to prove that this setting works well with current CRUSH configurations in the wild. Neha and I discussed that the experiments previously conducted to set the default to 5 (see #32247 (comment) for the history) should be repeated with current osdmaps from the wild.

I had collected some of these osdmaps for read balancer testing (ref: https://tracker.ceph.com/issues/53622). I plan to use these osdmaps to repeat the experiments done in the past to prove whether this setting is better.

If @JoshuaGabriel also wants to provide proof along those lines, that is fine too. But ultimately this change needs to be further justified against real world CRUSH configurations.

@github-actions
Copy link

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
If you are a maintainer or core committer, please follow-up on this pull request to identify what steps should be taken by the author to move this proposed change forward.
If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.

@github-actions github-actions bot added the stale label Sep 16, 2024
@github-actions
Copy link

This pull request has been automatically closed because there has been no activity for 90 days. Please feel free to reopen this pull request (or open a new one) if the proposed change is still appropriate. Thank you for your contribution!

@github-actions github-actions bot closed this Oct 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants