
mon,cephfs: require confirmation when changing max_mds on unhealthy cluster#59420

Merged
rishabh-d-dave merged 4 commits into ceph:main from rishabh-d-dave:max-mds-confirm
Oct 18, 2024


@rishabh-d-dave
Contributor

Fixes: https://tracker.ceph.com/issues/66301



@rishabh-d-dave requested a review from a team as a code owner August 23, 2024 13:27
@rishabh-d-dave requested review from a team, batrick and vshankar August 23, 2024 13:27
@rishabh-d-dave changed the title from "mon,cephfs: require confirmation when changing max_mds onunhealthy cluster" to "mon,cephfs: require confirmation when changing max_mds on unhealthy cluster" Aug 23, 2024
@joscollin
Member

@rishabh-d-dave Otherwise, LGTM

@rishabh-d-dave force-pushed the max-mds-confirm branch 2 times, most recently from 5f3467d to 2de7b1d August 27, 2024 08:25
@rishabh-d-dave requested a review from a team as a code owner August 27, 2024 08:25
@rishabh-d-dave
Contributor Author

@joscollin @batrick Thanks for the quick review. I've made the recommended changes now. PTAL.


That prevents any clients from establishing new sessions with the MDS.

* **Dont tweak max_mds** Changing value of ``max_mds`` FS setting variable is
Contributor

I would add this as a precautionary note rather than an assertive statement.

Contributor Author

@rishabh-d-dave Sep 2, 2024

Please suggest a better wording. How about **Be cautious tweaking max_mds**?

Contributor

See wording above

@rishabh-d-dave force-pushed the max-mds-confirm branch 2 times, most recently from 7704ae9 to 33bb3cb September 2, 2024 13:09
@rishabh-d-dave
Contributor Author

.

Contributor

@anthonyeleven left a comment

suggestions


That prevents any clients from establishing new sessions with the MDS.

* **Dont tweak max_mds** Changing value of ``max_mds`` FS setting variable is
Contributor

See wording above

@rishabh-d-dave force-pushed the max-mds-confirm branch 2 times, most recently from 2b68078 to 75b6224 September 2, 2024 18:52
@rishabh-d-dave
Contributor Author

@vshankar I've made the changes that you recommended last week. PTAL when you find some time. :)

@vshankar requested a review from a team September 10, 2024 09:52
@vshankar
Contributor

> @vshankar I've made the changes that you recommended last week. PTAL when you find some time. :)

I'll have a look tomorrow. Meanwhile requesting a review from @ceph/cephfs.

@rishabh-d-dave
Contributor Author

> > @vshankar I've made the changes that you recommended last week. PTAL when you find some time. :)
>
> I'll have a look tomorrow. Meanwhile requesting a review from @ceph/cephfs.

Thank you!

Contributor

@vshankar left a comment

Minor nit, otherwise LGTM.

@rishabh-d-dave
Contributor Author

Had a discussion with Venky here and have made the relevant changes.

@rishabh-d-dave added the wip-rishabh-testing label (Rishabh's testing label) Sep 30, 2024
Contributor

@vshankar left a comment

Otherwise LGTM.

…luster

Users must pass the confirmation flag (--yes-i-really-mean-it) to change
the value of the CephFS setting variable "max_mds" when the Ceph cluster
is unhealthy.

This measure was decided upon to prevent users from changing "max_mds"
as a means of troubleshooting an unhealthy cluster.

Fixes: https://tracker.ceph.com/issues/66301
Signed-off-by: Rishabh Dave <ridave@redhat.com>

Add tests to ensure that when the cluster has any health warning,
especially MDS_TRIM, the confirmation flag is mandatory to change
max_mds.

Signed-off-by: Rishabh Dave <ridave@redhat.com>

Update the documentation for CephFS administration as well as
troubleshooting.

Signed-off-by: Rishabh Dave <ridave@redhat.com>

Add a release note for the fact that users now need to pass the
confirmation flag for modifying "max_mds" when the cluster is unhealthy.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
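
The behaviour described in the commit messages above can be sketched as a small decision function. This is only an illustrative model in Python, not the monitor's actual implementation: the function name `may_change_max_mds`, its parameters, and the returned messages are all hypothetical, and "unhealthy" is assumed here to mean that at least one health warning (such as MDS_TRIM) is active. Only the flag `--yes-i-really-mean-it` is taken from the commit message.

```python
from typing import Sequence, Tuple

# Confirmation flag named in the commit message.
CONFIRM_FLAG = "--yes-i-really-mean-it"


def may_change_max_mds(
    health_checks: Sequence[str], args: Sequence[str]
) -> Tuple[bool, str]:
    """Hypothetical model of the gate; not the monitor's real code.

    health_checks: codes of currently active health warnings (assumption:
                   an empty sequence means the cluster is healthy).
    args:          command-line arguments supplied by the user.
    """
    active = sorted(set(health_checks))
    if not active:
        # Healthy cluster: no confirmation needed.
        return True, "cluster healthy; change accepted"
    if CONFIRM_FLAG in args:
        # Unhealthy cluster, but the user explicitly confirmed.
        return True, "cluster unhealthy; change accepted with confirmation"
    # Unhealthy cluster and no confirmation: refuse the change.
    return False, (
        "cluster unhealthy (" + ", ".join(active) + "); "
        "pass " + CONFIRM_FLAG + " to change max_mds anyway"
    )


if __name__ == "__main__":
    cmd = ["fs", "set", "cephfs", "max_mds", "2"]
    print(may_change_max_mds([], cmd))
    print(may_change_max_mds(["MDS_TRIM"], cmd))
    print(may_change_max_mds(["MDS_TRIM"], cmd + [CONFIRM_FLAG]))
```

The three calls mirror the cases the added tests are described as covering: a healthy cluster, an unhealthy cluster without the flag, and an unhealthy cluster with the flag.
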
@rishabh-d-dave
Contributor Author

This PR is under test in https://tracker.ceph.com/issues/68354.

@rishabh-d-dave
Contributor Author

jenkins test api

Contributor Author

@rishabh-d-dave left a comment

@rishabh-d-dave
Contributor Author

jenkins test api

@rishabh-d-dave
Contributor Author

Ready for merge but waiting for Ceph API CI job to pass
