nautilus: ceph-monstore-tool: use a large enough paxos/{first,last}_committed #41874
Merged
yuriw merged 3 commits into ceph:nautilus on Jun 17, 2021
so the rebuild paxos transaction won't be overwritten by the ones
created before recovery completes.

when the quorum is recovering, the leader collects the paxos
transactions from the peons. if the quorum accepts the proposal for setting
the fingerprint, the peon updates the monitor with a paxos
transaction that has a newer "last_committed" than the one created by
update_paxos() in ceph_monstore_tool.cc. the latter's "last_committed" is
always 0.

so, to keep this extra paxos proposal from obsoleting the "rebuilding" paxos
transaction, we use a large enough number for {first,last}_committed.

Fixes: http://tracker.ceph.com/issues/38219
Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 5475ef7)
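
The overwrite described in the commit message above can be illustrated with a toy model. This is only a sketch under stated assumptions, not Ceph's Paxos code: `PaxosState`, `newest()`, the peon's version `1`, and the "large" value `10000` are all hypothetical names and numbers chosen for illustration. The only facts taken from the commit message are that the rebuilt transaction used to carry `last_committed` 0 and that a transaction with a newer `last_committed` obsoletes it.

```python
from dataclasses import dataclass


@dataclass
class PaxosState:
    source: str          # which store this state came from
    last_committed: int  # paxos/last_committed recorded in that store


def newest(states):
    # Toy stand-in for recovery: the state with the highest last_committed wins.
    return max(states, key=lambda s: s.last_committed)


# A proposal committed while the quorum recovers (e.g. setting the fingerprint).
# The version 1 is an assumption made for this sketch.
peon = PaxosState("peon proposal", last_committed=1)

# Before the fix: the rebuilt transaction is written with last_committed 0.
rebuilt_old = PaxosState("rebuilt store", last_committed=0)

# After the fix: a large value is used instead. 10000 is a placeholder chosen
# for this sketch, not necessarily the value used in the patch.
rebuilt_new = PaxosState("rebuilt store", last_committed=10000)

winner_before = newest([rebuilt_old, peon]).source
winner_after = newest([rebuilt_new, peon]).source
print(winner_before, "|", winner_after)
```

In this model the peon's proposal wins as long as the rebuilt transaction sits at version 0, and the rebuilt transaction wins once its version is larger than anything a recovering quorum is likely to have committed.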
for better readability

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 3908c1f)
mon_tick_interval is 5 seconds by default. monitors update their rotating keys every mon_tick_interval. before the monitors form a quorum, the auth requests from clients are put on the wait list. these requests are re-enqueued once the monitors form a quorum. but there is a small window of up to mon_tick_interval during which the monitors cannot yet serve auth requests, even after they claim to be able to serve requests. if the re-enqueued requests happen to be served in this window, and if authx is enabled, they are greeted with errors like

  handle_auth_bad_method server allowed_methods [2] but i only support [2]

in the case of the ceph cli, the error looks like:

  [errno 13] RADOS permission denied (error connecting to the cluster)

so, to address this issue, the EACCES error is ignored when waiting for a quorum.

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 7afd38f)
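
The "ignore EACCES while waiting for a quorum" idea from the commit message above can be sketched as a small retry loop. This is a minimal illustration, not the code changed by the commit: `wait_for_quorum()`, its `probe` callback, and the `timeout`/`interval` parameters are hypothetical, and the probe is assumed to return 0 on success or an errno value on failure.

```python
import errno
import time


def wait_for_quorum(probe, timeout=60.0, interval=1.0,
                    retriable=(errno.EACCES,), sleep=time.sleep):
    """Poll probe() until it returns 0.

    EACCES is treated as "monitors not ready yet" and retried, because auth
    requests may be rejected until the rotating keys are refreshed.
    Returns True on success and False if the timeout expires.
    """
    deadline = time.monotonic() + timeout
    while True:
        rc = probe()
        if rc == 0:
            return True
        if rc not in retriable:
            # Any other error is not expected in this sketch; surface it.
            raise OSError(rc, "unexpected error while waiting for quorum")
        if time.monotonic() >= deadline:
            return False
        sleep(interval)


# Usage with a fake probe: permission denied twice, then success.
responses = iter([errno.EACCES, errno.EACCES, 0])
ok = wait_for_quorum(lambda: next(responses), timeout=5.0, interval=0.0)
```

Retrying only on a named set of error codes keeps genuine failures visible. A real helper would probably choose `interval` with mon_tick_interval in mind, since that bounds how long the window can last.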
Contributor
Author
hi @yuriw, not sure if you are still planning another round of the nautilus qa batch. if yes, would you kindly include this change as well? if not, i'd keep it around in case it could help the community.
Contributor
@tchaikov I can't build it: https://shaman.ceph.com/builds/ceph/wip-yuri3-testing-2021-06-16-0702-nautilus/ See if you can resolve it and I will retest.
Contributor
Author
@yuriw thank you for testing! that's a known issue, and it has been fixed. i am building the branch at https://shaman.ceph.com/builds/ceph/wip-yuri3-testing-2021-06-16-0702-nautilus-1/8aebecf8580d16fe26fb7ac2c2317d240257e596/.
Member
jenkins test make check
ideepika approved these changes on Jun 17, 2021
backport of #27465. the cleanup and doc changes are dropped, as they are not necessary for the bug fix.
backport ticket: https://tracker.ceph.com/issues/51237