mds: drop client metrics during recovery#57084
Conversation
Fixes: https://tracker.ceph.com/issues/65660 Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
|
jenkins test make check |
| { | ||
| dout(1) << "active_start" << dendl; | ||
|
|
||
| m_is_active = true; |
There was a problem hiding this comment.
I don't see where you are resetting it. I'd suggest that you put the code at the end of handle_mds_map:
m_is_active = is_active();
There was a problem hiding this comment.
IMO no need to reset it.
@batrick BTW, why not just check the MDS' state from the mdsmap instead of adding a new m_is_active ? For lockless case ?
There was a problem hiding this comment.
You need the mds_lock to look at the MDSMap. We don't want the metrics aggregator to be acquiring that lock generally.
|
I appreciate that as of today an instance of MDSRank may never experience a transition from active -> inactive in a way that would affect this code. However, relying on that fact here creates an implicit dependency on how MDSRank instances should be managed that isn't unit-tested (as of today). We should try to minimize such implicit dependencies, IMO |
|
|
@batrick I think this change will fix the case that when connecting to the old clients, which haven't included my previous fixes. |
|
@batrick Picking this PR for QA |
|
This PR is under test in https://tracker.ceph.com/issues/66125. |
leonid-s-usov
left a comment
There was a problem hiding this comment.
I understand that MDS can't deactivate except for shutting down, hence I'll approve as-is.
To reduce ambiguity for future readers, who, like me, could be confused by the naming and why this state never gets reset, I'd suggest the name: m_did_activate. IMO this better encodes the one-way nature of the flag.
|
This PR is under test in https://tracker.ceph.com/issues/66162. |
|
@rishabh-d-dave status? |
First, 4/5 builds failed because there were failure due to infra issues. Then runs had 39 infra-related failure that were persistent even in new runs. Then re-runs couldn't be launched due to issue in |
|
Removing my testing label for since we have infra issues again. |
|
This PR is under test in https://tracker.ceph.com/issues/66433. |
|
This PR is under test in https://tracker.ceph.com/issues/66462. |
* refs/pull/57084/head: mds: drop client metrics during recovery
|
This PR is under test in https://tracker.ceph.com/issues/66609. |
Fixes: https://tracker.ceph.com/issues/65660
Checklist
Show available Jenkins commands
jenkins retest this pleasejenkins test classic perfjenkins test crimson perfjenkins test signedjenkins test make checkjenkins test make check arm64jenkins test submodulesjenkins test dashboardjenkins test dashboard cephadmjenkins test apijenkins test docsjenkins render docsjenkins test ceph-volume alljenkins test ceph-volume toxjenkins test windowsjenkins test rook e2e