mds: add optimization for replica recall during quiesce #56867

Closed
batrick wants to merge 1 commit into ceph:main from batrick:quiesce-no-locks

Conversation

@batrick
Member

@batrick batrick commented Apr 12, 2024

Need to update tests.

Checklist

  • Tracker (select at least one)
    • References tracker ticket
  • Component impact
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • No doc update is appropriate
  • Tests (select at least one)

@batrick batrick added the cephfs Ceph File System label Apr 12, 2024
@batrick batrick requested a review from leonid-s-usov April 12, 2024 22:29
Contributor

@leonid-s-usov leonid-s-usov left a comment


I really like this approach, it's the simplest possible interface for quiesce, as we discussed here

Then we didn't go for it because we weren't sure how much refactoring it would take to make the Locker re-eval all locks and revoke the unwanted caps.

Now I'm a little worried that dropping the locks without first forcing an explicit transition to known states may get us into a pool of hard-to-debug transient issues, where unstable locks suddenly have to re-eval with a different cap set, potentially incompatible with their expectations.

I hope my worries are just that, in which case I would prefer this PR over my variant at #56755, with just one condition: we should avoid using any other locks like `policylock`, and have the Locker call us back just because it sees that we are xlocking the quiesce lock.

@batrick
Member Author

batrick commented Apr 13, 2024

I really like this approach, it's the simplest possible interface for quiesce, as we discussed here

Then we didn't go for it because we weren't sure how much refactoring it would take to make the Locker re-eval all locks and revoke the unwanted caps.

Now I'm a little worried that dropping the locks without first forcing an explicit transition to known states may get us into a pool of hard-to-debug transient issues, where unstable locks suddenly have to re-eval with a different cap set, potentially incompatible with their expectations.

It's my concern as well because there may be subtle gotchas. I already found one case where quiesce did not complete with this code, but I had forgotten to enable client debugging, so I could not diagnose it.

It's possible with some adjustments this will work fine as-is. Perhaps we can also keep the locks but only use Locker::issue_caps for the regular file case (removing the filelock).

@batrick
Member Author

batrick commented Apr 13, 2024

It's possible with some adjustments this will work fine as-is. Perhaps we can also keep the locks but only use Locker::issue_caps for the regular file case (removing the filelock).

I've made this change but I'm actively testing the old approach to see what caused the quiesce to not complete.

@leonid-s-usov
Contributor

leonid-s-usov commented Apr 13, 2024

That last change makes this less pretty, TBH. Also, please see Slack; I now have a different opinion on taking locks on non-auth:

We cannot use an xlock on the filelock because it can only be acquired by the auth mds.

I believe the replica will update its lock states to LOCK_LOCK automatically as the auth takes rd/x locks on the caps-related locks, so the whole thing could be simpler: take only the quiesce lock on the replica and ignore the other locks. See my addition to the xlock PR.

@leonid-s-usov
Contributor

I've made this change but I'm actively testing the old approach to see what caused the quiesce to not complete.

Could it be a policylock-related deadlock? If we have to take the policylock, we should probably call

  bool xlock_policylock(const MDRequestRef& mdr, CInode *in,
                        bool want_layout=false, bool xlock_snaplock=false);

@batrick batrick changed the title mds: re-issue caps after quiescelock acquired mds: add optimization for replica recall during quiesce Apr 15, 2024
@batrick
Member Author

batrick commented Apr 15, 2024

I've updated this code as discussed in our meeting. The change is in anticipation of the changes you're working on, Leonid, in #56755.

@github-actions

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

// as a result of the auth taking the above locks.

/* kick cap issuance without waiting for auth */
if (quiesce_replica_recall && !in->is_auth()) {
Contributor

&& !in->is_auth() is redundant, we are already in the else branch of the inverted test.

Member Author

right, this was just the state after a trivial rebase, see now

- mds
flags:
- runtime
- name: mds_cache_quiesce_auth_recall
Contributor

I don't think this one is used anywhere, is it?

Member Author

look again

Contributor

@leonid-s-usov leonid-s-usov left a comment


I'm not sure if it will make a difference; there should not be any W caps issued by the replicas, if I understand correctly. Otherwise, this looks good.

@batrick batrick force-pushed the quiesce-no-locks branch 2 times, most recently from 1f800ae to ab7d084 Compare June 3, 2024 15:51
Contributor

@leonid-s-usov leonid-s-usov left a comment


Ok, this looks more to the point now. I still have some comments, see inline

@batrick batrick force-pushed the quiesce-no-locks branch 2 times, most recently from 73a86e6 to ec69f0f Compare June 3, 2024 18:03
@batrick
Member Author

batrick commented Jun 3, 2024

Since we're still taking locks, this doesn't require adjusting tests. It's also off by default. I may like to use it for some performance measurements.

@batrick batrick marked this pull request as ready for review June 3, 2024 18:04
Contributor

@leonid-s-usov leonid-s-usov left a comment


Looks good.

@batrick batrick force-pushed the quiesce-no-locks branch from ec69f0f to a7952fd Compare June 3, 2024 19:57
Once the quiescelock is held, the replica can proceed with recalling caps so
the auth can more rapidly acquire locks when it attempts to quiesce.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
@batrick batrick force-pushed the quiesce-no-locks branch from a7952fd to cc91867 Compare June 3, 2024 19:58
@leonid-s-usov
Contributor

@batrick Here we are calling issue_caps before attempting the locks. How does this address those edge cases you mentioned, when locks may be caught in some transitory states by this caps revocation?

@batrick
Member Author

batrick commented Jun 4, 2024

@batrick Here we are calling issue_caps before attempting the locks.

This is an interesting question. The only two ways to go that make sense to me are:

  • Use caps recall instead of locks altogether (perhaps that should be a different config in this PR)
  • Do caps recall before acquiring locks (to ensure we begin revoking all related caps immediately); that's what we're doing here.

How does this address those edge cases you mentioned, when locks may be caught in some transitory states by this caps revocation?

I'm not sure there is an unconsidered edge case. The code has changed significantly since I first posted this PR; the deadlock may be gone entirely, or it may have been something else. I didn't get to dig into it much due to lack of debugging.

What I would like is to get this PR in a state where we can test (performance of) both of the above two cases with config changes but those configs are off by default.

@batrick batrick marked this pull request as draft June 28, 2024 20:56
@batrick
Member Author

batrick commented Aug 9, 2024

I think we'll forgo this. We already mask the allowed caps when the quiescelock is xlocked. I don't think this really does anything useful.

@batrick batrick closed this Aug 9, 2024
