mds: add optimization for replica recall during quiesce by batrick · Pull Request #56867 · ceph/ceph

batrick · 2024-04-12T22:29:24Z

Need to update tests.

Checklist

Tracker (select at least one)
- References tracker ticket
Component impact
- No impact that needs to be tracked
Documentation (select at least one)
- No doc update is appropriate
Tests (select at least one)
- Includes unit test(s)

leonid-s-usov

I really like this approach, it's the simplest possible interface for quiesce, as we discussed here

Then we didn't go for it because we weren't sure how much of a refactoring that would be to make the Locker re-eval all locks and revoke the unwanted caps.

Now I'm a little worried that not causing an explicit transition of locks to known states before we drop the locks may get us into a pool of hard-to-debug transient issues, when unstable locks suddenly have to re-eval with a different cap set, potentially incompatible with their expectations.

I hope my worries are just that, in which case I would prefer this PR over my variant at #56755, with just one condition: we should avoid using any other locks like `policylock`, and have the Locker call us back just because it's seen that we are xlocking the quiesce lock.

src/mds/MDCache.cc

batrick · 2024-04-13T15:44:00Z

> I really like this approach, it's the simplest possible interface for quiesce, as we discussed here
>
> Then we didn't go for it because we weren't sure how much of a refactoring that would be to make the Locker re-eval all locks and revoke the unwanted caps.
>
> Now I'm a little worried that not causing an explicit transition of locks to known states before we drop the locks may get us into a pool of hard-to-debug transient issues, when unstable locks suddenly have to re-eval with a different cap set, potentially incompatible with their expectations.

It's my concern as well because there may be subtle gotchas. I already found one case where quiesce did not complete with this code, but I had forgotten to enable client debugging so I could not diagnose it.

It's possible with some adjustments this will work fine as-is. Perhaps we can also keep the locks but only use Locker::issue_caps for the regular file case (removing the filelock).

batrick · 2024-04-13T16:08:32Z

> It's possible with some adjustments this will work fine as-is. Perhaps we can also keep the locks but only use Locker::issue_caps for the regular file case (removing the filelock).

I've made this change but I'm actively testing the old approach to see what caused the quiesce to not complete.

leonid-s-usov · 2024-04-13T22:13:03Z

That last change makes this less pretty, TBH. Also, please see Slack; I now have a different opinion on taking locks on non-auth:

> We cannot use an xlock on the filelock because it can only be acquired by the auth mds.

I believe the replica will update the lock states to LOCK_LOCK automatically as the auth takes rd/x locks on the caps-related locks, so the whole thing could be simpler: take only the quiesce lock on the replica and ignore the other locks. See my addition to the xlock PR.

leonid-s-usov · 2024-04-14T08:59:44Z

> I've made this change but I'm actively testing the old approach to see what caused the quiesce to not complete.

Could it be a policylock-related deadlock? If we have to take the policylock, we should probably call:

  bool xlock_policylock(const MDRequestRef& mdr, CInode *in,
			bool want_layout=false, bool xlock_snaplock=false);

batrick · 2024-04-15T14:49:57Z

I've updated this code as discussed in our meeting. The change anticipates the changes you're working on in #56755, Leonid.

github-actions · 2024-04-18T01:13:24Z

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

leonid-s-usov · 2024-06-01T05:27:03Z

src/mds/MDCache.cc

      // as a result of the auth taking the above locks.
+
+      /* kick cap issuance without waiting for auth */
+      if (quiesce_replica_recall && !in->is_auth()) {


`&& !in->is_auth()` is redundant; we are already in the else branch of the inverted test.

Right, this was just the state after a trivial rebase; see now.
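To illustrate the redundancy pointed out above, here is a minimal sketch of the branch shape. `FakeInode`, `Action`, and `choose_action` are hypothetical stand-ins, not names from the Ceph tree; only the `is_auth()` query and the `quiesce_replica_recall` flag come from the diff.

```cpp
#include <cassert>

// Hypothetical stand-in for CInode; only the is_auth() query is modeled.
struct FakeInode {
  bool auth;
  bool is_auth() const { return auth; }
};

// Hypothetical outcome labels, used only to make the branches observable.
enum class Action { TakeAuthLocks, ReplicaRecall, None };

// Hypothetical helper mirroring the if/else shape under review.
Action choose_action(const FakeInode& in, bool quiesce_replica_recall) {
  if (in.is_auth()) {
    return Action::TakeAuthLocks;
  } else {
    // is_auth() is already known to be false on this path, so an extra
    // "&& !in.is_auth()" in the condition below would add nothing.
    assert(!in.is_auth());
    if (quiesce_replica_recall) {
      return Action::ReplicaRecall;
    }
    return Action::None;
  }
}
```

In this sketch the replica path is selected by the flag alone; the auth check is implied by the enclosing `else`.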

leonid-s-usov · 2024-06-01T05:29:04Z

src/common/options/mds.yaml.in

  - mds
  flags:
  - runtime
+- name: mds_cache_quiesce_auth_recall


I don't think this one is used anywhere, is it?

leonid-s-usov

I'm not sure if it will make a difference; there should not be any W caps issued by the replicas, if I understand correctly. Otherwise, this looks good.

leonid-s-usov

Ok, this looks more to the point now. I still have some comments, see inline

src/common/options/mds.yaml.in

src/mds/MDCache.cc

batrick · 2024-06-03T18:04:25Z

Since we're still taking locks, this doesn't require adjusting tests. It's also off by default. I may like to use it for some performance measurements.

leonid-s-usov

Looks good.

src/mds/MDSRank.cc

src/mds/MDCache.h

Once the quiescelock is held, the replica can proceed with recalling caps so the auth can more rapidly acquire locks when it attempts to quiesce.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>

leonid-s-usov · 2024-06-04T08:29:05Z

@batrick Here we are calling issue caps before attempting the locks. How does this address those edge cases you mentioned, when locks may be caught in some transitory states by this caps revocation?

batrick · 2024-06-04T18:43:36Z

> @batrick Here we are calling issue caps before attempting the locks.

This is an interesting question. The only two ways to go that make sense to me are:

- Use caps recall instead of locks altogether (perhaps that should be a different config in this PR)
- Do caps recall before acquiring locks (to ensure we begin revoking all related caps immediately); that's what we're doing here.
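The two options above can be sketched as an ordering of steps selected by a config switch. This is only an illustration: `RecallMode`, `plan_quiesce_steps`, and the step labels are hypothetical and do not correspond to option names or functions in this PR, and the assumption that the quiescelock is taken first follows the commit message rather than the code.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical mode switch; the PR does not name these values.
enum class RecallMode { Off, BeforeLocks, InsteadOfLocks };

// Returns the ordered steps a quiesce of one inode would take under each
// mode. Step names are descriptive labels, not real MDS calls.
std::vector<std::string> plan_quiesce_steps(RecallMode mode) {
  // Assumption: the quiescelock is always taken first.
  std::vector<std::string> steps{"xlock quiescelock"};
  if (mode != RecallMode::Off) {
    // Both options start revoking caps right away.
    steps.push_back("recall caps");
  }
  if (mode != RecallMode::InsteadOfLocks) {
    // Only the "instead of locks" option skips the cap-related locks.
    steps.push_back("acquire cap-related locks");
  }
  assert(steps.size() >= 2);  // every mode does more than take the quiescelock
  return steps;
}
```

With both modes behind one switch that defaults to `Off`, the two variants could be compared in performance runs without changing default behavior.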

> How does this address those edge cases you mentioned, when locks may be caught in some transitory states by this caps revocation?

I'm not sure there is an unconsidered edge case. The code has changed significantly since I first posted this PR. The deadlock may be gone entirely. It may have been something else entirely. I didn't get to dig into it much due to lack of debugging.

What I would like is to get this PR in a state where we can test (performance of) both of the above two cases with config changes but those configs are off by default.

batrick · 2024-08-09T16:15:00Z

I think we'll forgo this. We already mask the allowed caps when the quiescelock is xlocked, so I don't think this really does anything useful.
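A rough sketch of what masking the allowed caps under an xlocked quiescelock could look like. The bit names, the `QUIESCE_ALLOWED` mask, and `effective_allowed` are all hypothetical; in particular, which caps remain issuable during quiesce is an assumption made for illustration, not something stated in this thread.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical capability bits; the real MDS uses its own cap encoding.
constexpr std::uint32_t CAP_READ  = 1u << 0;
constexpr std::uint32_t CAP_WRITE = 1u << 1;
constexpr std::uint32_t CAP_CACHE = 1u << 2;

// Hypothetical mask of caps that stay issuable while quiesced (an assumption).
constexpr std::uint32_t QUIESCE_ALLOWED = CAP_READ | CAP_CACHE;

// Computes the caps that may be issued, given what the ordinary locks allow
// and whether the quiescelock is currently xlocked.
std::uint32_t effective_allowed(std::uint32_t lock_allowed,
                                bool quiescelock_xlocked) {
  std::uint32_t result = lock_allowed;
  if (quiescelock_xlocked) {
    result &= QUIESCE_ALLOWED;  // strip anything quiesce does not permit
  }
  assert((result & ~lock_allowed) == 0);  // masking never adds caps
  return result;
}
```

If the allowed set is already narrowed this way whenever caps are issued, a separate explicit recall on the replica would have little left to revoke, which is consistent with the reasoning for closing the PR.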

batrick added the cephfs Ceph File System label Apr 12, 2024

batrick requested a review from leonid-s-usov April 12, 2024 22:29

batrick force-pushed the quiesce-no-locks branch from ab9fc21 to 04e11ae Compare April 12, 2024 23:15

leonid-s-usov suggested changes Apr 13, 2024

batrick force-pushed the quiesce-no-locks branch from 04e11ae to b1ea57a Compare April 13, 2024 16:00

batrick force-pushed the quiesce-no-locks branch from b1ea57a to 17a0cf8 Compare April 15, 2024 14:48

github-actions bot added the common label Apr 15, 2024

batrick force-pushed the quiesce-no-locks branch from 17a0cf8 to 46ce666 Compare April 15, 2024 14:48

batrick changed the title ~~mds: re-issue caps after quiescelock acquired~~ mds: add optimization for replica recall during quiesce Apr 15, 2024

github-actions bot added the needs-rebase label Apr 18, 2024

batrick force-pushed the quiesce-no-locks branch from 46ce666 to 2bfc4d4 Compare May 31, 2024 16:36

github-actions bot removed the needs-rebase label May 31, 2024

leonid-s-usov reviewed Jun 1, 2024

batrick force-pushed the quiesce-no-locks branch 2 times, most recently from 1f800ae to ab7d084 Compare June 3, 2024 15:51

leonid-s-usov suggested changes Jun 3, 2024

batrick force-pushed the quiesce-no-locks branch 2 times, most recently from 73a86e6 to ec69f0f Compare June 3, 2024 18:03

batrick marked this pull request as ready for review June 3, 2024 18:04

leonid-s-usov approved these changes Jun 3, 2024

batrick force-pushed the quiesce-no-locks branch from ec69f0f to a7952fd Compare June 3, 2024 19:57

mds: add optimization for replica recall during quiesce

cc91867

batrick force-pushed the quiesce-no-locks branch from a7952fd to cc91867 Compare June 3, 2024 19:58

batrick marked this pull request as draft June 28, 2024 20:56

batrick closed this Aug 9, 2024
