mds: getattr just waits the xlock to be released by the previous client #56602
Conversation
also, possibly update the PR title and the commit title to: mds: always make getattr wait for xlock to be released by the previous client
This looks good to me and fixes them all. Thanks @mchangir
leonid-s-usov
left a comment
I'd like to discuss this. The code doesn't look right, it's trying to hack the locking system. We don't want to add such code outside of the Locker, and even there it's smelly.
Using the projected inode in requests from the holder of the xlock is a feature: it means that the client gets back the recent changes even if they haven't been committed yet.
Going back to the ticket, I'm not sure we're fixing the issue in the correct place. If chmod in POSIX is a synchronous operation, then we should have waited for the change to be committed before we allowed another query to propagate to the MDS.
I haven't yet found out whether fsync is required after a chmod, but in any case, I doubt that we need to fix this on the MDS side.
leonid-s-usov
left a comment
I forgot to request changes. See my previous review comment
Then where should it be?
Yeah, correct, but this will only be readable by the client holding the xlock.
Yeah, this change is obviously doing that. This should be opaque to users, and we shouldn't fail the request by telling users that the MDS is not ready yet and that they should wait a while before retrying the query.
No, this isn't a must; we shouldn't ask users to do this because POSIX doesn't mention it. Even if we do … @leonid-s-usov Any better approach to resolve this?
gregsfortytwo
left a comment
Hmm I agree with Leonid here, we should definitely not be poking into the locking system.
Looking at the tracker ticket, @lxbsz notes that we are batching the getattr from two separate clients, one of which has the xlock (so it can see the projected inode) and the other client can't. I suspect this is the root cause of the issue — a non-xlocker should not read out-of-date inodes; the locking system should block that data access until the inode is stable.
This sounds to me like a bug with the batching implementation, which I have not investigated. :( But I suspect we need to adjust the batching system so that it doesn't batch operations from clients with different caps?
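To make the failure mode concrete, here is a self-contained toy model of the behavior described in the ticket (all types are stand-ins for illustration, not the real CInode):

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

using client_t = int64_t;

// Toy stand-ins, not the real MDS classes.
struct InodeState { int mode; };

struct Inode {
  InodeState stable{0644};
  std::optional<InodeState> projected; // pending, not yet committed
  std::optional<client_t> xlocker;     // client whose update is in flight

  // The xlock holder is answered from the projected inode so it can see
  // its own uncommitted chmod; everyone else sees the stable inode.
  int mode_seen_by(client_t c) const {
    return (projected && xlocker == c) ? projected->mode : stable.mode;
  }
};

int main() {
  Inode in;
  in.projected = InodeState{0777}; // client 1's chmod, still in flight
  in.xlocker = 1;

  assert(in.mode_seen_by(1) == 0777); // xlocker: projected value
  assert(in.mode_seen_by(2) == 0644); // other client: stable value
  // If these two getattrs get batched together, a single reply built from
  // one client's view serves both — the inconsistency in issue 63906.
  return 0;
}
```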
If this isn't something caused by batching, then it is very scary. You can see down at https://github.com/ceph/ceph/pull/56602/files#diff-277dc6e796ccecb6aa14c9357f7a86898d6ddcf4113c110a4e00e0a10f6fefa7R4189 where we require the rdlock on the authlock, and I believe that should prevent exactly the read-stable-while-there-is-a-projected-inode issue described in the ticket?
Done, thanks @mchangir
It already does this: only requests having the same mask will be batched. Maybe we can just avoid batching non-xlocker requests together with the xlocked one.
@batrick there's a FIXME in handle_client_getattr stemming from #27866 — do you have any idea what's going on there? I don't really understand how what @lxbsz described in https://tracker.ceph.com/issues/63906#note-9 is possible — you can clearly see … So how can we possibly be returning data while there's a projected inode outstanding?
Oh, yes, I said "caps" but I suppose you can still have an xlock assigned to you without different issued caps. This is what I meant.
I just made a mistake and misled you in our …, so here we need to make sure …
I think this is because once the …
Once the …
Updated the patch and the new change will avoid batching the ops when any of the xlocks is held.
I think it all starts in the … The comment suggests that we should somehow expose projected changes. Now, the … and that field is what's causing the … That check doesn't comply with the comment in …
I think the comment is misleading and it should be exposing uncommitted changes between …
The client that holds the xlock will be able to read (projected) immediately in the XLOCKDONE state, while the others will have to wait. But both will return the new projected value. You are right, @lxbsz, preventing the batch should help. But I'm still not sure about the …
src/mds/Server.cc
Outdated
```cpp
if (((mask & CEPH_CAP_LINK_SHARED) && (in->linklock.is_xlocked())) ||
    ((mask & CEPH_CAP_AUTH_SHARED) && (in->authlock.is_xlocked())) ||
    ((mask & CEPH_CAP_XATTR_SHARED) && (in->xattrlock.is_xlocked())) ||
    ((mask & CEPH_CAP_FILE_SHARED) && (in->filelock.is_xlocked()))) {
```
We can reduce impact by making sure that the xlocker is never batched. This will allow batching multiple non-xlocker clients even if some of the xlocks are held, which should be safe as long as the xlocker isn't the batch head.
```diff
-if (((mask & CEPH_CAP_LINK_SHARED) && (in->linklock.is_xlocked())) ||
-    ((mask & CEPH_CAP_AUTH_SHARED) && (in->authlock.is_xlocked())) ||
-    ((mask & CEPH_CAP_XATTR_SHARED) && (in->xattrlock.is_xlocked())) ||
-    ((mask & CEPH_CAP_FILE_SHARED) && (in->filelock.is_xlocked()))) {
+if (((mask & CEPH_CAP_LINK_SHARED) && (in->linklock.is_xlocked_by_client(client))) ||
+    ((mask & CEPH_CAP_AUTH_SHARED) && (in->authlock.is_xlocked_by_client(client))) ||
+    ((mask & CEPH_CAP_XATTR_SHARED) && (in->xattrlock.is_xlocked_by_client(client))) ||
+    ((mask & CEPH_CAP_FILE_SHARED) && (in->filelock.is_xlocked_by_client(client)))) {
```
Yeah, this will be better.
Fixed it and thanks @leonid-s-usov @gregsfortytwo
Wait, please double check, my patch above is probably wrong, but the idea stays: don't batch when the xlocker client is the head.
This is fine. Requests from the xlocker client won't be batched and won't even be set as batch heads.
With this change all the non-xlocker client requests can be batched, and the batch head will acquire the rdlock and then wait.
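For illustration, a self-contained toy walk-through of the resulting behavior (stand-in types again; may_batch is a hypothetical distillation of the merged check, not the actual Server.cc code):

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

using client_t = int64_t;

// Stand-in for SimpleLock, just enough for the check.
struct Lock {
  std::optional<client_t> xlocker;
  bool is_xlocked_by_client(client_t c) const { return xlocker == c; }
};
struct Inode { Lock authlock; };

// Distilled form of the merged guard: only the xlocker itself is kept
// out of batching; everyone else may batch and simply waits.
bool may_batch(const Inode &in, client_t c) {
  return !in.authlock.is_xlocked_by_client(c);
}

int main() {
  Inode in;
  in.authlock.xlocker = 1;   // client 1's setattr holds the xlock

  assert(!may_batch(in, 1)); // client 1: handled alone, sees projected
  assert(may_batch(in, 2));  // client 2: may become the batch head...
  assert(may_batch(in, 3));  // ...and client 3 joins it; the head blocks
                             // in acquire_locks() until the xlock drops.
  return 0;
}
```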
mds: always make getattr wait for xlock to be released by the previous client

When the previous client's setattr request is still holding the xlock for the linklock/authlock/xattrlock/filelock locks, if the same client sends a getattr request it will use the projected inode to fill the reply, while for other clients the getattr requests will use the non-projected inode to fill replies. This causes an inconsistent file mode across multiple clients.

This will just skip batching the ops when any of the xlocks is held.

Fixes: https://tracker.ceph.com/issues/63906
Signed-off-by: Xiubo Li <xiubli@redhat.com>
gregsfortytwo
left a comment
We are rapidly approaching the point where I want us to move the batching into a proper interface instead of open-coding it, but this LGTM for now. :)
This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
jenkins test api
batrick
left a comment
Yet another example where synthetic cap requests would be really handy for testing.
```cpp
    ((mask & CEPH_CAP_AUTH_SHARED) && (in->authlock.is_xlocked_by_client(client))) ||
    ((mask & CEPH_CAP_XATTR_SHARED) && (in->xattrlock.is_xlocked_by_client(client))) ||
    ((mask & CEPH_CAP_FILE_SHARED) && (in->filelock.is_xlocked_by_client(client)))) {
  r = -1;
```
There is another bug I think. This won't handle the case where we're skipping e.g. a rdlock when the client is issued an exclusive cap, like Ax. Consider:
client 1: issued pAx
client 2: getattr pAsLsXsFs issued pLxXx
client 3: getattr pAsLsXsFs issued p
client 2 will skip acquiring the linklock/xattrlock because it's issued LxXx. It will block on recall of Ax from client 1.
Client 2's getattr cannot be a batch head for client 3's getattr.
This all exposes a problem I believe exists with where we're constructing the batch head: it should be constructed only after this acquire_locks fails:
(Server.cc lines 4217 to 4218 at 2c16096)
Before that point, record whether we've skipped any locks due to issued caps or (for this particular bug) because the lock is already xlocked by the client. The latter cannot be easily checked without looking at the inode's lock, because the Locker state machine hides why a rdlock succeeds (in this case, the client already has an xlock).
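A rough, self-contained sketch of the reordering being proposed (all names and types here are hypothetical stand-ins for illustration, not the real Server::handle_client_getattr):

```cpp
#include <vector>

// Hypothetical stand-ins for the MDS types and helpers.
struct MDRequest {};
struct LockAttempt {
  bool acquired_all = false;
  bool skipped_cap_or_xlock = false; // a rdlock "succeeded" only because of
                                     // an issued exclusive cap or our own xlock
};
struct Batch { std::vector<MDRequest*> queue; };

LockAttempt try_acquire_locks(MDRequest&) { return {}; } // placeholder
Batch *find_batch_head() { return nullptr; }             // placeholder
void reply_from_inode(MDRequest&) {}                     // placeholder
void drop_locks(MDRequest&) {}                           // placeholder
void make_batch_head(MDRequest&) {}                      // placeholder

void handle_getattr_sketch(MDRequest &mdr) {
  LockAttempt a = try_acquire_locks(mdr);
  if (a.acquired_all) {
    reply_from_inode(mdr); // fast path: nothing to wait for, no batching
    return;
  }
  // Do the batching bookkeeping only after the lock acquisition failed.
  if (Batch *head = find_batch_head()) {
    drop_locks(mdr);
    head->queue.push_back(&mdr); // piggyback on the existing head
  } else if (!a.skipped_cap_or_xlock) {
    make_batch_head(mdr); // safe head: not an xlocker/exclusive-caps client
  }
}
```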
> There is another bug I think. This won't handle the case where we're skipping e.g. a rdlock when the client is issued an exclusive cap, like Ax. Consider:
> client 1: issued pAx
> client 2: getattr pAsLsXsFs issued pLxXx
> client 3: getattr pAsLsXsFs issued p
@batrick Correct me if I am wrong here.
From mds/lock.c I can see that only the loner client could get the x caps. Could you point out in which case we could see the x in different locks being issued to different clients at the same time?
Different locks could have different loners at the same time?
> > There is another bug I think. This won't handle the case where we're skipping e.g. a rdlock when the client is issued an exclusive cap, like Ax. Consider:
> > client 1: issued pAx
> > client 2: getattr pAsLsXsFs issued pLxXx
> > client 3: getattr pAsLsXsFs issued p
>
> @batrick Correct me if I am wrong here.
> From mds/lock.c I can see that only the loner client could get the x caps. Could you point out in which case we could see the x in different locks being issued to different clients at the same time?
You might be right; I'm not sure. In principle I don't see why it wouldn't be allowed but the state diagram suggests it is not.
> Different locks could have different loners at the same time?
It doesn't look like it but it's worth checking.
I think the batch leader construction should still be moved, however. And that "detail" of the loner client shouldn't be relied on for the checks in any case.
> You might be right; I'm not sure. In principle I don't see why it wouldn't be allowed but the state diagram suggests it is not.
>
> It doesn't look like it but it's worth checking.
I can't remember ever seeing this case; just now I checked some debug logs, such as the ones from:
ceph-post-file: fb9a96f9-5f6d-46b4-b1fa-8580928b2241
I didn't find any case that does this. And also, going through the MDS code, the EXCL lock state can only be set once a loner has been successfully set, and all the locks in a CInode will be set to the same single loner.
Then perhaps the better check is: "is this client the loner for the CInode? In that case, do not make it a batch head".
Let's move the batch code below this call to acquire_locks on failure:
(Server.cc lines 4217 to 4218 at 2c16096)
If we fail to acquire the locks, then make it the batch head if one does not exist. If a batch head does exist already, then drop locks and add it to the batch queue.
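A minimal sketch of that loner check (a toy stand-in modeled on CInode, which tracks a single loner client via get_loner(); the guard itself is the proposal, not merged code):

```cpp
#include <cassert>
#include <cstdint>

using client_t = int64_t;

// Toy stand-in; modeled on CInode, which tracks a single loner client.
struct Inode {
  client_t loner = -1; // -1: no loner set
  client_t get_loner() const { return loner; }
};

// Proposed guard: the loner is the only client that can be issued the
// exclusive caps (and thus see projected state), so never let it become
// a batch head for other clients' getattrs.
bool may_head_batch(const Inode &in, client_t client) {
  return in.get_loner() != client;
}

int main() {
  Inode in;
  in.loner = 1;
  assert(!may_head_batch(in, 1)); // loner: handled alone
  assert(may_head_batch(in, 2));  // others: may head/join a batch
  return 0;
}
```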
> Let's move the batch code below this call to acquire_locks on failure:
> (Server.cc lines 4217 to 4218 at 2c16096)
> If we fail to acquire the locks, then make it the batch head if one does not exist. If a batch head does exist already, then drop locks and add it to the batch queue.
Sure.
I checked the code carefully again; if my understanding is correct, this won't work as expected?
For example, the first lookup request just wants to acquire the rdlock for the linklock and it succeeds; then later a second lookup request comes and also just wants the rdlock for the linklock. If both of these requests succeed there won't be any chance to batch them. Actually we should batch them, right?
There are no further waits after that acquire_locks so I don't think so?
Yeah, I think you're right here. Let me check it more.
@batrick If we move the batch code after acquire_locks, we also need to adjust acquire_locks and the other callers to make sure they won't add the current request to any waiter (which would retry the request later) and then try to batch this request.
Hm, that's right. Let's leave it this way then.
This PR is under test in https://tracker.ceph.com/issues/66850.
This PR is under test in https://tracker.ceph.com/issues/67089.
I'm seeing some new failures in the branch which this PR is a part of. Trying to isolate the problematic change. Will update when done.
This PR is under test in https://tracker.ceph.com/issues/67318.
* refs/pull/56602/head:
  mds: always make getattr wait for xlock to be released by the previous client

Reviewed-by: Leonid Usov <leonid.usov@ibm.com>
Reviewed-by: Greg Farnum <gfarnum@redhat.com>

* refs/pull/56602/head:
  mds: always make getattr wait for xlock to be released by the previous client

Reviewed-by: Greg Farnum <gfarnum@redhat.com>
Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Leonid Usov <leonid.usov@ibm.com>
When the previous client's setattr request is still holding the xlock for the linklock/authlock/xattrlock/filelock locks, a getattr request from the same client will use the projected inode to fill the reply, while the getattr requests from other clients will use the non-projected inode to fill their replies. This causes an inconsistent file mode across multiple clients.
This will just let the getattr wait until the previous client releases the xlock.
Fixes: https://tracker.ceph.com/issues/63906