Bug #64977
mds spinlock due to lock contention leading to memory exhaustion
Status: Closed
Description
We have started to observe an MDS spinlock issue on two CephFS clusters running 16.2.9.
On both clusters, the offending pattern originates from backup jobs (done via restic).
The backup clients seem to trigger a stuck operation on the MDS side, which manifests as "failed to rdlock".
The MDS memory balloons (all allocated into buffer_anon) at a rate of ~1 GB per second until it is OOM-killed.
The stuck operation manifests as the MDS heavily looping on:
2024-03-15T22:27:24.125+0100 7f062bcf7700 7 mds.0.locker sending MClientCaps to client.372200149 seq 103681947 new pending pAsxLsXsxFsxcrwb was pAsLsXsxFsxcrwb
2024-03-15T22:27:24.125+0100 7f062bcf7700 7 mds.0.server dispatch_client_request client_request(client.372777796:588436 lookup #0x1012524ee02/414769741_1-3f4132de_320.tmp 2024-03-15T21:53:54.681734+0100 caller_uid=0, caller_gid=0{0,}) v4
...
2024-03-15T22:27:24.125+0100 7f062bcf7700 7 mds.0.locker sending MClientCaps to client.372200149 seq 103681948 new pending pAsLsXsxFsxcrwb was pAsxLsXsxFsxcrwb
client.372200149 is the legit client, serving the application.
client.372777796 is the backup client.
During the heavy loop triggered by lock contention, the MDS struggles to serve other requests, log segments are not trimmed, and its socket may become unresponsive.
Evicting the backup client that fails to acquire the lock releases pressure on the MDS, and the cluster goes back to HEALTH_OK. If the backup client comes back (we do not blocklist on eviction), then history repeats.
Evicting all clients with a lock on the file seems to resolve the situation.
The issue seems to have started with an upgrade of clients to 6.5.5-200.fc38.
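A quick way to confirm the loop from a captured MDS log is to look for consecutive "sending MClientCaps ... new pending X was Y" lines that exactly undo each other. A minimal sketch (the regex mirrors the log lines quoted above; the function name and threshold are ours, not part of Ceph):

```python
import re
from collections import defaultdict

# Matches the locker lines seen during the loop, e.g.:
#   ... mds.0.locker sending MClientCaps to client.372200149 seq 103681947
#       new pending pAsxLsXsxFsxcrwb was pAsLsXsxFsxcrwb
CAPS_RE = re.compile(
    r"sending MClientCaps to (client\.\d+) seq (\d+) "
    r"new pending (\S+) was (\S+)"
)

def find_cap_flaps(lines, threshold=2):
    """Return clients whose pending caps alternate back and forth
    (A -> B immediately followed by B -> A) at least `threshold` times."""
    flips = defaultdict(int)
    last = {}  # client -> (pending, was) of the previous MClientCaps line
    for line in lines:
        m = CAPS_RE.search(line)
        if not m:
            continue
        client, _seq, pending, was = m.groups()
        prev = last.get(client)
        # A flap: this message exactly undoes the previous one.
        if prev and prev == (was, pending):
            flips[client] += 1
        last[client] = (pending, was)
    return {c: n for c, n in flips.items() if n >= threshold}
```

Run over the two log lines above with threshold=1, this flags client.372200149; on a healthy log the flip counts stay near zero.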
Updated by Enrico Bocchi about 2 years ago
Looking further into this revealed that:
- All the clients triggering the issue are kernel clients running 6.5.5-200.fc38.
- We have never seen this happening with the previous kernel client -- 6.4.15.
We diff'ed the kernel sources.
There appear to be changes in
- caps.c, brought in by https://tracker.ceph.com/issues/61332
- dir.c also changed with https://tracker.ceph.com/issues/61584, which we believe causes even more contention due to mtime propagation to parent directory (kernel PR: https://github.com/ceph/ceph-client/commit/d9d00f71ab5a2b5a47b228f678a8817e8687387f)
The new kernel client also incorporates a change in VFS: https://github.com/ceph/ceph-client/commit/3e3271549670783be20e233a2b78a87a0b04c715:
WRAP_DIR_ITER(ceph_readdir) // FIXME!
const struct file_operations ceph_dir_fops = {
.read = ceph_read_dir,
.iterate = ceph_readdir,
.iterate_shared = shared_ceph_readdir,
.llseek = ceph_dir_llseek,
.open = ceph_open,
.release = ceph_release,
Could the MDS spin-lock issue be related to any of the above?
Updated by Enrico Bocchi about 2 years ago
There's also a potential cap sequence mismatch issue tracked at https://tracker.ceph.com/issues/61782
The fix for that did not make it to the kernel client currently in use.
Updated by Enrico Bocchi almost 2 years ago
Found a thread on list.ceph.io [1] reporting the same problem we see, following an upgrade to the 6.5 kernel client.
The MDS allocates memory, and bandwidth usage with the offending clients increases (likely due to the messenger?)
From the description in the kernel:
> Concurrency-managed per-cpu work items that hog CPU for longer than
> wq_cpu_intensive_thresh_us trigger the automatic CPU_INTENSIVE mechanism,
> which prevents them from stalling other concurrency-managed work items. If a
> work function keeps triggering this mechanism, it's likely that the work item
> should be using an unbound workqueue instead.
Is this slowing down the process of acquiring/returning caps to the MDS?
Updated by Xiubo Li almost 2 years ago
Hi Enrico,
Do you have more mds-side logs?
Thanks.
Updated by Xiubo Li almost 2 years ago
Enrico Bocchi wrote:
We have started to observe an MDS spinlock issue on two CephFS clusters running 16.2.9.
On both clusters, the offending pattern generates from backup jobs (done via restic).
The backup clients seem to trigger a stuck operation on the MDS side, which manifests as "failed to rdlock".
The MDS memory balloons (all allocated into buffer_anon) at a rate of ~1 GB per second until it is OOM-killed.
The stuck operation manifests as the MDS heavily looping on:
[...]
client.372200149 is the legit client, serving the application.
client.372777796 is the backup client.
During the heavy loop triggered by lock contention, the MDS struggles to serve other requests, log segments are not trimmed, and its socket may become unresponsive.
Evicting the backup client that fails to acquire the lock releases pressure on the MDS, and the cluster goes back to HEALTH_OK. If the backup client comes back (we do not blocklist on eviction), then history repeats.
Evicting all clients with a lock on the file seems to resolve the situation.
The issue seems to have started with an upgrade of clients to 6.5.5-200.fc38.
2024-03-15T22:27:24.125+0100 7f062bcf7700 7 mds.0.locker sending MClientCaps to client.372200149 seq 103681947 new pending pAsxLsXsxFsxcrwb was pAsLsXsxFsxcrwb
2024-03-15T22:27:24.125+0100 7f062bcf7700 7 mds.0.server dispatch_client_request client_request(client.372777796:588436 lookup #0x1012524ee02/414769741_1-3f4132de_320.tmp 2024-03-15T21:53:54.681734+0100 caller_uid=0, caller_gid=0{0,}) v4
...
2024-03-15T22:27:24.125+0100 7f062bcf7700 7 mds.0.locker sending MClientCaps to client.372200149 seq 103681948 new pending pAsLsXsxFsxcrwb was pAsxLsXsxFsxcrwb
From the above logs we can see that client.372200149 had just been issued the Ax caps, but then a lookup request from client.372777796 came in, which may have triggered revocation of the Ax caps, and it then entered a dead loop for some reason.
Please provide more mds debug logs with debug_mds = 25 and debug_ms = 1 enabled.
And if possible please also provide the kclient debug logs.
Thanks
- Xiubo
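For reference, the requested debug levels can be raised at runtime without restarting the daemon. A sketch (daemon targeting and the restore values are assumptions; level 25 is extremely verbose, so revert as soon as the logs are captured):

```shell
# Raise MDS debug verbosity cluster-wide (or target one daemon with
# `ceph tell mds.<name> config set ...`).
ceph config set mds debug_mds 25
ceph config set mds debug_ms 1
# ... reproduce the issue and capture the logs, then restore the defaults:
ceph config set mds debug_mds "1/5"
ceph config set mds debug_ms 0
```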
Updated by Xiubo Li almost 2 years ago
- Project changed from Ceph to CephFS
- Assignee set to Xiubo Li
Updated by Xiubo Li almost 2 years ago
There are two cases that will send the lookup request, and in the dentry revalidation case it may retry infinitely. I am still going through the code, but with the logs it will be much easier to find the root cause.
Updated by Xiubo Li almost 2 years ago
- Status changed from New to Need More Info
Updated by Abhishek Lekshmanan almost 2 years ago · Edited
- File cap_revoke.png cap_revoke.png added
We see this pattern during the incidents: both cap revocations and grants spike, and then the node consumes memory until it is OOM-killed (see attached image).
On the client side there are usually two clients involved: one restic backup client, which is mounted read-only, and usually an active client, e.g. a build job. In the debug logs we see a pattern like the following:
2024-04-04T17:50:56.825+0200 7f16679d3700 7 mds.0.server dispatch_client_request client_request(client.379194623:32785 lookup #0x1003f0c6738/cgisess_024c841a02411969569b80c0672798c9 2024-04-04T17:49:38.165044+0200 caller_uid=0, caller_gid=0{0,}) v4
2024-04-04T17:50:56.825+0200 7f16679d3700 10 mds.0.server rdlock_path_pin_ref request(client.379194623:32785 nref=3 cr=0x563c4779bb80) #0x1003f0c6738/cgisess_024c841a02411969569b80c0672798c9
2024-04-04T17:50:56.825+0200 7f16679d3700 10 mds.0.locker acquire_locks request(client.379194623:32785 nref=3 cr=0x563c4779bb80)
2024-04-04T17:50:56.825+0200 7f16679d3700 20 mds.0.locker must rdlock (iauth excl) [inode 0x1012cb88597 [2,head] /volumes/_nogroup/d7e4525a-d914-4061-8fd7-be14d533d013/foswiki/working/tmp/cgisess_024c841a02411969569b80c0672798c9 auth v4441767 ap=1 s=177 n(v0 rc2024-04-03T15:10:05.039419+0200 b177 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={372831873=0-4194304@1} caps={372831873=pAsxLsXsxFsxcrwb/pAsxXsxFsxcrwb@289985},l=372831873 | ptrwaiter=0 request=1 lock=1 caps=1 dirtyparent=0 dirty=0 waiter=0 authpin=1 0x563ac163c100]
2024-04-04T17:50:56.825+0200 7f16679d3700 20 mds.0.locker must rdlock (ixattr excl) [inode 0x1012cb88597 [2,head] /volumes/_nogroup/d7e4525a-d914-4061-8fd7-be14d533d013/foswiki/working/tmp/cgisess_024c841a02411969569b80c0672798c9 auth v4441767 ap=1 s=177 n(v0 rc2024-04-03T15:10:05.039419+0200 b177 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={372831873=0-4194304@1} caps={372831873=pAsxLsXsxFsxcrwb/pAsxXsxFsxcrwb@289985},l=372831873 | ptrwaiter=0 request=1 lock=1 caps=1 dirtyparent=0 dirty=0 waiter=0 authpin=1 0x563ac163c100]
2024-04-04T17:50:56.825+0200 7f16679d3700 10 mds.0.locker must authpin [inode 0x1012cb88597 [2,head] /volumes/_nogroup/d7e4525a-d914-4061-8fd7-be14d533d013/foswiki/working/tmp/cgisess_024c841a02411969569b80c0672798c9 auth v4441767 ap=1 s=177 n(v0 rc2024-04-03T15:10:05.039419+0200 b177 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={372831873=0-4194304@1} caps={372831873=pAsxLsXsxFsxcrwb/pAsxXsxFsxcrwb@289985},l=372831873 | ptrwaiter=0 request=1 lock=1 caps=1 dirtyparent=0 dirty=0 waiter=0 authpin=1 0x563ac163c100]
2024-04-04T17:50:56.825+0200 7f16679d3700 10 mds.0.locker already auth_pinned [inode 0x1012cb88597 [2,head] /volumes/_nogroup/d7e4525a-d914-4061-8fd7-be14d533d013/foswiki/working/tmp/cgisess_024c841a02411969569b80c0672798c9 auth v4441767 ap=1 s=177 n(v0 rc2024-04-03T15:10:05.039419+0200 b177 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={372831873=0-4194304@1} caps={372831873=pAsxLsXsxFsxcrwb/pAsxXsxFsxcrwb@289985},l=372831873 | ptrwaiter=0 request=1 lock=1 caps=1 dirtyparent=0 dirty=0 waiter=0 authpin=1 0x563ac163c100]
2024-04-04T17:50:56.825+0200 7f16679d3700 7 mds.0.locker rdlock_start on (iauth excl) on [inode 0x1012cb88597 [2,head] /volumes/_nogroup/d7e4525a-d914-4061-8fd7-be14d533d013/foswiki/working/tmp/cgisess_024c841a02411969569b80c0672798c9 auth v4441767 ap=1 s=177 n(v0 rc2024-04-03T15:10:05.039419+0200 b177 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={372831873=0-4194304@1} caps={372831873=pAsxLsXsxFsxcrwb/pAsxXsxFsxcrwb@289985},l=372831873 | ptrwaiter=0 request=1 lock=1 caps=1 dirtyparent=0 dirty=0 waiter=0 authpin=1 0x563ac163c100]
2024-04-04T17:50:56.825+0200 7f16679d3700 7 mds.0.locker simple_sync on (iauth excl) on [inode 0x1012cb88597 [2,head] /volumes/_nogroup/d7e4525a-d914-4061-8fd7-be14d533d013/foswiki/working/tmp/cgisess_024c841a02411969569b80c0672798c9 auth v4441767 ap=1 s=177 n(v0 rc2024-04-03T15:10:05.039419+0200 b177 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={372831873=0-4194304@1} caps={372831873=pAsxLsXsxFsxcrwb/pAsxXsxFsxcrwb@289985},l=372831873 | ptrwaiter=0 request=1 lock=1 caps=1 dirtyparent=0 dirty=0 waiter=0 authpin=1 0x563ac163c100]
2024-04-04T17:50:56.825+0200 7f16679d3700 7 mds.0.locker issue_caps loner client.372831873 allowed=pAsLsXsxFsxcrwb, xlocker allowed=pAsLsXsxFsxcrwb, others allowed=pLs on [inode 0x1012cb88597 [2,head] /volumes/_nogroup/d7e4525a-d914-4061-8fd7-be14d533d013/foswiki/working/tmp/cgisess_024c841a02411969569b80c0672798c9 auth v4441767 ap=1 s=177 n(v0 rc2024-04-03T15:10:05.039419+0200 b177 1=1+0) (iauth excl->sync) (ifile excl) (ixattr excl) (iversion lock) cr={372831873=0-4194304@1} caps={372831873=pAsxLsXsxFsxcrwb/pAsxXsxFsxcrwb@289985},l=372831873 | ptrwaiter=0 request=1 lock=1 caps=1 dirtyparent=0 dirty=0 waiter=0 authpin=1 0x563ac163c100]
2024-04-04T17:50:56.825+0200 7f16679d3700 20 mds.0.locker client.372831873 pending pAsxLsXsxFsxcrwb allowed pAsLsXsxFsxcrwb wanted pAsxXsxFsxcrwb
2024-04-04T17:50:56.825+0200 7f16679d3700 7 mds.0.locker sending MClientCaps to client.372831873 seq 289986 new pending pAsLsXsxFsxcrwb was pAsxLsXsxFsxcrwb
2024-04-04T17:50:56.825+0200 7f16679d3700 20 mds.0.cache.ino(0x1012cb88597) encode_cap_message pfile 1 pauth 0 plink 0 pxattr 0 ctime 2024-04-03T15:10:05.039419+0200
2024-04-04T17:50:56.825+0200 7f16679d3700 10 mds.0.27169 send_message_client_counted client.372831873 seq 738657 client_caps(revoke ino 0x1012cb88597 1 seq 289986 caps=pAsLsXsxFsxcrwb dirty=- wanted=pAsxXsxFsxcrwb follows 0 size 177/4194304 ts 1/18446744073709551615 mtime 2024-04-03T15:10:05.039419+0200 tws 1) v11
Next, this continues as a grant sequence:
2024-04-04T17:50:56.825+0200 7f16679d3700 10 mds.0.cache.ino(0x1012cb88597) auth_pin by 0x563ac163c390 on [inode 0x1012cb88597 [2,head] /volumes/_nogroup/d7e4525a-d914-4061-8fd7-be14d533d013/foswiki/working/tmp/cgisess_024c841a02411969569b80c0672798c9 auth v4441767 ap=2 s=177 n(v0 rc2024-04-03T15:10:05.039419+0200 b177 1=1+0) (iauth excl->sync) (ifile excl) (ixattr excl) (iversion lock) cr={372831873=0-4194304@1} caps={372831873=pAsLsXsxFsxcrwb/pAsxLsXsxFsxcrwb/pAsxXsxFsxcrwb@289986},l=372831873 | ptrwaiter=0 request=1 lock=1 caps=1 dirtyparent=0 dirty=0 waiter=0 authpin=1 0x563ac163c100] now 2
2024-04-04T17:50:56.825+0200 7f16679d3700 15 mds.0.cache.dir(0x1003f0c6738) adjust_nested_auth_pins 1 on [dir 0x1003f0c6738 /volumes/_nogroup/d7e4525a-d914-4061-8fd7-be14d533d013/foswiki/working/tmp/ [2,head] auth v=4446576 cv=4446576/4446576 ap=0+3 state=1074003969|complete f(v0 m2024-04-04T17:23:55.068160+0200 4310=4310+0) n(v42 rc2024-04-04T17:47:12.708372+0200 b1115214 4310=4310+0) hs=4310+0,ss=0+0 | child=1 dirty=0 waiter=0 authpin=0 0x563645696000] by 0x563ac163c100 count now 0/3
2024-04-04T17:50:56.825+0200 7f16679d3700 7 mds.0.locker rdlock_start waiting on (iauth excl->sync) on [inode 0x1012cb88597 [2,head] /volumes/_nogroup/d7e4525a-d914-4061-8fd7-be14d533d013/foswiki/working/tmp/cgisess_024c841a02411969569b80c0672798c9 auth v4441767 ap=2 s=177 n(v0 rc2024-04-03T15:10:05.039419+0200 b177 1=1+0) (iauth excl->sync) (ifile excl) (ixattr excl) (iversion lock) cr={372831873=0-4194304@1} caps={372831873=pAsLsXsxFsxcrwb/pAsxLsXsxFsxcrwb/pAsxXsxFsxcrwb@289986},l=372831873 | ptrwaiter=0 request=1 lock=1 caps=1 dirtyparent=0 dirty=0 waiter=0 authpin=1 0x563ac163c100]
2024-04-04T17:50:56.825+0200 7f16679d3700 10 mds.0.cache.ino(0x1012cb88597) add_waiter tag 2000000000040000 0x56367bc33160 !ambig 1 !frozen 1 !freezing 1
2024-04-04T17:50:56.825+0200 7f16679d3700 15 mds.0.cache.ino(0x1012cb88597) taking waiter here
2024-04-04T17:50:56.825+0200 7f16679d3700 10 mds.0.locker nudge_log (iauth excl->sync) on [inode 0x1012cb88597 [2,head] /volumes/_nogroup/d7e4525a-d914-4061-8fd7-be14d533d013/foswiki/working/tmp/cgisess_024c841a02411969569b80c0672798c9 auth v4441767 ap=2 s=177 n(v0 rc2024-04-03T15:10:05.039419+0200 b177 1=1+0) (iauth excl->sync) (ifile excl) (ixattr excl) (iversion lock) cr={372831873=0-4194304@1} caps={372831873=pAsLsXsxFsxcrwb/pAsxLsXsxFsxcrwb/pAsxXsxFsxcrwb@289986},l=372831873 | ptrwaiter=0 request=1 lock=1 caps=1 dirtyparent=0 dirty=0 waiter=1 authpin=1 0x563ac163c100]
2024-04-04T17:50:56.825+0200 7f16679d3700 7 mds.0.locker _do_cap_release for client.372831873 on [inode 0x1012cb88597 [2,head] /volumes/_nogroup/d7e4525a-d914-4061-8fd7-be14d533d013/foswiki/working/tmp/cgisess_024c841a02411969569b80c0672798c9 auth v4441767 ap=2 s=177 n(v0 rc2024-04-03T15:10:05.039419+0200 b177 1=1+0) (iauth excl->sync) (ifile excl) (ixattr excl) (iversion lock) cr={372831873=0-4194304@1} caps={372831873=pAsLsXsxFsxcrwb/pAsxLsXsxFsxcrwb/pAsxXsxFsxcrwb@289986},l=372831873 | ptrwaiter=0 request=1 lock=1 caps=1 dirtyparent=0 dirty=0 waiter=1 authpin=1 0x563ac163c100]
2024-04-04T17:50:56.825+0200 7f16679d3700 7 mds.0.locker issue_seq 145038 != 1
2024-04-04T17:50:56.825+0200 7f16679d3700 10 mds.0.locker eval_gather (iauth excl->sync) on [inode 0x1012cb88597 [2,head] /volumes/_nogroup/d7e4525a-d914-4061-8fd7-be14d533d013/foswiki/working/tmp/cgisess_024
=pAsLsXsxFsxcrwb/pAsxLsXsxFsxcrwb/pAsxXsxFsxcrwb@289986},l=372831873 | ptrwaiter=0 request=1 lock=1 caps=1 dirtyparent=0 dirty=0 waiter=1 authpin=1 0x563ac163c100]
2024-04-04T17:50:56.825+0200 7f16679d3700 7 mds.0.locker _do_cap_release for client.372831873 on [inode 0x1012cb88597 [2,head] /volumes/_nogroup/d7e4525a-d914-4061-8fd7-be14d533d013/foswiki/working/tmp/cgisess_024c841a02411969569b80c0672798c9 auth v4441767 ap=2 s=177 n(v0 rc2024-04-03T15:10:05.039419+0200 b177 1=1+0) (iauth excl->sync) (ifile excl) (ixattr excl) (iversion lock) cr={372831873=0-4194304@1} caps={372831873=pAsLsXsxFsxcrwb/pAsxLsXsxFsxcrwb/pAsxXsxFsxcrwb@289986},l=372831873 | ptrwaiter=0 request=1 lock=1 caps=1 dirtyparent=0 dirty=0 waiter=1 authpin=1 0x563ac163c100]
2024-04-04T17:50:56.825+0200 7f16679d3700 7 mds.0.locker issue_seq 145038 != 1
2024-04-04T17:50:56.825+0200 7f16679d3700 10 mds.0.locker eval_gather (iauth excl->sync) on [inode 0x1012cb88597 [2,head] /volumes/_nogroup/d7e4525a-d914-4061-8fd7-be14d533d013/foswiki/working/tmp/cgisess_024c841a02411969569b80c0672798c9 auth v4441767 ap=2 s=177 n(v0 rc2024-04-03T15:10:05.039419+0200 b177 1=1+0) (iauth excl->sync) (ifile excl) (ixattr excl) (iversion lock) cr={372831873=0-4194304@1} caps={372831873=pAsLsXsxFsxcrwb/pAsxXsxFsxcrwb@289986},l=372831873 | ptrwaiter=0 request=1 lock=1 caps=1 dirtyparent=0 dirty=0 waiter=1 authpin=1 0x563ac163c100]
2024-04-04T17:50:56.825+0200 7f16679d3700 10 mds.0.locker next state is sync issued/allows loner s/s xlocker /s other /s
2024-04-04T17:50:56.825+0200 7f16679d3700 7 mds.0.locker eval_gather finished gather on (iauth excl->sync) on [inode 0x1012cb88597 [2,head] /volumes/_nogroup/d7e4525a-d914-4061-8fd7-be14d533d013/foswiki/working/tmp/cgisess_024c841a02411969569b80c0672798c9 auth v4441767 ap=2 s=177 n(v0 rc2024-04-03T15:10:05.039419+0200 b177 1=1+0) (iauth excl->sync) (ifile excl) (ixattr excl) (iversion lock) cr={372831873=0-4194304@1} caps={372831873=pAsLsXsxFsxcrwb/pAsxXsxFsxcrwb@289986},l=372831873 | ptrwaiter=0 request=1 lock=1 caps=1 dirtyparent=0 dirty=0 waiter=1 authpin=1 0x563ac163c100]
2024-04-04T17:50:56.825+0200 7f16679d3700 10 mds.0.cache.ino(0x1012cb88597) auth_unpin by 0x563ac163c390 on [inode 0x1012cb88597 [2,head] /volumes/_nogroup/d7e4525a-d914-4061-8fd7-be14d533d013/foswiki/working/tmp/cgisess_024c841a02411969569b80c0672798c9 auth v4441767 ap=1 s=177 n(v0 rc2024-04-03T15:10:05.039419+0200 b177 1=1+0) (ifile excl) (ixattr excl) (iversion lock) cr={372831873=0-4194304@1} caps={372831873=pAsLsXsxFsxcrwb/pAsxXsxFsxcrwb@289986},l=372831873 | ptrwaiter=0 request=1 lock=1 caps=1 dirtyparent=0 dirty=0 waiter=1 authpin=1 0x563ac163c100] now 1
2024-04-04T17:50:56.825+0200 7f16679d3700 15 mds.0.cache.dir(0x1003f0c6738) adjust_nested_auth_pins -1 on [dir 0x1003f0c6738 /volumes/_nogroup/d7e4525a-d914-4061-8fd7-be14d533d013/foswiki/working/tmp/ [2,head] auth v=4446576 cv=4446576/4446576 ap=0+2 state=1074003969|complete f(v0 m2024-04-04T17:23:55.068160+0200 4310=4310+0) n(v42 rc2024-04-04T17:47:12.708372+0200 b1115214 4310=4310+0) hs=4310+0,ss=0+0 | child=1 dirty=0 waiter=0 authpin=0 0x563645696000] by 0x563ac163c390 count now 0/2
2024-04-04T17:50:56.825+0200 7f16679d3700 10 mds.0.locker simple_eval (iauth sync) on [inode 0x1012cb88597 [2,head] /volumes/_nogroup/d7e4525a-d914-4061-8fd7-be14d533d013/foswiki/working/tmp/cgisess_024c841a02411969569b80c0672798c9 auth v4441767 ap=1 s=177 n(v0 rc2024-04-03T15:10:05.039419+0200 b177 1=1+0) (ifile excl) (ixattr excl) (iversion lock) cr={372831873=0-4194304@1} caps={372831873=pAsLsXsxFsxcrwb/pAsxXsxFsxcrwb@289986},l=372831873 | ptrwaiter=0 request=1 lock=1 caps=1 dirtyparent=0 dirty=0 waiter=0 authpin=1 0x563ac163c100]
2024-04-04T17:50:56.825+0200 7f16679d3700 7 mds.0.locker simple_eval stable, going to excl (iauth sync) on [inode 0x1012cb88597 [2,head] /volumes/_nogroup/d7e4525a-d914-4061-8fd7-be14d533d013/foswiki/working/tmp/cgisess_024c841a02411969569b80c0672798c9 auth v4441767 ap=1 s=177 n(v0 rc2024-04-03T15:10:05.039419+0200 b177 1=1+0) (ifile excl) (ixattr excl) (iversion lock) cr={372831873=0-4194304@1} caps={372831873=pAsLsXsxFsxcrwb/pAsxXsxFsxcrwb@289986},l=372831873 | ptrwaiter=0 request=1 lock=1 caps=1 dirtyparent=0 dirty=0 waiter=0 authpin=1 0x563ac163c100]
2024-04-04T17:50:56.825+0200 7f16679d3700 7 mds.0.locker simple_excl on (iauth sync) on [inode 0x1012cb88597 [2,head] /volumes/_nogroup/d7e4525a-d914-4061-8fd7-be14d533d013/foswiki/working/tmp/cgisess_024c841a02411969569b80c0672798c9 auth v4441767 ap=1 s=177 n(v0 rc2024-04-03T15:10:05.039419+0200 b177 1=1+0) (ifile excl) (ixattr excl) (iversion lock) cr={372831873=0-4194304@1} caps={372831873=pAsLsXsxFsxcrwb/pAsxXsxFsxcrwb@289986},l=372831873 | ptrwaiter=0 request=1 lock=1 caps=1 dirtyparent=0 dirty=0 waiter=0 authpin=1 0x563ac163c100]
2024-04-04T17:50:56.825+0200 7f16679d3700 7 mds.0.locker issue_caps loner client.372831873 allowed=pAsxLsXsxFsxcrwb, xlocker allowed=pAsxLsXsxFsxcrwb, others allowed=pLs on [inode 0x1012cb88597 [2,head] /volumes/_nogroup/d7e4525a-d914-4061-8fd7-be14d533d013/foswiki/working/tmp/cgisess_024c841a02411969569b80c0672798c9 auth v4441767 ap=1 s=177 n(v0 rc2024-04-03T15:10:05.039419+0200 b177 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={372831873=0-4194304@1} caps={372831873=pAsLsXsxFsxcrwb/pAsxXsxFsxcrwb@289986},l=372831873 | ptrwaiter=0 request=1 lock=1 caps=1 dirtyparent=0 dirty=0 waiter=0 authpin=1 0x563ac163c100]
2024-04-04T17:50:56.825+0200 7f16679d3700 20 mds.0.locker client.372831873 pending pAsLsXsxFsxcrwb allowed pAsxLsXsxFsxcrwb wanted pAsxXsxFsxcrwb
2024-04-04T17:50:56.825+0200 7f16679d3700 7 mds.0.locker sending MClientCaps to client.372831873 seq 289987 new pending pAsxLsXsxFsxcrwb was pAsLsXsxFsxcrwb
2024-04-04T17:50:56.825+0200 7f16679d3700 20 mds.0.cache.ino(0x1012cb88597) encode_cap_message pfile 1 pauth 0 plink 0 pxattr 0 ctime 2024-04-03T15:10:05.039419+0200
2024-04-04T17:50:56.825+0200 7f16679d3700 10 mds.0.27169 send_message_client_counted client.372831873 seq 738658 client_caps(grant ino 0x1012cb88597 1 seq 289987 caps=pAsxLsXsxFsxcrwb dirty=- wanted=pAsxXsxFsxcrwb follows 0 size 177/4194304 ts 1/18446744073709551615 mtime 2024-04-03T15:10:05.039419+0200 tws 1) v11
2024-04-04T17:50:56.825+0200 7f16679d3700 10 MDSContext::complete: 18C_MDS_RetryRequest
The client in rw mode is running the 6.5.5 kernel, which includes https://github.com/ceph/ceph-client/commit/ce72d4e0f179340cece90d5b826eb63bbf9fefc0 (tracker: https://tracker.ceph.com/issues/61332).
However, the server doesn't have https://github.com/ceph/ceph/pull/51500, and the mismatched-sequence issue tracked at https://tracker.ceph.com/issues/61782 is also not patched in the client.
Updated by Abhishek Lekshmanan almost 2 years ago
Posted more logs at fed9e44e-a0ec-4692-ae23-6a1047fe9247
Updated by Xiubo Li almost 2 years ago · Edited
The client.379194623:32785 lookup request was spinning infinitely in the MDS:
2024-04-04T17:50:56.835+0200 7f16679d3700 7 mds.0.server dispatch_client_request client_request(client.379194623:32785 lookup #0x1003f0c6738/cgisess_024c841a02411969569b80c0672798c9 2024-04-04T17:49:38.165044+0200 caller_uid=0, caller_gid=0{0,}) v4
2024-04-04T17:50:56.835+0200 7f16679d3700 10 mds.0.server rdlock_path_pin_ref request(client.379194623:32785 nref=3 cr=0x563c4779bb80) #0x1003f0c6738/cgisess_024c841a02411969569b80c0672798c9
2024-04-04T17:50:56.835+0200 7f16679d3700 10 mds.0.locker acquire_locks request(client.379194623:32785 nref=3 cr=0x563c4779bb80)
2024-04-04T17:50:56.835+0200 7f16679d3700 20 mds.0.locker must rdlock (iauth excl) [inode 0x1012cb88597 [2,head] /volumes/_nogroup/d7e4525a-d914-4061-8fd7-be14d533d013/foswiki/working/tmp/cgisess_024c841a02411969569b80c0672798c9 auth v4441767 ap=1 s=177 n(v0 rc2024-04-03T15:10:05.039419+0200 b177 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={372831873=0-4194304@1} caps={372831873=pAsxLsXsxFsxcrwb/pAsxXsxFsxcrwb@289995},l=372831873 | ptrwaiter=0 request=1 lock=1 caps=1 dirtyparent=0 dirty=0 waiter=0 authpin=1 0x563ac163c100]
2024-04-04T17:50:56.835+0200 7f16679d3700 20 mds.0.locker must rdlock (ixattr excl) [inode 0x1012cb88597 [2,head] /volumes/_nogroup/d7e4525a-d914-4061-8fd7-be14d533d013/foswiki/working/tmp/cgisess_024c841a02411969569b80c0672798c9 auth v4441767 ap=1 s=177 n(v0 rc2024-04-03T15:10:05.039419+0200 b177 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={372831873=0-4194304@1} caps={372831873=pAsxLsXsxFsxcrwb/pAsxXsxFsxcrwb@289995},l=372831873 | ptrwaiter=0 request=1 lock=1 caps=1 dirtyparent=0 dirty=0 waiter=0 authpin=1 0x563ac163c100]
2024-04-04T17:50:56.835+0200 7f16679d3700 10 mds.0.locker must authpin [inode 0x1012cb88597 [2,head] /volumes/_nogroup/d7e4525a-d914-4061-8fd7-be14d533d013/foswiki/working/tmp/cgisess_024c841a02411969569b80c0672798c9 auth v4441767 ap=1 s=177 n(v0 rc2024-04-03T15:10:05.039419+0200 b177 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={372831873=0-4194304@1} caps={372831873=pAsxLsXsxFsxcrwb/pAsxXsxFsxcrwb@289995},l=372831873 | ptrwaiter=0 request=1 lock=1 caps=1 dirtyparent=0 dirty=0 waiter=0 authpin=1 0x563ac163c100]
2024-04-04T17:50:56.835+0200 7f16679d3700 10 mds.0.locker already auth_pinned [inode 0x1012cb88597 [2,head] /volumes/_nogroup/d7e4525a-d914-4061-8fd7-be14d533d013/foswiki/working/tmp/cgisess_024c841a02411969569b80c0672798c9 auth v4441767 ap=1 s=177 n(v0 rc2024-04-03T15:10:05.039419+0200 b177 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={372831873=0-4194304@1} caps={372831873=pAsxLsXsxFsxcrwb/pAsxXsxFsxcrwb@289995},l=372831873 | ptrwaiter=0 request=1 lock=1 caps=1 dirtyparent=0 dirty=0 waiter=0 authpin=1 0x563ac163c100]
2024-04-04T17:50:56.835+0200 7f16679d3700 7 mds.0.locker rdlock_start on (iauth excl) on [inode 0x1012cb88597 [2,head] /volumes/_nogroup/d7e4525a-d914-4061-8fd7-be14d533d013/foswiki/working/tmp/cgisess_024c841a02411969569b80c0672798c9 auth v4441767 ap=1 s=177 n(v0 rc2024-04-03T15:10:05.039419+0200 b177 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={372831873=0-4194304@1} caps={372831873=pAsxLsXsxFsxcrwb/pAsxXsxFsxcrwb@289995},l=372831873 | ptrwaiter=0 request=1 lock=1 caps=1 dirtyparent=0 dirty=0 waiter=0 authpin=1 0x563ac163c100]
2024-04-04T17:50:56.835+0200 7f16679d3700 7 mds.0.locker simple_sync on (iauth excl) on [inode 0x1012cb88597 [2,head] /volumes/_nogroup/d7e4525a-d914-4061-8fd7-be14d533d013/foswiki/working/tmp/cgisess_024c841a02411969569b80c0672798c9 auth v4441767 ap=1 s=177 n(v0 rc2024-04-03T15:10:05.039419+0200 b177 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={372831873=0-4194304@1} caps={372831873=pAsxLsXsxFsxcrwb/pAsxXsxFsxcrwb@289995},l=372831873 | ptrwaiter=0 request=1 lock=1 caps=1 dirtyparent=0 dirty=0 waiter=0 authpin=1 0x563ac163c100]
2024-04-04T17:50:56.835+0200 7f16679d3700 7 mds.0.locker issue_caps loner client.372831873 allowed=pAsLsXsxFsxcrwb, xlocker allowed=pAsLsXsxFsxcrwb, others allowed=pLs on [inode 0x1012cb88597 [2,head] /volumes/_nogroup/d7e4525a-d914-4061-8fd7-be14d533d013/foswiki/working/tmp/cgisess_024c841a02411969569b80c0672798c9 auth v4441767 ap=1 s=177 n(v0 rc2024-04-03T15:10:05.039419+0200 b177 1=1+0) (iauth excl->sync) (ifile excl) (ixattr excl) (iversion lock) cr={372831873=0-4194304@1} caps={372831873=pAsxLsXsxFsxcrwb/pAsxXsxFsxcrwb@289995},l=372831873 | ptrwaiter=0 request=1 lock=1 caps=1 dirtyparent=0 dirty=0 waiter=0 authpin=1 0x563ac163c100]
2024-04-04T17:50:56.835+0200 7f16679d3700 20 mds.0.locker client.372831873 pending pAsxLsXsxFsxcrwb allowed pAsLsXsxFsxcrwb wanted pAsxXsxFsxcrwb
2024-04-04T17:50:56.835+0200 7f16679d3700 7 mds.0.locker sending MClientCaps to client.372831873 seq 289996 new pending pAsLsXsxFsxcrwb was pAsxLsXsxFsxcrwb
2024-04-04T17:50:56.835+0200 7f16679d3700 10 mds.0.27169 send_message_client_counted client.372831873 seq 738667 client_caps(revoke ino 0x1012cb88597 1 seq 289996 caps=pAsLsXsxFsxcrwb dirty=- wanted=pAsxXsxFsxcrwb follows 0 size 177/4194304 ts 1/18446744073709551615 mtime 2024-04-03T15:10:05.039419+0200 tws 1) v11
2024-04-04T17:50:56.835+0200 7f16679d3700 10 mds.0.cache.ino(0x1012cb88597) auth_pin by 0x563ac163c390 on [inode 0x1012cb88597 [2,head] /volumes/_nogroup/d7e4525a-d914-4061-8fd7-be14d533d013/foswiki/working/tmp/cgisess_024c841a02411969569b80c0672798c9 auth v4441767 ap=2 s=177 n(v0 rc2024-04-03T15:10:05.039419+0200 b177 1=1+0) (iauth excl->sync) (ifile excl) (ixattr excl) (iversion lock) cr={372831873=0-4194304@1} caps={372831873=pAsLsXsxFsxcrwb/pAsxLsXsxFsxcrwb/pAsxXsxFsxcrwb@289996},l=372831873 | ptrwaiter=0 request=1 lock=1 caps=1 dirtyparent=0 dirty=0 waiter=0 authpin=1 0x563ac163c100] now 2
2024-04-04T17:50:56.835+0200 7f16679d3700 15 mds.0.cache.dir(0x1003f0c6738) adjust_nested_auth_pins 1 on [dir 0x1003f0c6738 /volumes/_nogroup/d7e4525a-d914-4061-8fd7-be14d533d013/foswiki/working/tmp/ [2,head] auth v=4446576 cv=4446576/4446576 ap=0+3 state=1074003969|complete f(v0 m2024-04-04T17:23:55.068160+0200 4310=4310+0) n(v42 rc2024-04-04T17:47:12.708372+0200 b1115214 4310=4310+0) hs=4310+0,ss=0+0 | child=1 dirty=0 waiter=0 authpin=0 0x563645696000] by 0x563ac163c100 count now 0/3
2024-04-04T17:50:56.835+0200 7f16679d3700 7 mds.0.locker rdlock_start waiting on (iauth excl->sync) on [inode 0x1012cb88597 [2,head] /volumes/_nogroup/d7e4525a-d914-4061-8fd7-be14d533d013/foswiki/working/tmp/cgisess_024c841a02411969569b80c0672798c9 auth v4441767 ap=2 s=177 n(v0 rc2024-04-03T15:10:05.039419+0200 b177 1=1+0) (iauth excl->sync) (ifile excl) (ixattr excl) (iversion lock) cr={372831873=0-4194304@1} caps={372831873=pAsLsXsxFsxcrwb/pAsxLsXsxFsxcrwb/pAsxXsxFsxcrwb@289996},l=372831873 | ptrwaiter=0 request=1 lock=1 caps=1 dirtyparent=0 dirty=0 waiter=0 authpin=1 0x563ac163c100]
2024-04-04T17:50:56.835+0200 7f16679d3700 10 mds.0.locker nudge_log (iauth excl->sync) on [inode 0x1012cb88597 [2,head] /volumes/_nogroup/d7e4525a-d914-4061-8fd7-be14d533d013/foswiki/working/tmp/cgisess_024c841a02411969569b80c0672798c9 auth v4441767 ap=2 s=177 n(v0 rc2024-04-03T15:10:05.039419+0200 b177 1=1+0) (iauth excl->sync) (ifile excl) (ixattr excl) (iversion lock) cr={372831873=0-4194304@1} caps={372831873=pAsLsXsxFsxcrwb/pAsxLsXsxFsxcrwb/pAsxXsxFsxcrwb@289996},l=372831873 | ptrwaiter=0 request=1 lock=1 caps=1 dirtyparent=0 dirty=0 waiter=1 authpin=1 0x563ac163c100]
When trying to acquire the rdlock on the authlock, the MDS sends an Ax caps revoke request to the other client, client.372831873.
But later it received a cap release request from client.372831873 instead of a revoke ack:
2024-04-04T17:50:56.842+0200 7f16679d3700 7 mds.0.locker _do_cap_release for client.372831873 on [inode 0x1012cb88597 [2,head] /volumes/_nogroup/d7e4525a-d914-4061-8fd7-be14d533d013/foswiki/working/tmp/cgisess_024c841a02411969569b80c0672798c9 auth v4441767 ap=2 s=177 n(v0 rc2024-04-03T15:10:05.039419+0200 b177 1=1+0) (iauth excl->sync) (ifile excl) (ixattr excl) (iversion lock) cr={372831873=0-4194304@1} caps={372831873=pAsLsXsxFsxcrwb/pAsxLsXsxFsxcrwb/pAsxXsxFsxcrwb@289996},l=372831873 | ptrwaiter=0 request=1 lock=1 caps=1 dirtyparent=0 dirty=0 waiter=1 authpin=1 0x563ac163c100]
2024-04-04T17:50:56.842+0200 7f16679d3700 10 mds.0.locker eval_gather (iauth excl->sync) on [inode 0x1012cb88597 [2,head] /volumes/_nogroup/d7e4525a-d914-4061-8fd7-be14d533d013/foswiki/working/tmp/cgisess_024c841a02411969569b80c0672798c9 auth v4441767 ap=2 s=177 n(v0 rc2024-04-03T15:10:05.039419+0200 b177 1=1+0) (iauth excl->sync) (ifile excl) (ixattr excl) (iversion lock) cr={372831873=0-4194304@1} caps={372831873=pAsLsXsxFsxcrwb/pAsxXsxFsxcrwb@289996},l=372831873 | ptrwaiter=0 request=1 lock=1 caps=1 dirtyparent=0 dirty=0 waiter=1 authpin=1 0x563ac163c100]
2024-04-04T17:50:56.842+0200 7f16679d3700 7 mds.0.locker eval_gather finished gather on (iauth excl->sync) on [inode 0x1012cb88597 [2,head] /volumes/_nogroup/d7e4525a-d914-4061-8fd7-be14d533d013/foswiki/working/tmp/cgisess_024c841a02411969569b80c0672798c9 auth v4441767 ap=2 s=177 n(v0 rc2024-04-03T15:10:05.039419+0200 b177 1=1+0) (iauth excl->sync) (ifile excl) (ixattr excl) (iversion lock) cr={372831873=0-4194304@1} caps={372831873=pAsLsXsxFsxcrwb/pAsxXsxFsxcrwb@289996},l=372831873 | ptrwaiter=0 request=1 lock=1 caps=1 dirtyparent=0 dirty=0 waiter=1 authpin=1 0x563ac163c100]
2024-04-04T17:50:56.842+0200 7f16679d3700 10 mds.0.cache.ino(0x1012cb88597) auth_unpin by 0x563ac163c390 on [inode 0x1012cb88597 [2,head] /volumes/_nogroup/d7e4525a-d914-4061-8fd7-be14d533d013/foswiki/working/tmp/cgisess_024c841a02411969569b80c0672798c9 auth v4441767 ap=1 s=177 n(v0 rc2024-04-03T15:10:05.039419+0200 b177 1=1+0) (ifile excl) (ixattr excl) (iversion lock) cr={372831873=0-4194304@1} caps={372831873=pAsLsXsxFsxcrwb/pAsxXsxFsxcrwb@289996},l=372831873 | ptrwaiter=0 request=1 lock=1 caps=1 dirtyparent=0 dirty=0 waiter=1 authpin=1 0x563ac163c100] now 1
2024-04-04T17:50:56.842+0200 7f16679d3700 15 mds.0.cache.dir(0x1003f0c6738) adjust_nested_auth_pins -1 on [dir 0x1003f0c6738 /volumes/_nogroup/d7e4525a-d914-4061-8fd7-be14d533d013/foswiki/working/tmp/ [2,head] auth v=4446576 cv=4446576/4446576 ap=0+2 state=1074003969|complete f(v0 m2024-04-04T17:23:55.068160+0200 4310=4310+0) n(v42 rc2024-04-04T17:47:12.708372+0200 b1115214 4310=4310+0) hs=4310+0,ss=0+0 | child=1 dirty=0 waiter=0 authpin=0 0x563645696000] by 0x563ac163c390 count now 0/2
2024-04-04T17:50:56.842+0200 7f16679d3700 10 mds.0.locker simple_eval (iauth sync) on [inode 0x1012cb88597 [2,head] /volumes/_nogroup/d7e4525a-d914-4061-8fd7-be14d533d013/foswiki/working/tmp/cgisess_024c841a02411969569b80c0672798c9 auth v4441767 ap=1 s=177 n(v0 rc2024-04-03T15:10:05.039419+0200 b177 1=1+0) (ifile excl) (ixattr excl) (iversion lock) cr={372831873=0-4194304@1} caps={372831873=pAsLsXsxFsxcrwb/pAsxXsxFsxcrwb@289996},l=372831873 | ptrwaiter=0 request=1 lock=1 caps=1 dirtyparent=0 dirty=0 waiter=0 authpin=1 0x563ac163c100]
2024-04-04T17:50:56.842+0200 7f16679d3700 7 mds.0.locker simple_eval stable, going to excl (iauth sync) on [inode 0x1012cb88597 [2,head] /volumes/_nogroup/d7e4525a-d914-4061-8fd7-be14d533d013/foswiki/working/tmp/cgisess_024c841a02411969569b80c0672798c9 auth v4441767 ap=1 s=177 n(v0 rc2024-04-03T15:10:05.039419+0200 b177 1=1+0) (ifile excl) (ixattr excl) (iversion lock) cr={372831873=0-4194304@1} caps={372831873=pAsLsXsxFsxcrwb/pAsxXsxFsxcrwb@289996},l=372831873 | ptrwaiter=0 request=1 lock=1 caps=1 dirtyparent=0 dirty=0 waiter=0 authpin=1 0x563ac163c100]
2024-04-04T17:50:56.842+0200 7f16679d3700 7 mds.0.locker simple_excl on (iauth sync) on [inode 0x1012cb88597 [2,head] /volumes/_nogroup/d7e4525a-d914-4061-8fd7-be14d533d013/foswiki/working/tmp/cgisess_024c841a02411969569b80c0672798c9 auth v4441767 ap=1 s=177 n(v0 rc2024-04-03T15:10:05.039419+0200 b177 1=1+0) (ifile excl) (ixattr excl) (iversion lock) cr={372831873=0-4194304@1} caps={372831873=pAsLsXsxFsxcrwb/pAsxXsxFsxcrwb@289996},l=372831873 | ptrwaiter=0 request=1 lock=1 caps=1 dirtyparent=0 dirty=0 waiter=0 authpin=1 0x563ac163c100]
2024-04-04T17:50:56.842+0200 7f16679d3700 7 mds.0.locker issue_caps loner client.372831873 allowed=pAsxLsXsxFsxcrwb, xlocker allowed=pAsxLsXsxFsxcrwb, others allowed=pLs on [inode 0x1012cb88597 [2,head] /volumes/_nogroup/d7e4525a-d914-4061-8fd7-be14d533d013/foswiki/working/tmp/cgisess_024c841a02411969569b80c0672798c9 auth v4441767 ap=1 s=177 n(v0 rc2024-04-03T15:10:05.039419+0200 b177 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={372831873=0-4194304@1} caps={372831873=pAsLsXsxFsxcrwb/pAsxXsxFsxcrwb@289996},l=372831873 | ptrwaiter=0 request=1 lock=1 caps=1 dirtyparent=0 dirty=0 waiter=0 authpin=1 0x563ac163c100]
2024-04-04T17:50:56.842+0200 7f16679d3700 20 mds.0.locker client.372831873 pending pAsLsXsxFsxcrwb allowed pAsxLsXsxFsxcrwb wanted pAsxXsxFsxcrwb
2024-04-04T17:50:56.842+0200 7f16679d3700 7 mds.0.locker sending MClientCaps to client.372831873 seq 289997 new pending pAsxLsXsxFsxcrwb was pAsLsXsxFsxcrwb
2024-04-04T17:50:56.842+0200 7f16679d3700 10 mds.0.27169 send_message_client_counted client.372831873 seq 738668 client_caps(grant ino 0x1012cb88597 1 seq 289997 caps=pAsxLsXsxFsxcrwb dirty=- wanted=pAsxXsxFsxcrwb follows 0 size 177/4194304 ts 1/18446744073709551615 mtime 2024-04-03T15:10:05.039419+0200 tws 1) v11
2024-04-04T17:50:56.842+0200 7f16679d3700 7 mds.0.server dispatch_client_request client_request(client.379194623:32785 lookup #0x1003f0c6738/cgisess_024c841a02411969569b80c0672798c9 2024-04-04T17:49:38.165044+0200 caller_uid=0, caller_gid=0{0,}) v4
This will happen when the corresponding cap has already been removed by client.372831873. In that case the MDS should remove the client cap and skip issuing new caps to client.372831873, but it is not clear why the MDS didn't remove it. Instead the MDS issues Ax back to client.372831873, and later, when the client.379194623:32785 lookup request is woken up, it needs to revoke the Ax caps again, so the MDS is stuck in this dead loop indefinitely.
The logs are not complete and a lot of the important logs are missing, so please upload the complete logs.
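The dead loop just described can be sketched as a toy model (all names are illustrative; this is not Ceph code): each pass revokes and re-grants Ax against a cap the client no longer tracks, so the per-cap seq, standing in here for the backlog of un-acked MClientCaps messages, grows without bound.

```python
# Toy model of the grant/revoke dead loop described above.
# "Ax" stands for the auth-exclusive cap bit; names are illustrative, not Ceph's.

class StaleCap:
    """Stands in for a cap the client already dropped but the MDS kept."""
    def __init__(self):
        self.pending = {"Ax"}   # MDS still believes Ax is pending
        self.seq = 0

def mds_iteration(cap):
    # 1. A contending lookup needs an rdlock, so the MDS revokes Ax ...
    cap.pending.discard("Ax")
    cap.seq += 1              # each revoke/grant sends an MClientCaps message
    # 2. ... the client never acks (it no longer tracks the cap), the lock
    #    goes stable again and the loner is re-granted Ax.
    cap.pending.add("Ax")
    cap.seq += 1
    return cap.seq

cap = StaleCap()
for _ in range(1000):
    mds_iteration(cap)
# The seq counter, and the queue of un-acked messages it counts, grows
# without bound, matching the buffer_anon ballooning seen on the MDS.
print(cap.seq)  # 2000
```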
Thanks
Updated by Xiubo Li almost 2 years ago
- Status changed from Need More Info to Fix Under Review
- Pull request ID set to 56828
Updated by Abhishek Lekshmanan almost 2 years ago
We've uploaded a new set of logs with debug_ms 1 at 20d8ba67-8bb0-4cfc-a986-b72ec250728d
Updated by Enrico Bocchi almost 2 years ago
We have upgraded one of the two clusters to 16.2.15 (was on 16.2.9 before) and the problem remains.
The offending clients triggering the cap grant/revocation spinlock are always kernel clients, 6.5.5-200.
The manifestation has not changed from 16.2.9 to 16.2.15: Grant/revocation loop spinning infinitely, leading to fast memory growth on the MDS till OOM. Evicting the client breaks the loop and memory is instantly released.
Updated by Patrick Donnelly over 1 year ago
- Category set to Correctness/Safety
- Status changed from Fix Under Review to Pending Backport
- Target version set to v20.0.0
- Backport set to squid,reef,quincy
Updated by Upkeep Bot over 1 year ago
- Copied to Backport #66623: squid: mds spinlock due to lock contention leading to memory exaustion added
Updated by Upkeep Bot over 1 year ago
- Copied to Backport #66624: quincy: mds spinlock due to lock contention leading to memory exaustion added
Updated by Upkeep Bot over 1 year ago
- Copied to Backport #66625: reef: mds spinlock due to lock contention leading to memory exaustion added
Updated by Upkeep Bot over 1 year ago
- Tags (freeform) set to backport_processed
Updated by Patrick Donnelly over 1 year ago
- Related to Bug #57244: [WRN] : client.408214273 isn't responding to mclientcaps(revoke), ino 0x10000000003 pending pAsLsXsFs issued pAsLsXsFs, sent 62.303702 seconds ago added
Updated by Enrico Bocchi over 1 year ago
Following comment at [[https://github.com/ceph/ceph/pull/51500#discussion_r1655358492]] we have tracked the client versions (kernel + fuse) that carry the problematic change.
Kernel:
- Upstream: 6.5-rc1
- RHEL/ALmaLinux 9: 5.14.0-362.24
- RHEL/ALmaLinux 8: 4.18.0-513.24
Fuse:
- Pacific: 16.2.15
- Quincy: since 17.2.7
- Reef: since 18.2.1
- Squid: since 19.0.0
Updated by Xiubo Li over 1 year ago
Added some debug logs in the kclient to find out what exactly has happened; it's strange that the caps in the kclient have been removed but still exist in the MDS:
https://github.com/ceph/ceph-client/commit/aa12e19a03d9ad5b3020fd3e0aacbbc9f86b5b9d
Updated by Xiubo Li over 1 year ago · Edited
Abhishek Lekshmanan wrote in #note-13:
We've uploaded a new set of logs with debug_ms 1 at 20d8ba67-8bb0-4cfc-a986-b72ec250728d
I have found something new:
966367 masked_file_34192T21:08:00.366+0200 7f8f85d4b700 7 mds.0.locker issue_seq 1467769 != 1
966368 masked_file_34192T21:08:00.366+0200 7f8f85d4b700 10 mds.0.locker eval_gather (iauth excl->sync) on [inode 0x103c0abfaf4 [2,head] /volumes/_nogroup/6b9c1cfb-b9bc-49ad-b811-c906e0f4dc0d/masked_dir_97/masked_file_77 auth v18 ap=2 s=6 n(v0 rc2024-04-09T18:36:35.953676+0200 b6 1=1+0) (iauth excl->sync) (ifile excl) (ixattr excl) (iversion lock) cr={505841168=0-4194304@1} caps={505841168=pAsLsXsxFsxcrwb/pAsxXsxFxwb@2935533},l=505841168 | ptrwaiter=0 request=1 lock=1 caps=1 dirtyparent=0 dirty=0 waiter=1 authpin=1 0x5580d4743080]
...
968731 masked_file_1770T21:08:00.377+0200 7f8f85d4b700 7 mds.0.locker issue_seq 1467770 != 1
968732 masked_file_1770T21:08:00.377+0200 7f8f85d4b700 10 mds.0.locker eval_gather (iauth excl->sync) on [inode 0x103c0abfaf4 [2,head] /volumes/_nogroup/6b9c1cfb-b9bc-49ad-b811-c906e0f4dc0d/masked_dir_19/masked_file_20 auth v18 ap=2 s=6 n(v0 rc2024-04-09T18:36:35.953676+0200 b6 1=1+0) (iauth excl->sync) (ifile excl) (ixattr excl) (iversion lock) cr={505841168=0-4194304@1} caps={505841168=pAsLsXsxFsxcrwb/pAsxXsxFxwb@2935535},l=505841168 | ptrwaiter=0 request=1 lock=1 caps=1 dirtyparent=0 dirty=0 waiter=1 authpin=1 0x5580d4743080]
...
We can see that for the same inode 0x103c0abfaf4 the file path has changed: /volumes/_nogroup/6b9c1cfb-b9bc-49ad-b811-c906e0f4dc0d/masked_dir_97/masked_file_77 --> /volumes/_nogroup/6b9c1cfb-b9bc-49ad-b811-c906e0f4dc0d/masked_dir_19/masked_file_20.
And also changed frequently in the later logs:
11782645 masked_file_3166T21:08:59.946+0200 7f8f85d4b700 7 mds.0.locker issue_seq 1469751 != 1
11782646 masked_file_3166T21:08:59.946+0200 7f8f85d4b700 10 mds.0.locker eval_gather (iauth excl->sync) on [inode 0x103c0abfaf4 [2,head] /volumes/_nogroup/6b9c1cfb-b9bc-49ad-b811-c906e0f4dc0d/masked_dir_11/masked_file_3 auth v18 ap=2 s=6 n(v0 rc2024-04-09T18:36:35.953676+0200 b6 1=1+0) (iauth excl->sync) (ifile excl) (ixattr excl) (iversion lock) cr={505841168=0-4194304@1} caps={505841168=pAsLsXsxFsxcrwb/pAsxXsxFxwb@2939497},l=505841168 | ptrwaiter=0 request=1 lock=1 caps=1 dirtyparent=0 dirty=0 waiter=1 authpin=1 0x5580d4743080]
Updated by Abhishek Lekshmanan over 1 year ago
Xiubo Li wrote in #note-24:
Abhishek Lekshmanan wrote in #note-13:
We've uploaded a new set of logs with debug_ms 1 at 20d8ba67-8bb0-4cfc-a986-b72ec250728d
I have found something new:
[...]
We can see that for the same inode 0x103c0abfaf4 the file path has changed: /volumes/_nogroup/6b9c1cfb-b9bc-49ad-b811-c906e0f4dc0d/masked_dir_97/masked_file_77 --> /volumes/_nogroup/6b9c1cfb-b9bc-49ad-b811-c906e0f4dc0d/masked_dir_19/masked_file_20.
And also changed frequently in the later logs:
[...]
Hi Xiubo,
It looks like our masking tool had an error across file boundaries. I've uploaded the initial segment again at `f01995fd-bdb6-4fe9-b4c0-36bfd5daa517`, which has these issue_seq numbers; can you check that instead? Many thanks, and sorry for the confusion.
Updated by Xiubo Li over 1 year ago
Abhishek Lekshmanan wrote in #note-25:
Xiubo Li wrote in #note-24:
Abhishek Lekshmanan wrote in #note-13:
We've uploaded a new set of logs with debug_ms 1 at 20d8ba67-8bb0-4cfc-a986-b72ec250728d
I have found something new:
[...]
We can see that for the same inode 0x103c0abfaf4 the file path has changed: /volumes/_nogroup/6b9c1cfb-b9bc-49ad-b811-c906e0f4dc0d/masked_dir_97/masked_file_77 --> /volumes/_nogroup/6b9c1cfb-b9bc-49ad-b811-c906e0f4dc0d/masked_dir_19/masked_file_20.
And also changed frequently in the later logs:
[...]
Hi Xiubo,
It looks like our masking tool had an error across file boundaries. I've uploaded the initial segment again at `f01995fd-bdb6-4fe9-b4c0-36bfd5daa517`, which has these issue_seq numbers; can you check that instead? Many thanks, and sorry for the confusion.
Okay, it's strange that the seq numbers are contiguous.
These logs still don't tell us why the caps were issued by the MDS but not saved by the kclient, so that the kclient later couldn't find the inode or caps; this is the root cause of the issue.
Since you can reproduce it, could you try to reproduce it with the MDS debug logs enabled from the beginning? If the log file grows too large, please truncate it periodically.
debug_mds = 20
debug_ms = 1
At the same time, could you help test this workaround PR (https://github.com/ceph/ceph/pull/56828) and the debug kclient patch (https://github.com/ceph/ceph-client/commit/aa12e19a03d9ad5b3020fd3e0aacbbc9f86b5b9d)?
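For reference, the requested debug levels can be raised at runtime on the active MDS (the daemon name `mds.0` is illustrative; adjust it to the affected rank), then dropped again once the loop has been captured:

```shell
# Raise verbosity on the affected MDS rank:
ceph tell mds.0 config set debug_mds 20
ceph tell mds.0 config set debug_ms 1
# ... wait for the grant/revoke loop to appear in the log ...
# Restore defaults to stop the log from growing:
ceph tell mds.0 config set debug_mds "1/5"
ceph tell mds.0 config set debug_ms "0/5"
```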
Updated by Enrico Bocchi over 1 year ago
Hi Xiubo,
Unfortunately, we do not have a synthetic reproducer for this issue. We observe it happening in our biggest production CephFS cluster on a daily basis.
Afaict the uploaded logs were captured with debug_mds = 20, debug_ms = 1. You may want to check the logs that were uploaded yesterday (f01995fd-bdb6-4fe9-b4c0-36bfd5daa517): The inode 0x103c0abfaf4 always refers to the same file.
What do you mean with "from the beginning"? I think it will be practically impossible for us to run with debug_mds=20 + ms=1 for more than a few minutes :/
We haven't deployed your patch (https://github.com/ceph/ceph/pull/56828) on the affected cluster yet.
We do have deployed it on a smaller, test cluster (running 17.2.7 built with your patch on top) and it is doing ok. Still, we do not observe the issue on this cluster so we cannot tell if it fixes the issue or not.
Abhi has attempted to "invert" your patch by artificially making it so that 'remove_client_cap(in, cap);' is less likely to be executed. More about this soon.
I see Patrick has a stub tracker (https://tracker.ceph.com/issues/66704) as a follow-up of this issue.
Should we follow up there?
Updated by Xiubo Li over 1 year ago · Edited
Enrico Bocchi wrote in #note-27:
Hi Xiubo,
Unfortunately, we do not have a synthetic reproducer for this issue. We observe it happening in our biggest production CephFS cluster on a daily basis.
Afaict the uploaded logs were captured with debug_mds = 20, debug_ms = 1. You may want to check the logs that were uploaded yesterday (f01995fd-bdb6-4fe9-b4c0-36bfd5daa517): The inode 0x103c0abfaf4 always refers to the same file.
What do you mean with "from the beginning"? I think it will be practically impossible for us to run with debug_mds=20 + ms=1 for more than a few minutes :/
When this issue first happened the debug log was not enabled; it seems the debug logs were only enabled a while after you noticed the issue.
What I want to know is what has happened just before the first issue is seen:
11782645 masked_file_3166T21:08:59.946+0200 7f8f85d4b700 7 mds.0.locker issue_seq 1469751 != 1
On the MDS side the Cap entity still exists but not on the kclient side; it seems the kclient just deletes the caps, while the corresponding caps are not successfully released on the MDS side. I still haven't figured out how this could happen.
This is why I want to find the debug logs when we see the first above log.
I suspect the kclient couldn't add the caps for some reason, or the MDS simply dropped the caps release request on the floor.
We haven't deployed your patch (https://github.com/ceph/ceph/pull/56828) on the affected cluster yet.
We do have deployed it on a smaller, test cluster (running 17.2.7 built with your patch on top) and it is doing ok. Still, we do not observe the issue on this cluster so we cannot tell if it fixes the issue or not.
Abhi has attempted to "invert" your patch by artificially making it so that 'remove_client_cap(in, cap);' is less likely to be executed. More about this soon.
I see Patrick has a stub tracker (https://tracker.ceph.com/issues/66704) as a follow-up of this issue.
Should we follow up there?
That's a different issue IMO.
Updated by Abhishek Lekshmanan over 1 year ago · Edited
Xiubo Li wrote in #note-28:
Enrico Bocchi wrote in #note-27:
Hi Xiubo,
Unfortunately, we do not have a synthetic reproducer for this issue. We observe it happening in our biggest production CephFS cluster on a daily basis.
Afaict the uploaded logs were captured with debug_mds = 20, debug_ms = 1. You may want to check the logs that were uploaded yesterday (f01995fd-bdb6-4fe9-b4c0-36bfd5daa517): The inode 0x103c0abfaf4 always refers to the same file.
What do you mean with "from the beginning"? I think it will be practically impossible for us to run with debug_mds=20 + ms=1 for more than a few minutes :/
When this issue first happened the debug log was not enabled; it seems the debug logs were only enabled a while after you noticed the issue.
What I want to know is what has happened just before the first issue is seen:
[...]
Just in the MDS side the Cap entity still exists but not in kclient side, it seems the kclient just deletes the caps but the corresponding caps is not successfully released in MDS side. I still haven't figured out how could this happen.
This is why I want to find the debug logs when we see the first above log.
I just suspect the kclient couldn't add the caps for some reason, or the MDS just drop the caps release request to ground.
Hi Xiubo, what scenarios can lead `cap->get_last_issue()` to be 1? Usually the workloads that trigger this issue are a mount with ro permissions doing a find/stat while the writer node is doing the equivalent of a logrotate. The reader mount usually goes through different mount trees, i.e. every time a backup workload completes it mounts a different share.
We did some tests on our side to simulate what causes the memory to grow: we patched the MDS with an artificial trigger to understand the memory growth path. It seems that eval_cap_gather leads to queuing messages unbounded in the messenger (probably unbounded growth of the out_q?). We also noticed that setting `debug_ms=1` usually slows down the memory growth enough that the system may recover before going OOM.
The code we tried is simply triggering the error path randomly just to understand what causes the memory growth (Though even with this in place it might take a few hours with a read and write workload)
if (seq != cap->get_last_issue() || trigger_counter % 50 == 0) { .. }
We haven't deployed your patch (https://github.com/ceph/ceph/pull/56828) on the affected cluster yet.
We do have deployed it on a smaller, test cluster (running 17.2.7 built with your patch on top) and it is doing ok. Still, we do not observe the issue on this cluster so we cannot tell if it fixes the issue or not.
Abhi has attempted to "invert" your patch by artificially making it so that 'remove_client_cap(in, cap);' is less likely to be executed. More about this soon.
I see Patrick has a stub tracker (https://tracker.ceph.com/issues/66704) as a follow-up of this issue.
Should we follow up there?
That's a different issue IMO.
Updated by Xiubo Li over 1 year ago
Abhishek Lekshmanan wrote in #note-29:
Xiubo Li wrote in #note-28:
Enrico Bocchi wrote in #note-27:
Hi Xiubo,
Unfortunately, we do not have a synthetic reproducer for this issue. We observe it happening in our biggest production CephFS cluster on a daily basis.
Afaict the uploaded logs were captured with debug_mds = 20, debug_ms = 1. You may want to check the logs that were uploaded yesterday (f01995fd-bdb6-4fe9-b4c0-36bfd5daa517): The inode 0x103c0abfaf4 always refers to the same file.
What do you mean with "from the beginning"? I think it will be practically impossible for us to run with debug_mds=20 + ms=1 for more than a few minutes :/
When this issue first happened the debug log was not enabled; it seems the debug logs were only enabled a while after you noticed the issue.
What I want to know is what has happened just before the first issue is seen:
[...]
Just in the MDS side the Cap entity still exists but not in kclient side, it seems the kclient just deletes the caps but the corresponding caps is not successfully released in MDS side. I still haven't figured out how could this happen.
This is why I want to find the debug logs when we see the first above log.
I just suspect the kclient couldn't add the caps for some reason, or the MDS just drop the caps release request to ground.
Hi Xiubo, what scenarios can lead `cap->get_last_issue()` to be 1?
The last_issue seq is set to the last_sent seq only when issuing the caps together with a client request reply msg, or when importing caps.
But it won't change when granting/revoking/exporting caps.
A value of 1 usually means only one client request ever came from the writer client for this CInode, but it seems that on the kclient side this request's reply msg was lost, or for some reason the caps weren't saved. This looks strange to me. BTW, do you see any error or warning logs in the writer kclient's dmesg?
BTW, you are using only one active MDS (max_mds=1), right? If so, there shouldn't be any cap import/export happening between multiple active MDSs.
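The two sequence numbers just described can be sketched as a toy model (illustrative names; not Ceph's actual bookkeeping): last_sent bumps on every grant/revoke, while last_issue is pinned to last_sent only when caps ride a client-request reply or a cap import. A single create/open reply followed by any amount of grant/revoke churn therefore leaves last_issue stuck at 1.

```python
# Toy model of the per-cap last_sent / last_issue counters discussed above.

class Cap:
    def __init__(self):
        self.last_sent = 0
        self.last_issue = 0

    def send_grant_or_revoke(self):
        self.last_sent += 1          # does NOT touch last_issue

    def issue_with_request_reply(self):
        self.last_sent += 1
        self.last_issue = self.last_sent

cap = Cap()
cap.issue_with_request_reply()       # the single create/open reply
for _ in range(5):
    cap.send_grant_or_revoke()       # subsequent grant/revoke churn
print(cap.last_issue, cap.last_sent)  # 1 6
```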
Updated by Enrico Bocchi over 1 year ago
We have 4 active MDS (and 4 stand-by, not standby-replay).
But we use manual pinning (setfattr -n ceph.dir.pin -v <rank>) of subtrees, and we set `mds bal interval` to 0, so I rule out ongoing import/export among MDSs.
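For reference, the pinning scheme just described looks like this (the path and rank are illustrative):

```shell
# Pin a subtree to MDS rank 2 via an extended attribute on the directory:
setfattr -n ceph.dir.pin -v 2 /cephfs/volumes/_nogroup/mysubvol
# Verify the pin:
getfattr -n ceph.dir.pin /cephfs/volumes/_nogroup/mysubvol
# Disable the automatic balancer so no other subtree migration happens:
ceph config set mds mds_bal_interval 0
```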
Updated by Venky Shankar over 1 year ago
- Related to Support #68958: Missing subvolumegroups and subvolumes after filesystem recovery added
Updated by Alex Markuze over 1 year ago
- Assignee changed from Xiubo Li to Alex Markuze
Updated by Adrien Georget about 1 year ago
Hi,
Any updates about this issue?
I see that this fix (https://github.com/ceph/ceph/pull/58296) has been included in 17.2.8; can anyone confirm that it fixed the MDS spinlock issue?
Updated by Venky Shankar about 1 year ago
- Status changed from Pending Backport to Resolved
Adrien Georget wrote in #note-35:
Hi,
Any updates about this issue?
I see that this fix (https://github.com/ceph/ceph/pull/58296) has been included in 17.2.8; can anyone confirm that it fixed the MDS spinlock issue?
This tracker should be marked as resolved since the fix has been merged into q/r/s releases.
Updated by Enrico Bocchi about 1 year ago
Last time I checked, the fix made it to:
- Quincy 17.2.8
- Reef branch, but not 18.2.4
- Squid 19.2.0
Updated by Upkeep Bot 9 months ago
- Merge Commit set to 0fed3f2066226a59803b04c682ea66d87b6fbfb7
- Fixed In set to v19.3.0-2915-g0fed3f20662
- Upkeep Timestamp set to 2025-07-02T03:31:33+00:00
Updated by Upkeep Bot 8 months ago
- Fixed In changed from v19.3.0-2915-g0fed3f20662 to v19.3.0-2915-g0fed3f2066
- Upkeep Timestamp changed from 2025-07-02T03:31:33+00:00 to 2025-07-14T16:45:17+00:00
Updated by Upkeep Bot 5 months ago
- Released In set to v20.2.0~2645
- Upkeep Timestamp changed from 2025-07-14T16:45:17+00:00 to 2025-11-01T01:26:54+00:00