Actions
Bug #75141
closedreef: mds: nudge_log never nudges the log
% Done:
0%
Source:
Q/A
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Labels (FS):
Pull request ID:
Tags (freeform):
Merge Commit:
Fixed In:
v18.2.7-1055-gab47f43c09
Released In:
v18.2.8~2
Upkeep Timestamp:
2026-03-18T01:23:55+00:00
Description
With a simple workflow like:
mkdir a; touch a/FILE; while sleep 1; do echo hi; mv -T a/FILE a/D; rm a/D; rmdir a; mkdir a; touch a/FILE; done
The loop completes about every 6 seconds (mds tick interval). The nudge_log calls that should push completion of unlinks/renames never functions:
2026-02-24T13:55:14.161-0500 7ccbebce5640 10 mds.0.locker acquire_locks request(client.4340:28 nref=4 cr=0x58eb98e69200)
2026-02-24T13:55:14.161-0500 7ccbebce5640 20 mds.0.locker must xlock (ilink sync) [inode 0x100000001fd [...2,head] /foo/a/ auth v70 ap=2 DIRTYPARENT f(v0 m2026-02-24T13:55:08.501155-0500) n(v0 rc2026-02-24T13:55:08.501155-0500 1=0+1) (isnap xlock x=1 by 0x58eb98e86d80) (inest lock) (ifile excl) (iversion lock) caps={4340=pAsXsFs/AsLsXsFsx@6},l=4340 | request=1 lock=1 dirfrag=1 caps=1 dirtyparent=1 dirty=1 waiter=0 authpin=1 0x58eb98db7180]
2026-02-24T13:55:14.161-0500 7ccbebce5640 20 mds.0.locker must wrlock (iversion lock) [inode 0x100000001fd [...2,head] /foo/a/ auth v70 ap=2 DIRTYPARENT f(v0 m2026-02-24T13:55:08.501155-0500) n(v0 rc2026-02-24T13:55:08.501155-0500 1=0+1) (isnap xlock x=1 by 0x58eb98e86d80) (inest lock) (ifile excl) (iversion lock) caps={4340=pAsXsFs/AsLsXsFsx@6},l=4340 | request=1 lock=1 dirfrag=1 caps=1 dirtyparent=1 dirty=1 waiter=0 authpin=1 0x58eb98db7180]
2026-02-24T13:55:14.161-0500 7ccbebce5640 20 mds.0.locker must xlock (isnap xlock x=1 by 0x58eb98e86d80) [inode 0x100000001fd [...2,head] /foo/a/ auth v70 ap=2 DIRTYPARENT f(v0 m2026-02-24T13:55:08.501155-0500) n(v0 rc2026-02-24T13:55:08.501155-0500 1=0+1) (isnap xlock x=1 by 0x58eb98e86d80) (inest lock) (ifile excl) (iversion lock) caps={4340=pAsXsFs/AsLsXsFsx@6},l=4340 | request=1 lock=1 dirfrag=1 caps=1 dirtyparent=1 dirty=1 waiter=0 authpin=1 0x58eb98db7180]
2026-02-24T13:55:14.161-0500 7ccbebce5640 20 mds.0.locker must wrlock (iversion lock) [inode 0x100000001fd [...2,head] /foo/a/ auth v70 ap=2 DIRTYPARENT f(v0 m2026-02-24T13:55:08.501155-0500) n(v0 rc2026-02-24T13:55:08.501155-0500 1=0+1) (isnap xlock x=1 by 0x58eb98e86d80) (inest lock) (ifile excl) (iversion lock) caps={4340=pAsXsFs/AsLsXsFsx@6},l=4340 | request=1 lock=1 dirfrag=1 caps=1 dirtyparent=1 dirty=1 waiter=0 authpin=1 0x58eb98db7180]
2026-02-24T13:55:14.161-0500 7ccbebce5640 20 mds.0.locker must rdlock (ifile excl) [inode 0x100000001fd [...2,head] /foo/a/ auth v70 ap=2 DIRTYPARENT f(v0 m2026-02-24T13:55:08.501155-0500) n(v0 rc2026-02-24T13:55:08.501155-0500 1=0+1) (isnap xlock x=1 by 0x58eb98e86d80) (inest lock) (ifile excl) (iversion lock) caps={4340=pAsXsFs/AsLsXsFsx@6},l=4340 | request=1 lock=1 dirfrag=1 caps=1 dirtyparent=1 dirty=1 waiter=0 authpin=1 0x58eb98db7180]
2026-02-24T13:55:14.161-0500 7ccbebce5640 20 mds.0.locker must wrlock (ifile sync) [inode 0x601 [...2,head] ~mds0/stray1/ auth v20 ap=1 DIRTYPARENT f(v0 m2026-02-24T13:55:08.501155-0500 2=1+1) n(v0 rc2026-02-24T13:55:08.501155-0500 3=1+2) (inest lock) (iversion lock) | lock=0 dirfrag=1 stickydirs=1 stray=1 dirtyrstat=0 dirtyparent=1 dirty=1 authpin=1 0x58eb98d32580]
2026-02-24T13:55:14.161-0500 7ccbebce5640 20 mds.0.locker must wrlock (inest lock) [inode 0x601 [...2,head] ~mds0/stray1/ auth v20 ap=1 DIRTYPARENT f(v0 m2026-02-24T13:55:08.501155-0500 2=1+1) n(v0 rc2026-02-24T13:55:08.501155-0500 3=1+2) (inest lock) (iversion lock) | lock=0 dirfrag=1 stickydirs=1 stray=1 dirtyrstat=0 dirtyparent=1 dirty=1 authpin=1 0x58eb98d32580]
2026-02-24T13:55:14.161-0500 7ccbebce5640 20 mds.0.locker must xlock (dn sync) [dentry #0x100/stray1/100000001fd [2,head] auth NULL (dversion lock) pv=0 v=61 ap=1 ino=(nil) state=1342177281|new | request=1 authpin=1 0x58eb98db2500]
2026-02-24T13:55:14.161-0500 7ccbebce5640 20 mds.0.locker must wrlock (dversion lock) [dentry #0x100/stray1/100000001fd [2,head] auth NULL (dversion lock) pv=0 v=61 ap=1 ino=(nil) state=1342177281|new | request=1 authpin=1 0x58eb98db2500]
2026-02-24T13:55:14.161-0500 7ccbebce5640 10 mds.0.locker must authpin [inode 0x601 [...2,head] ~mds0/stray1/ auth v20 ap=1 DIRTYPARENT f(v0 m2026-02-24T13:55:08.501155-0500 2=1+1) n(v0 rc2026-02-24T13:55:08.501155-0500 3=1+2) (inest lock) (iversion lock) | lock=0 dirfrag=1 stickydirs=1 stray=1 dirtyrstat=0 dirtyparent=1 dirty=1 authpin=1 0x58eb98d32580]
2026-02-24T13:55:14.161-0500 7ccbebce5640 10 mds.0.locker must authpin [dentry #0x100/stray1/100000001fd [2,head] auth NULL (dversion lock) pv=0 v=61 ap=1 ino=(nil) state=1342177281|new | request=1 authpin=1 0x58eb98db2500]
2026-02-24T13:55:14.161-0500 7ccbebce5640 10 mds.0.locker must authpin [inode 0x100000001fd [...2,head] /foo/a/ auth v70 ap=2 DIRTYPARENT f(v0 m2026-02-24T13:55:08.501155-0500) n(v0 rc2026-02-24T13:55:08.501155-0500 1=0+1) (isnap xlock x=1 by 0x58eb98e86d80) (inest lock) (ifile excl) (iversion lock) caps={4340=pAsXsFs/AsLsXsFsx@6},l=4340 | request=1 lock=1 dirfrag=1 caps=1 dirtyparent=1 dirty=1 waiter=0 authpin=1 0x58eb98db7180]
2026-02-24T13:55:14.161-0500 7ccbebce5640 10 mds.0.locker already auth_pinned [inode 0x601 [...2,head] ~mds0/stray1/ auth v20 ap=1 DIRTYPARENT f(v0 m2026-02-24T13:55:08.501155-0500 2=1+1) n(v0 rc2026-02-24T13:55:08.501155-0500 3=1+2) (inest lock) (iversion lock) | lock=0 dirfrag=1 stickydirs=1 stray=1 dirtyrstat=0 dirtyparent=1 dirty=1 authpin=1 0x58eb98d32580]
2026-02-24T13:55:14.161-0500 7ccbebce5640 10 mds.0.locker already auth_pinned [dentry #0x100/stray1/100000001fd [2,head] auth NULL (dversion lock) pv=0 v=61 ap=1 ino=(nil) state=1342177281|new | request=1 authpin=1 0x58eb98db2500]
2026-02-24T13:55:14.161-0500 7ccbebce5640 10 mds.0.locker already auth_pinned [inode 0x100000001fd [...2,head] /foo/a/ auth v70 ap=2 DIRTYPARENT f(v0 m2026-02-24T13:55:08.501155-0500) n(v0 rc2026-02-24T13:55:08.501155-0500 1=0+1) (isnap xlock x=1 by 0x58eb98e86d80) (inest lock) (ifile excl) (iversion lock) caps={4340=pAsXsFs/AsLsXsFsx@6},l=4340 | request=1 lock=1 dirfrag=1 caps=1 dirtyparent=1 dirty=1 waiter=0 authpin=1 0x58eb98db7180]
2026-02-24T13:55:14.161-0500 7ccbebce5640 10 mds.0.locker already xlocked (isnap xlock x=1 by 0x58eb98e86d80) [inode 0x100000001fd [...2,head] /foo/a/ auth v70 ap=2 DIRTYPARENT f(v0 m2026-02-24T13:55:08.501155-0500) n(v0 rc2026-02-24T13:55:08.501155-0500 1=0+1) (isnap xlock x=1 by 0x58eb98e86d80) (inest lock) (ifile excl) (iversion lock) caps={4340=pAsXsFs/AsLsXsFsx@6},l=4340 | request=1 lock=1 dirfrag=1 caps=1 dirtyparent=1 dirty=1 waiter=0 authpin=1 0x58eb98db7180]
2026-02-24T13:55:14.161-0500 7ccbebce5640 7 mds.0.locker rdlock_start on (ifile excl) on [inode 0x100000001fd [...2,head] /foo/a/ auth v70 ap=2 DIRTYPARENT f(v0 m2026-02-24T13:55:08.501155-0500) n(v0 rc2026-02-24T13:55:08.501155-0500 1=0+1) (isnap xlock x=1 by 0x58eb98e86d80) (inest lock) (ifile excl) (iversion lock) caps={4340=pAsXsFs/AsLsXsFsx@6},l=4340 | request=1 lock=1 dirfrag=1 caps=1 dirtyparent=1 dirty=1 waiter=0 authpin=1 0x58eb98db7180]
2026-02-24T13:55:14.161-0500 7ccbebce5640 7 mds.0.locker simple_sync on (ifile excl) on [inode 0x100000001fd [...2,head] /foo/a/ auth v70 ap=2 DIRTYPARENT f(v0 m2026-02-24T13:55:08.501155-0500) n(v0 rc2026-02-24T13:55:08.501155-0500 1=0+1) (isnap xlock x=1 by 0x58eb98e86d80) (inest lock) (ifile excl) (iversion lock) caps={4340=pAsXsFs/AsLsXsFsx@6},l=4340 | request=1 lock=1 dirfrag=1 caps=1 dirtyparent=1 dirty=1 waiter=0 authpin=1 0x58eb98db7180]
2026-02-24T13:55:14.161-0500 7ccbebce5640 7 mds.0.locker get_allowed_caps loner client.4340 allowed=pAsLsXsFscrl, xlocker allowed=pAsLsXsFscrl, others allowed=pAsLsXsFscrl on [inode 0x100000001fd [...2,head] /foo/a/ auth v70 ap=2 DIRTYPARENT f(v0 m2026-02-24T13:55:08.501155-0500) n(v0 rc2026-02-24T13:55:08.501155-0500 1=0+1) (isnap xlock x=1 by 0x58eb98e86d80) (inest lock) (iversion lock) caps={4340=pAsXsFs/AsLsXsFsx@6},l=4340 | request=1 lock=1 dirfrag=1 caps=1 dirtyparent=1 dirty=1 waiter=0 authpin=1 0x58eb98db7180]
2026-02-24T13:55:14.161-0500 7ccbebce5640 20 mds.0.locker client.4340 pending pAsXsFs allowed pAsLsXsFs wanted AsLsXsFsx
2026-02-24T13:55:14.161-0500 7ccbebce5640 7 mds.0.locker sending MClientCaps to client.4340 seq 7 new pending pAsLsXsFs was pAsXsFs
2026-02-24T13:55:14.161-0500 7ccbebce5640 20 mds.0.cache.ino(0x100000001fd) encode_cap_message pfile 0 pauth 0 plink 0 pxattr 0 mtime 2026-02-24T13:55:08.501155-0500 ctime 2026-02-24T13:55:08.501155-0500 change_attr 4
2026-02-24T13:55:14.161-0500 7ccbebce5640 10 mds.0.4 send_message_client_counted client.4340 seq 44 client_caps(grant ino 0x100000001fd 1 seq 7 caps=pAsLsXsFs dirty=- wanted=AsLsXsFsx follows 0 size 0/0 ts 1/18446744073709551615 mtime 2026-02-24T13:55:08.501155-0500 ctime 2026-02-24T13:55:08.501155-0500 change_attr 4) v12
2026-02-24T13:55:14.161-0500 7ccbebce5640 1 -- [v2:192.168.0.124:6868/1951124532,v1:192.168.0.124:6869/1951124532] --> 192.168.0.124:0/930444256 -- client_caps(grant ino 0x100000001fd 1 seq 7 caps=pAsLsXsFs dirty=- wanted=AsLsXsFsx follows 0 size 0/0 ts 1/18446744073709551615 mtime 2026-02-24T13:55:08.501155-0500 ctime 2026-02-24T13:55:08.501155-0500 change_attr 4) v12 -- 0x58eb98e98700 con 0x58eb98d69b00
2026-02-24T13:55:14.161-0500 7ccbebce5640 10 mds.0.locker got rdlock on (ifile sync r=1) [inode 0x100000001fd [...2,head] /foo/a/ auth v70 ap=2 DIRTYPARENT f(v0 m2026-02-24T13:55:08.501155-0500) n(v0 rc2026-02-24T13:55:08.501155-0500 1=0+1) (isnap xlock x=1 by 0x58eb98e86d80) (inest lock) (ifile sync r=1) (iversion lock) caps={4340=pAsLsXsFs/AsLsXsFsx@7},l=4340 | request=1 lock=2 dirfrag=1 caps=1 dirtyparent=1 dirty=1 waiter=0 authpin=1 0x58eb98db7180]
2026-02-24T13:55:14.161-0500 7ccbebce5640 7 mds.0.locker xlock_start on (ilink sync) on [inode 0x100000001fd [...2,head] /foo/a/ auth v70 ap=2 DIRTYPARENT f(v0 m2026-02-24T13:55:08.501155-0500) n(v0 rc2026-02-24T13:55:08.501155-0500 1=0+1) (isnap xlock x=1 by 0x58eb98e86d80) (inest lock) (ifile sync r=1) (iversion lock) caps={4340=pAsLsXsFs/AsLsXsFsx@7},l=4340 | request=1 lock=2 dirfrag=1 caps=1 dirtyparent=1 dirty=1 waiter=0 authpin=1 0x58eb98db7180]
2026-02-24T13:55:14.161-0500 7ccbebce5640 7 mds.0.locker simple_lock on (ilink sync) on [inode 0x100000001fd [...2,head] /foo/a/ auth v70 ap=2 DIRTYPARENT f(v0 m2026-02-24T13:55:08.501155-0500) n(v0 rc2026-02-24T13:55:08.501155-0500 1=0+1) (isnap xlock x=1 by 0x58eb98e86d80) (inest lock) (ifile sync r=1) (iversion lock) caps={4340=pAsLsXsFs/AsLsXsFsx@7},l=4340 | request=1 lock=2 dirfrag=1 caps=1 dirtyparent=1 dirty=1 waiter=0 authpin=1 0x58eb98db7180]
2026-02-24T13:55:14.161-0500 7ccbebce5640 7 mds.0.locker get_allowed_caps loner client.4340 allowed=pAsXsFscrl, xlocker allowed=pAsXsFscrl, others allowed=pAsXsFscrl on [inode 0x100000001fd [...2,head] /foo/a/ auth v70 ap=2 DIRTYPARENT f(v0 m2026-02-24T13:55:08.501155-0500) n(v0 rc2026-02-24T13:55:08.501155-0500 1=0+1) (ilink sync->lock) (isnap xlock x=1 by 0x58eb98e86d80) (inest lock) (ifile sync r=1) (iversion lock) caps={4340=pAsLsXsFs/AsLsXsFsx@7},l=4340 | request=1 lock=2 dirfrag=1 caps=1 dirtyparent=1 dirty=1 waiter=0 authpin=1 0x58eb98db7180]
2026-02-24T13:55:14.161-0500 7ccbebce5640 20 mds.0.locker client.4340 pending pAsLsXsFs allowed pAsXsFs wanted AsLsXsFsx
2026-02-24T13:55:14.161-0500 7ccbebce5640 7 mds.0.locker sending MClientCaps to client.4340 seq 8 new pending pAsXsFs was pAsLsXsFs
2026-02-24T13:55:14.161-0500 7ccbebce5640 20 mds.0.cache.ino(0x100000001fd) encode_cap_message pfile 0 pauth 0 plink 0 pxattr 0 mtime 2026-02-24T13:55:08.501155-0500 ctime 2026-02-24T13:55:08.501155-0500 change_attr 4
2026-02-24T13:55:14.161-0500 7ccbebce5640 10 mds.0.4 send_message_client_counted client.4340 seq 45 client_caps(revoke ino 0x100000001fd 1 seq 8 caps=pAsXsFs dirty=- wanted=AsLsXsFsx follows 0 size 0/0 ts 1/18446744073709551615 mtime 2026-02-24T13:55:08.501155-0500 ctime 2026-02-24T13:55:08.501155-0500 change_attr 4) v12
2026-02-24T13:55:14.161-0500 7ccbebce5640 1 -- [v2:192.168.0.124:6868/1951124532,v1:192.168.0.124:6869/1951124532] --> 192.168.0.124:0/930444256 -- client_caps(revoke ino 0x100000001fd 1 seq 8 caps=pAsXsFs dirty=- wanted=AsLsXsFsx follows 0 size 0/0 ts 1/18446744073709551615 mtime 2026-02-24T13:55:08.501155-0500 ctime 2026-02-24T13:55:08.501155-0500 change_attr 4) v12 -- 0x58eb98e73180 con 0x58eb98d69b00
2026-02-24T13:55:14.161-0500 7ccbebce5640 10 mds.0.cache.ino(0x100000001fd) auth_pin by 0x58eb98db7448 on [inode 0x100000001fd [...2,head] /foo/a/ auth v70 ap=3 DIRTYPARENT f(v0 m2026-02-24T13:55:08.501155-0500) n(v0 rc2026-02-24T13:55:08.501155-0500 1=0+1) (ilink sync->lock) (isnap xlock x=1 by 0x58eb98e86d80) (inest lock) (ifile sync r=1) (iversion lock) caps={4340=pAsXsFs/pAsLsXsFs/AsLsXsFsx@8},l=4340 | request=1 lock=2 dirfrag=1 caps=1 dirtyparent=1 dirty=1 waiter=0 authpin=1 0x58eb98db7180] now 3
2026-02-24T13:55:14.161-0500 7ccbebce5640 15 mds.0.cache.dir(0x10000000000) adjust_nested_auth_pins 1 on [dir 0x10000000000 /foo/ [2,head] auth v=71 cv=0/0 ap=0+5 state=1610874881|complete f(v0 m2026-02-24T13:55:02.448182-0500 1=0+1) n(v2 rc2026-02-24T13:55:08.501155-0500 1=0+1)/n(v2 rc2026-02-24T13:55:08.492580-0500 2=1+1) hs=1+0,ss=0+0 dirty=1 | child=1 dirty=1 authpin=0 0x58eb98d69f80] by 0x58eb98db7180 count now 0/5
2026-02-24T13:55:14.161-0500 7ccbebce5640 10 mds.0.cache.ino(0x100000001fd) add_waiter tag 2000000000600000 0x58eb98de87c0 !ambig 1 !frozen 1 !freezing 1
2026-02-24T13:55:14.161-0500 7ccbebce5640 15 mds.0.cache.ino(0x100000001fd) taking waiter here
2026-02-24T13:55:14.161-0500 7ccbebce5640 20 mds.0.locker nudge_log NO (ilink sync->lock) on [inode 0x100000001fd [...2,head] /foo/a/ auth v70 ap=3 DIRTYPARENT f(v0 m2026-02-24T13:55:08.501155-0500) n(v0 rc2026-02-24T13:55:08.501155-0500 1=0+1) (ilink sync->lock) (isnap xlock x=1 by 0x58eb98e86d80) (inest lock) (ifile sync r=1) (iversion lock) caps={4340=pAsXsFs/pAsLsXsFs/AsLsXsFsx@8},l=4340 | request=1 lock=2 dirfrag=1 caps=1 dirtyparent=1 dirty=1 waiter=1 authpin=1 0x58eb98db7180]
This bug essentially causes many operations operating on the same metadata to serialize with infrequent journal flushes on the MDS.
The cause is a bug in the backport
https://github.com/ceph/ceph/pull/64540
specifically the new SimpleLock::has_any_waiter method is incorrect for the older versions of wait bit sets (using a uint64_t).
This bug is causing delays in powercycle and fs suite in v18.2.8 QA (#73906). It also affects Squid/Tentacle which upgrade from reef.
Updated by Patrick Donnelly 24 days ago
- Related to Backport #72162: reef: mds: nudge log for unstable locks after early reply added
Updated by Patrick Donnelly 24 days ago
- Related to Bug #75143: mds: incorrect wait mask for SimpleLock::has_any_waiter added
Updated by Patrick Donnelly 24 days ago
- Status changed from New to Fix Under Review
Updated by Patrick Donnelly 4 days ago
- Status changed from Fix Under Review to Resolved
- Target version set to v18.2.8
Updated by Upkeep Bot 3 days ago
- Merge Commit set to ab47f43c099b2cbae6e21342fe673ce251da54d6
- Fixed In set to v18.2.7-1055-gab47f43c09
- Released In set to v18.2.8~2
- Upkeep Timestamp set to 2026-03-18T01:23:55+00:00
Actions