mds: mark the scrub passed if dirfrag is dirty #57953
Conversation
Scheduled a QA run with fs:workload, multi-MDS, with the dynamic balancer or ephemeral random pinning enabled.

I still see the error after refreshing the PR with the fix for remote_dirfrag dirtiness. I will investigate further and update the PR.
@vshankar @leonid-s-usov This particular scrub failure is with the fix. Both the directory inode and the corresponding dirfrag are not dirty, yet the rstat is mismatched, so something else is going on and needs to be figured out. The test YAML config is:

fs:workload/{0-centos_9.stream begin/{0-install 1-cephadm 2-logrotate 3-modules} clusters/1a11s-mds-1c-client-3node conf/{client mds mgr mon osd} mount/kclient/{base/{mount-syntax/{v1} mount overrides/{distro/stock/{centos_9.stream k-stock} ms-die-on-skipped}} ms_mode/crc wsync/yes} objectstore-ec/bluestore-comp-ec-root omap_limit/10000 overrides/{cephsqlite-timeout frag ignorelist_health ignorelist_wrongly_marked_down osd-asserts pg_health session_timeout} ranks/multi/{balancer/random export-check n/3 replication/default} standby-replay tasks/{0-subvolume/{with-namespace-isolated} 1-check-counter 2-scrub/yes 3-snaps/yes 4-flush/yes 5-quiesce/with-quiesce 6-workunit/fs/misc}}
The scrub error in the cluster log is "2024-06-13T09:00:29.607433+0000 mds.e (mds.2) 711 : cluster [WRN] Scrub error on inode 0x10000012f02 (/volumes/qa/sv_1/257f5e61-1d1a-4ee3-8dfe-78182c700535/client.0/tmp/payload.2/multiple_rsync_payload.228227/.build-id/17) see mds.e log and `damage ls` output for details".

Related mds.e.log (inode auth of the directory). We can see from the logs below that the inode is not dirty. ...
2024-06-13T09:00:29.602+0000 7f4b55282640 10 mds.2.scrubstack handle_scrub mds_scrub(queue_dir_ack 0x10000012f02 fragset_t(*) d0e8b015-afbe-4362-93f9-ce15cda58bb8) from mds.0
2024-06-13T09:00:29.602+0000 7f4b55282640 20 mds.2.scrubstack kick_off_scrubs: state=RUNNING
2024-06-13T09:00:29.602+0000 7f4b55282640 20 mds.2.scrubstack kick_off_scrubs entering with 4 in progress and 7992 in the stack
2024-06-13T09:00:29.602+0000 7f4b55282640 20 mds.2.scrubstack kick_off_scrubs examining [inode 0x10000012f02 [...2a,head] /volumes/qa/sv_1/257f5e61-1d1a-4ee3-8dfe-78182c700535/client.0/tmp/payload.2/multiple_rsync_payload.228227/.build-id/17/ auth{0=1} v952 RANDEPHEMERALPIN f(v1 m2024-06-13T08:55:10.381452+0000 8=8+0) n(v1 rc2024-06-13T08:55:16.501331+0000 b347 9=8+1) (inest mix) mcw={0=AsLsXsFs} | dirtyscattered=0 request=0 lock=0 importingcaps=0 dirfrag=1 caps=0 stickydirs=0 dirtyparent=0 dirwaiter=0 replicated=1 dirty=0 waiter=0 authpin=0 discoverbase=0 scrubqueue=1 randepin 0x5564d8188100]
2024-06-13T09:00:29.602+0000 7f4b55282640 10 mds.2.scrubstack scrub_dir_inode [inode 0x10000012f02 [...2a,head] /volumes/qa/sv_1/257f5e61-1d1a-4ee3-8dfe-78182c700535/client.0/tmp/payload.2/multiple_rsync_payload.228227/.build-id/17/ auth{0=1} v952 RANDEPHEMERALPIN f(v1 m2024-06-13T08:55:10.381452+0000 8=8+0) n(v1 rc2024-06-13T08:55:16.501331+0000 b347 9=8+1) (inest mix) mcw={0=AsLsXsFs} | dirtyscattered=0 request=0 lock=0 importingcaps=0 dirfrag=1 caps=0 stickydirs=0 dirtyparent=0 dirwaiter=0 replicated=1 dirty=0 waiter=0 authpin=0 discoverbase=0 scrubqueue=1 randepin 0x5564d8188100]
2024-06-13T09:00:29.602+0000 7f4b55282640 20 mds.2.scrubstack scrub_dir_inode recursive mode, frags [*]
2024-06-13T09:00:29.602+0000 7f4b55282640 20 mds.2.scrubstack scrub_dir_inode_final [inode 0x10000012f02 [...2a,head] /volumes/qa/sv_1/257f5e61-1d1a-4ee3-8dfe-78182c700535/client.0/tmp/payload.2/multiple_rsync_payload.228227/.build-id/17/ auth{0=1} v952 RANDEPHEMERALPIN f(v1 m2024-06-13T08:55:10.381452+0000 8=8+0) n(v1 rc2024-06-13T08:55:16.501331+0000 b347 9=8+1) (inest mix) mcw={0=AsLsXsFs} | dirtyscattered=0 request=0 lock=0 importingcaps=0 dirfrag=1 caps=0 stickydirs=0 dirtyparent=0 dirwaiter=0 replicated=1 dirty=0 waiter=0 authpin=0 discoverbase=0 scrubqueue=1 randepin 0x5564d8188100]
2024-06-13T09:00:29.602+0000 7f4b55282640 10 mds.2.cache.ino(0x10000012f02) scrub starting validate_disk_state on [inode 0x10000012f02 [...2a,head] /volumes/qa/sv_1/257f5e61-1d1a-4ee3-8dfe-78182c700535/client.0/tmp/payload.2/multiple_rsync_payload.228227/.build-id/17/ auth{0=1} v952 RANDEPHEMERALPIN f(v1 m2024-06-13T08:55:10.381452+0000 8=8+0) n(v1 rc2024-06-13T08:55:16.501331+0000 b347 9=8+1) (inest mix) mcw={0=AsLsXsFs} | dirtyscattered=0 request=0 lock=0 importingcaps=0 dirfrag=1 caps=0 stickydirs=0 dirtyparent=0 dirwaiter=0 replicated=1 dirty=0 waiter=0 authpin=0 discoverbase=0 scrubqueue=1 randepin 0x5564d8188100]
....
....
2024-06-13T09:00:29.602+0000 7f4b55282640 10 mds.2.scrubstack scrub_dir_inode done
2024-06-13T09:00:29.602+0000 7f4b55282640 20 mds.2.scrubstack kick_off_scrubs dir inode, done
2024-06-13T09:00:29.602+0000 7f4b55282640 20 mds.2.scrubstack dequeue [inode 0x10000012f02 [...2a,head] /volumes/qa/sv_1/257f5e61-1d1a-4ee3-8dfe-78182c700535/client.0/tmp/payload.2/multiple_rsync_payload.228227/.build-id/17/ auth{0=1} v952 ap=1 RANDEPHEMERALPIN f(v1 m2024-06-13T08:55:10.381452+0000 8=8+0) n(v1 rc2024-06-13T08:55:16.501331+0000 b347 9=8+1) (inest mix) mcw={0=AsLsXsFs} | dirtyscattered=0 request=0 lock=0 importingcaps=0 dirfrag=1 caps=0 stickydirs=0 dirtyparent=0 dirwaiter=0 replicated=1 dirty=0 waiter=0 authpin=1 discoverbase=0 scrubqueue=1 randepin 0x5564d8188100] from ScrubStack
...
...
2024-06-13T09:00:29.604+0000 7f4b4ea75640 20 mds.2.cache.ino(0x10000012f02) ondisk_read_retval: 0
2024-06-13T09:00:29.604+0000 7f4b4ea75640 10 mds.2.cache.ino(0x10000012f02) decoded 399 bytes of backtrace successfully
2024-06-13T09:00:29.604+0000 7f4b4ea75640 10 mds.2.cache.ino(0x10000012f02) scrub: inotable ino = 0x10000012f02
2024-06-13T09:00:29.604+0000 7f4b4ea75640 10 mds.2.cache.ino(0x10000012f02) scrub: inotable free says 0
2024-06-13T09:00:29.604+0000 7f4b4ea75640 7 mds.2.cache request_start_internal request(mds.2:94019 nref=2) op 5383
...
...
2024-06-13T09:00:29.604+0000 7f4b4ea75640 10 mds.2.cache rdlock_dirfrags_stats_work [inode 0x10000012f02 [...2a,head] /volumes/qa/sv_1/257f5e61-1d1a-4ee3-8dfe-78182c700535/client.0/tmp/payload.2/multiple_rsync_payload.228227/.build-id/17/ auth{0=1} v952 ap=2 RANDEPHEMERALPIN f(v1 m2024-06-13T08:55:10.381452+0000 8=8+0) n(v1 rc2024-06-13T08:55:16.501331+0000 b347 9=8+1) (inest mix) mcw={0=AsLsXsFs} | dirtyscattered=0 request=0 lock=0 importingcaps=0 dirfrag=1 caps=0 stickydirs=0 dirtyparent=0 dirwaiter=0 replicated=1 dirty=0 waiter=0 authpin=1 discoverbase=0 scrubqueue=0 randepin 0x5564d8188100]
....
....
2024-06-13T09:00:29.606+0000 7f4b55282640 7 mds.2.locker handle_file_lock a=syncack on (inest mix->sync g=0) from mds.0 [inode 0x10000012f02 [...2a,head] /volumes/qa/sv_1/257f5e61-1d1a-4ee3-8dfe-78182c700535/client.0/tmp/payload.2/multiple_rsync_payload.228227/.build-id/17/ auth{0=1} v952 ap=3 RANDEPHEMERALPIN f(v1 m2024-06-13T08:55:10.381452+0000 8=8+0) n(v1 rc2024-06-13T08:55:16.501331+0000 b347 9=8+1) (inest mix->sync g=0) (ifile sync r=1) mcw={0=AsLsXsFs} | dirtyscattered=0 request=0 lock=1 importingcaps=0 dirfrag=1 caps=0 stickydirs=0 dirtyparent=0 dirwaiter=0 replicated=1 dirty=0 waiter=1 authpin=1 discoverbase=0 scrubqueue=0 randepin 0x5564d8188100]
2024-06-13T09:00:29.606+0000 7f4b55282640 10 mds.2.cache.ino(0x10000012f02) decode_lock_inest * [2,head]
2024-06-13T09:00:29.606+0000 7f4b55282640 10 mds.2.cache.ino(0x10000012f02) decode_lock_inest * rstat n(v1 rc2024-06-13T08:55:10.383452+0000 b347 8=8+0)
2024-06-13T09:00:29.606+0000 7f4b55282640 10 mds.2.cache.ino(0x10000012f02) decode_lock_inest * accounted_rstat n(v1 rc2024-06-13T08:55:10.383452+0000 b347 8=8+0)
2024-06-13T09:00:29.606+0000 7f4b55282640 10 mds.2.cache.ino(0x10000012f02) decode_lock_inest * dirty_old_rstat {}
2024-06-13T09:00:29.606+0000 7f4b55282640 10 mds.2.cache.ino(0x10000012f02) * first 2 -> 2 on [dir 0x10000012f02 /volumes/qa/sv_1/257f5e61-1d1a-4ee3-8dfe-78182c700535/client.0/tmp/payload.2/multiple_rsync_payload.228227/.build-id/17/ [2,head] rep@0.1 dir_auth=0 state=0 f(v0 m2024-06-13T08:55:10.381452+0000 8=8+0)/f(v0 m2024-06-13T08:55:04.955560+0000 3=3+0) n(v1 rc2024-06-13T08:55:10.383452+0000 b347 8=8+0) hs=0+0,ss=0+0 | subtree=1 importbound=0 sticky=0 0x5564d789e900]
....
....
2024-06-13T09:00:29.606+0000 7f4b55282640 0 log_channel(cluster) log [WRN] : Scrub error on inode 0x10000012f02 (/volumes/qa/sv_1/257f5e61-1d1a-4ee3-8dfe-78182c700535/client.0/tmp/payload.2/multiple_rsync_payload.228227/.build-id/17) see mds.e log and `damage ls` output for details
2024-06-13T09:00:29.606+0000 7f4b55282640 -1 mds.2.scrubstack _validate_inode_done scrub error on inode [inode 0x10000012f02 [...2a,head] /volumes/qa/sv_1/257f5e61-1d1a-4ee3-8dfe-78182c700535/client.0/tmp/payload.2/multiple_rsync_payload.228227/.build-id/17/ auth{0=1} v952 ap=2 RANDEPHEMERALPIN f(v1 m2024-06-13T08:55:10.381452+0000 8=8+0) n(v1 rc2024-06-13T08:55:16.501331+0000 b347 9=8+1) (inest sync->mix g=0) mcw={0=AsLsXsFs} | dirtyscattered=0 request=0 lock=0 importingcaps=0 dirfrag=1 caps=0 stickydirs=0 dirtyparent=0 dirwaiter=0 replicated=1 dirty=0 waiter=0 authpin=1 discoverbase=0 scrubqueue=0 randepin 0x5564d8188100]: {"performed_validation":true,"passed_validation":false,"backtrace":{"checked":true,"passed":true,"read_ret_val":0,"ondisk_value":"(2)0x10000012f02:[<0x10000012ec8/17 v952>,<0x10000012ec6/.build-id v6429>,<0x10000012ec5/multiple_rsync_payload.228227 v748>,<0x1000000c579/payload.2 v714>,<0x10000000005/tmp v1831>,<0x100000001fb/client.0 v1617>,<0x100000001fa/257f5e61-1d1a-4ee3-8dfe-78182c700535 v1572>,<0x10000000001/sv_1 v1540>,<0x10000000000/qa v1503>,<0x1/volumes v1450>]//[]","memoryvalue":"(2)0x10000012f02:[<0x10000012ec8/17 v952>,<0x10000012ec6/.build-id v6429>,<0x10000012ec5/multiple_rsync_payload.228227 v849>,<0x1000000c579/payload.2 v736>,<0x10000000005/tmp v1873>,<0x100000001fb/client.0 v1655>,<0x100000001fa/257f5e61-1d1a-4ee3-8dfe-78182c700535 v1612>,<0x10000000001/sv_1 v1576>,<0x10000000000/qa v1539>,<0x1/volumes v1486>]//[]","error_str":""},"raw_stats":{"checked":true,"passed":false,"read_ret_val":0,"ondisk_value.dirstat":"f(v0 m2024-06-13T08:55:04.955560+0000 3=3+0)","ondisk_value.rstat":"n(v0 rc2024-06-13T08:55:10.383452+0000 b347 9=8+1)","memory_value.dirstat":"f(v1 m2024-06-13T08:55:10.381452+0000 8=8+0)","memory_value.rstat":"n(v1 rc2024-06-13T08:55:16.501331+0000 b347 9=8+1)","error_str":"freshly-calculated rstats don't match existing ones"},"return_code":0}
2024-06-13T09:00:29.606+0000 7f4b55282640 20 mds.2.cache.ino(0x10000012f02) scrub_finished
.....
.....
Related mds.b.log (dirfrag auth of the directory). We can see from the logs below that the dirfrag is not dirty.

2024-06-13T09:00:29.601+0000 7f2aa23d8640 10 mds.0.scrubstack handle_scrub mds_scrub(queue_dir 0x10000012f02 fragset_t(*) d0e8b015-afbe-4362-93f9-ce15cda58bb8 force recursive) from mds.2
2024-06-13T09:00:29.601+0000 7f2aa23d8640 10 mds.0.scrubstack _enqueue with {[dir 0x10000012f02 /volumes/qa/sv_1/257f5e61-1d1a-4ee3-8dfe-78182c700535/client.0/tmp/payload.2/multiple_rsync_payload.228227/.build-id/17/ [2,head] auth{2=1} v=35 cv=35/35 dir_auth=0 state=1074266113|complete|auxsubtree f(v1 m2024-06-13T08:55:10.381452+0000 8=8+0) n(v1 rc2024-06-13T08:55:10.383452+0000 b347 8=8+0) hs=8+0,ss=0+0 | child=1 subtree=1 exportbound=0 replicated=1 dirty=0 authpin=0 scrubqueue=0 0x55884cd10d00]}, top=1
2024-06-13T09:00:29.601+0000 7f2aa23d8640 10 mds.0.cache.dir(0x10000012f02) auth_pin by 0x55883619f600 on [dir 0x10000012f02 /volumes/qa/sv_1/257f5e61-1d1a-4ee3-8dfe-78182c700535/client.0/tmp/payload.2/multiple_rsync_payload.228227/.build-id/17/ [2,head] auth{2=1} v=35 cv=35/35 dir_auth=0 ap=1+0 state=1074266113|complete|auxsubtree f(v1 m2024-06-13T08:55:10.381452+0000 8=8+0) n(v1 rc2024-06-13T08:55:10.383452+0000 b347 8=8+0) hs=8+0,ss=0+0 | child=1 subtree=1 exportbound=0 replicated=1 dirty=0 authpin=1 scrubqueue=0 0x55884cd10d00] count now 1
2024-06-13T09:00:29.601+0000 7f2aa23d8640 20 mds.0.scrubstack enqueue [dir 0x10000012f02 /volumes/qa/sv_1/257f5e61-1d1a-4ee3-8dfe-78182c700535/client.0/tmp/payload.2/multiple_rsync_payload.228227/.build-id/17/ [2,head] auth{2=1} v=35 cv=35/35 dir_auth=0 ap=1+0 state=1074266113|complete|auxsubtree f(v1 m2024-06-13T08:55:10.381452+0000 8=8+0) n(v1 rc2024-06-13T08:55:10.383452+0000 b347 8=8+0) hs=8+0,ss=0+0 | child=1 subtree=1 exportbound=0 replicated=1 dirty=0 authpin=1 scrubqueue=0 0x55884cd10d00] to top of ScrubStack
2024-06-13T09:00:29.601+0000 7f2aa23d8640 20 mds.0.scrubstack kick_off_scrubs: state=RUNNING
2024-06-13T09:00:29.601+0000 7f2aa23d8640 20 mds.0.scrubstack kick_off_scrubs entering with 5 in progress and 8306 in the stack
2024-06-13T09:00:29.601+0000 7f2aa23d8640 1 -- [v2:172.21.15.123:6838/2791669125,v1:172.21.15.123:6839/2791669125] send_to--> mds [v2:172.21.15.123:6834/2496736045,v1:172.21.15.123:6835/2496736045] -- mds_scrub(queue_dir_ack 0x10000012f02 fragset_t(*) d0e8b015-afbe-4362-93f9-ce15cda58bb8) -- ?+0 0x558844fcae00
2024-06-13T09:00:29.601+0000 7f2aa23d8640 1 -- [v2:172.21.15.123:6838/2791669125,v1:172.21.15.123:6839/2791669125] --> [v2:172.21.15.123:6834/2496736045,v1:172.21.15.123:6835/2496736045] -- mds_scrub(queue_dir_ack 0x10000012f02 fragset_t(*) d0e8b015-afbe-4362-93f9-ce15cda58bb8) -- 0x558844fcae00 con 0x558836451c00
2024-06-13T09:00:29.601+0000 7f2aa23d8640 1 -- [v2:172.21.15.123:6838/2791669125,v1:172.21.15.123:6839/2791669125] <== mds.1 v2:172.21.15.123:6836/3089364143 11418 ==== mds_scrub(queue_dir 0x2000000052c fragset_t(*) d0e8b015-afbe-4362-93f9-ce15cda58bb8 force recursive) ==== 72+0+0 (crc 0 0 0) 0x55883c410800 con 0x558836451800
leonid-s-usov left a comment
This is a valid approach, but there should be a way to avoid adding a new argument to so many methods. Please consider using the CInode::validated_data structure to store and access the information about the remote scrub ack.
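For illustration, a minimal self-contained sketch of what this suggestion might look like. `validated_data` below is a stand-in for the real `CInode::validated_data` (whose `performed_validation`/`passed_validation` members appear in the scrub JSON above); the `remote_dirfrag_dirty` member and both helper functions are hypothetical names invented for this sketch, not Ceph's actual code:

```cpp
// Hedged sketch only. The idea: rather than threading a new bool
// parameter through many methods, record the remote scrub-ack state once
// on the per-inode validation result and read it where raw stats are
// checked.
struct validated_data {                 // stand-in for CInode::validated_data
  bool performed_validation = false;    // real members, per the scrub JSON
  bool passed_validation = false;
  bool remote_dirfrag_dirty = false;    // hypothetical new member
};

// Hypothetical: the queue_dir_ack handler records the flag once...
void on_remote_scrub_ack(validated_data& results, bool dirty_from_ack) {
  results.remote_dirfrag_dirty = dirty_from_ack;
}

// ...and the raw-stats validation consults it with no extra parameters.
bool raw_stats_should_pass(const validated_data& results, bool stats_match,
                           bool local_dirfrag_dirty) {
  return stats_match || local_dirfrag_dirty || results.remote_dirfrag_dirty;
}
```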
@kotreshhr mentioned that this change doesn't fix the issue. Some explanation is in https://tracker.ceph.com/issues/65020#note-34

I understand this differently: there is another issue that this change isn't supposed to fix. Kotresh states that in that new issue both the inode and the dirfrag are clean, so it's out of the scope of the original problem. The fix will be different, too.
This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
@kotreshhr I remember this change fixing one part of the issue. Did we RCA the other part (incorrect rfiles)?

There are two separate issues. I think we can take this and work on the other separately.

ACK.
@kotreshhr I think this PR needs to be updated as per comment https://tracker.ceph.com/issues/65020#note-36, yes?

@rishabh-d-dave PTAL once @kotreshhr pushes an update.

Sure.
This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.

@kotreshhr Would it be possible to push an update this week?
This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved.

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.

@kotreshhr please rebase.

jenkins test make check arm64
@vshankar Rebased and simplified the commit abf2800. Sample run with fs:workload with scrub/yes: https://pulpito.ceph.com/khiremat-2025-05-16_05:33:04-fs:workload-wip-khiremat-57953-scrub-error-2-distro-default-smithi/

Grepping the teuthology logs did show the code is exercised:

[khiremat@vossi04 khiremat-2025-05-16_05:33:04-fs:workload-wip-khiremat-57953-scrub-error-2-distro-default-smithi]$ find . | grep mds.*.log.gz | xargs zgrep "raw stats most likely wont match since"
@rishabh-d-dave PTAL.

Rebased and addressed an issue: the scrubber was not using the dirfrag dirty flag sent in the scrub ack message.
Force-pushed from ddafd5c to c665b5e.
The in-memory and on-disk stats might not match on a directory inode if any of the local or remote dirfrags is dirty. So don't fail the scrub and mark it as passed if the local or remote dirfrag is dirty.

Fixes: https://tracker.ceph.com/issues/65020
Signed-off-by: Kotresh HR <khiremat@redhat.com>
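As a rough, self-contained illustration of the rule this commit message describes (all type and function names below are stand-ins invented for this sketch, not Ceph's actual MDS types):

```cpp
#include <iostream>
#include <vector>

// Stand-in type for this sketch; the real logic lives in the MDS
// scrub/validation path (CInode, ScrubStack), not in this struct.
struct DirFrag {
  bool dirty = false;  // local dirfrag dirty flag
};

// If freshly calculated rstats disagree with the stored ones but a local
// dirfrag is dirty (or a remote dirfrag auth reported dirtiness in its
// scrub ack), the mismatch is expected -- the dirty frag simply has not
// been propagated yet -- so the scrub is marked passed instead of
// recording damage.
bool raw_stats_passed(bool stats_match,
                      const std::vector<DirFrag>& frags,
                      bool remote_dirfrag_dirty) {
  if (stats_match)
    return true;
  bool any_dirty = remote_dirfrag_dirty;
  for (const auto& f : frags)
    any_dirty = any_dirty || f.dirty;
  if (any_dirty) {
    // In the spirit of the log line grepped for in the QA run above.
    std::cout << "raw stats most likely wont match since a dirfrag is dirty;"
                 " marking scrub passed\n";
    return true;
  }
  return false;  // genuine mismatch: report a scrub error
}

int main() {
  std::vector<DirFrag> frags{{/*dirty=*/true}};
  std::cout << raw_stats_passed(false, frags, false) << "\n";  // prints 1
}
```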
jenkins make test arm64

This PR is under test in https://tracker.ceph.com/issues/72073.
Leonid is no longer working on the Ceph project.
jenkins test make check arm64
The in-memory and on-disk stats might not match on a directory inode if any of the dirfrags is dirty. So don't fail the scrub and mark it as passed if the dirfrag is dirty.
Fixes: https://tracker.ceph.com/issues/65020