Bug #69624
libcephfs: ceph_ll_lookup() API crashes in choose_target_mds()
Status: Closed
Description
We started observing failures when running specific smbtorture test cases, as described in https://github.com/samba-in-kubernetes/sit-test-cases/issues/96#issuecomment-2597771310. In every run we identified a libcephfs crash while processing the ceph_ll_lookup() API. With the help of gdb we came up with a standalone reproducer (attached) that does not involve Samba.
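For context, here is a minimal sketch of what the attached reproducer does. The attached libcephfs_ll_lookup_crash.c is authoritative; the exact arguments, error handling, and the "volumes" component used here are assumptions reconstructed from the backtraces below. It needs a reachable Ceph cluster to run.

```c
/* Sketch of the attached reproducer (assumptions noted above).
 * Build: gcc -D_FILE_OFFSET_BITS=64 lookup_sketch.c -lcephfs -o lookup
 * Run:   ./lookup <client-id> <filesystem>  (against a live cluster) */
#include <stdio.h>
#include <cephfs/libcephfs.h>

int main(int argc, char **argv)
{
	struct ceph_mount_info *cmount = NULL;
	struct Inode *root = NULL, *slash = NULL, *out = NULL;
	struct ceph_statx stx;
	struct UserPerm *perms;

	if (argc != 3) {
		fprintf(stderr, "Usage: %s <client-id> <filesystem>\n", argv[0]);
		return 1;
	}

	ceph_create(&cmount, argv[1]);
	ceph_conf_read_file(cmount, NULL);	/* default ceph.conf search path */
	ceph_init(cmount);
	ceph_select_filesystem(cmount, argv[2]);
	if (ceph_mount(cmount, NULL) != 0) {
		fprintf(stderr, "mount failed\n");
		return 1;
	}
	perms = ceph_mount_perms(cmount);

	ceph_ll_lookup_root(cmount, &root);
	printf("root inode = %ld\n", (long)ceph_ll_get_inodeno(cmount, root));

	/* First lookup of "/" relative to root; the MDS sees the resulting
	 * "#0x1//" path as a snapdir reference (see analysis below). */
	ceph_ll_lookup(cmount, root, "/", &slash, &stx, CEPH_STATX_INO, 0, perms);
	printf("first lookup at / completed\n");

	/* Second lookup through that handle crashes in choose_target_mds(). */
	ceph_ll_lookup(cmount, slash, "volumes", &out, &stx, CEPH_STATX_INO, 0, perms);
	printf("second lookup completed\n");

	ceph_unmount(cmount);
	ceph_release(cmount);
	return 0;
}
```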
# gcc -D_FILE_OFFSET_BITS=64 -lcephfs libcephfs_ll_lookup_crash.c -o lookup
# ./lookup
Usage: ./lookup <client-id> <filesystem>
# ./lookup foo cephvol
root inode = 1
first lookup at / completed
Segmentation fault (core dumped)
# coredumpctl gdb 92491
. . .
Reading symbols from /root/lookup...
(No debugging symbols found in /root/lookup)
[New LWP 92491]
[New LWP 92510]
[New LWP 92492]
[New LWP 92502]
[New LWP 92494]
[New LWP 92503]
[New LWP 92505]
[New LWP 92506]
[New LWP 92504]
[New LWP 92507]
[New LWP 92508]
[New LWP 92509]
[New LWP 92493]
[New LWP 92495]
[New LWP 92496]
[New LWP 92497]
[New LWP 92501]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `./lookup foo cephvol'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 snapid_t::operator unsigned long (this=0x30) at /usr/src/debug/ceph-19.2.0-3.fc41.x86_64/src/include/object.h:118
118 operator uint64_t() const { return val; }
[Current thread is 1 (Thread 0x7f3fafe15780 (LWP 92491))]
(gdb) bt
#0 snapid_t::operator unsigned long (this=0x30) at /usr/src/debug/ceph-19.2.0-3.fc41.x86_64/src/include/object.h:118
#1 Client::choose_target_mds (this=this@entry=0x15e4ad0, req=req@entry=0x16848e0, phash_diri=phash_diri@entry=0x7fffa46265d8)
at /usr/src/debug/ceph-19.2.0-3.fc41.x86_64/src/client/Client.cc:1735
#2 0x00007f3fb1e7f4fd in Client::make_request (this=this@entry=0x15e4ad0, request=request@entry=0x16848e0, perms=..., ptarget=ptarget@entry=0x7fffa4626be8,
pcreated=pcreated@entry=0x0, use_mds=use_mds@entry=-1, pdirbl=0x0, feature_needed=18446744073709551615)
at /usr/src/debug/ceph-19.2.0-3.fc41.x86_64/src/client/Client.cc:2008
#3 0x00007f3fb1e80989 in Client::_do_lookup (this=this@entry=0x15e4ad0, dir=dir@entry=0x7f3f9800b990, name="volumes", mask=<optimized out>, mask@entry=0,
target=target@entry=0x7fffa4626be8, perms=...) at /usr/src/debug/ceph-19.2.0-3.fc41.x86_64/src/client/Client.cc:7285
#4 0x00007f3fb1e8d68e in Client::_lookup (this=this@entry=0x15e4ad0, dir=dir@entry=0x7f3f9800b990, dname="volumes", mask=0, mask@entry=1,
target=target@entry=0x7fffa4626be8, perms=..., alternate_name=0x0, is_rename=false) at /usr/src/debug/ceph-19.2.0-3.fc41.x86_64/src/client/Client.cc:7419
#5 0x00007f3fb1e8f01c in Client::ll_lookupx (this=0x15e4ad0, parent=0x7f3f9800b990, name=0x402557 "volumes", out=0x7fffa4626d90, stx=0x7fffa4626d00, want=256, flags=0,
perms=...) at /usr/src/debug/ceph-19.2.0-3.fc41.x86_64/src/client/Client.cc:13179
#6 0x000000000040150e in main ()
Updated by Shachar Sharon about 1 year ago · Edited
I was able to reproduce this failure on my setup using recent Ceph (commit: 3162e265) via smbtorture ('smb2.create.delete' test).
As far as I can tell from my analysis, the SIGSEGV happens in 'Client::choose_target_mds' when 'in' becomes NULL [1]:
[1] https://github.com/ceph/ceph/blob/main/src/client/Client.cc#L1750-L1757
if (in->snapid != CEPH_NOSNAP) {
  ldout(cct, 10) << __func__ << " " << *in << " is snapped, using nonsnap parent" << dendl;
  while (in->snapid != CEPH_NOSNAP) {
    if (in->snapid == CEPH_SNAPDIR)
      in = in->snapdir_parent.get();
    else if (!in->dentries.empty())
      /* In most cases there will only be one dentry, so getting it
       * will be the correct action. If there are multiple hard links,
       * I think the MDS should be able to redirect as needed */
      in = in->get_first_parent()->dir->parent_inode; // <-- in becomes NULL, causing SIGSEGV in the next while iteration
    else {
      ldout(cct, 10) << __func__ << "got unlinked inode, can't look at parent" << dendl;
      break;
    }
  }
}
Updated by Venky Shankar about 1 year ago · Edited
@ssharon - do you have the coredump?
EDIT: I see you have a reproducer attached, but it might be a bit quicker for me if there is a coredump at hand.
Updated by Shachar Sharon about 1 year ago
When we run smbd in devel mode and get a SIGSEGV, the process enters "halt" mode and we can run gdb -p <smbd-pid> --quiet --batch -ex "thread apply all bt".
In our case, the relevant backtrace is:
Thread 1 (Thread 0x7f2c01c63bc0 (LWP 2239) "smbd"):
#0 0x00007f2c06ad911f in wait4 () from /lib64/libc.so.6
#1 0x00007f2c06a4b4f3 in do_system () from /lib64/libc.so.6
#2 0x00007f2c06fbf220 in call_panic_action (why=0x7fffc39bdcd0 "Signal 11: Segmentation fault", as_root=false) at ../../source3/lib/util.c:713
#3 0x00007f2c06fbf373 in smb_panic_s3 (why=0x7fffc39bdcd0 "Signal 11: Segmentation fault") at ../../source3/lib/util.c:729
#4 0x00007f2c0727ea84 in smb_panic (why=0x7fffc39bdcd0 "Signal 11: Segmentation fault") at ../../lib/util/fault.c:209
#5 0x00007f2c0727e52d in fault_report (sig=11) at ../../lib/util/fault.c:83
#6 0x00007f2c0727e542 in sig_fault (sig=11) at ../../lib/util/fault.c:94
#7 <signal handler called>
#8 0x00007f2bf404d7a7 in snapid_t::operator unsigned long (this=0x30) at /usr/src/debug/ceph-19.3.0-6959.g3162e265.el9.x86_64/src/include/object.h:121
#9 Client::choose_target_mds (phash_diri=<synthetic pointer>, req=<optimized out>, this=0x55b787112570) at /usr/src/debug/ceph-19.3.0-6959.g3162e265.el9.x86_64/src/client/Client.cc:1750
#10 Client::make_request (this=0x55b787112570, request=<optimized out>, perms=..., ptarget=0x0, pcreated=0x0, use_mds=<optimized out>, pdirbl=0x0, feature_needed=18446744073709551615) at /usr/src/debug/ceph-19.3.0-6959.g3162e265.el9.x86_64/src/client/Client.cc:2026
#11 0x00007f2bf407f47c in Client::_getattr (this=0x55b787112570, in=0x7f2bb800ef90, mask=68, perms=..., force=<optimized out>) at /usr/src/debug/ceph-19.3.0-6959.g3162e265.el9.x86_64/src/client/Client.cc:8010
#12 0x00007f2bf407854d in Client::_getattr_for_perm (perms=..., in=0x7f2bb800ef90, this=0x55b787112570) at /usr/src/debug/ceph-19.3.0-6959.g3162e265.el9.x86_64/src/client/Client.cc:6267
#13 Client::may_lookup (this=0x55b787112570, dir=0x7f2bb800ef90, perms=...) at /usr/src/debug/ceph-19.3.0-6959.g3162e265.el9.x86_64/src/client/Client.cc:6158
#14 0x00007f2bf401b782 in Client::ll_lookupx (perms=..., flags=0, want=256, stx=0x7fffc39be910, out=0x7fffc39be908, name=0x55b7871b8ae0 ".", parent=0x7f2bb800ef90, this=0x55b787112570) at /usr/src/debug/ceph-19.3.0-6959.g3162e265.el9.x86_64/src/client/Client.cc:13289
#15 ceph_ll_lookup (cmount=<optimized out>, parent=0x7f2bb800ef90, name=0x55b7871b8ae0 ".", out=0x7fffc39be908, stx=0x7fffc39be910, want=256, flags=0, perms=0x55b786fa00b0) at /usr/src/debug/ceph-19.3.0-6959.g3162e265.el9.x86_64/src/libcephfs.cc:1885
#16 0x00007f2bf4187507 in vfs_ceph_ll_lookup (handle=0x55b786f9f9d0, parent=0x7fffc39bea10, name=0x55b7871b8ae0 ".", iref=0x55b786fce2c8) at ../../source3/modules/vfs_ceph_new.c:1120
#17 0x00007f2bf418ba0f in vfs_ceph_openat (handle=0x55b786f9f9d0, dirfsp=0x55b7870023e0, smb_fname=0x55b7871b89b0, fsp=0x55b7871bbc30, how=0x7fffc39bec50) at ../../source3/modules/vfs_ceph_new.c:2213
#18 0x00007f2c070c92b8 in smb_vfs_call_openat (handle=0x55b786f9f9d0, dirfsp=0x55b7870023e0, smb_fname=0x55b7871b89b0, fsp=0x55b7871bbc30, how=0x7fffc39bec50) at ../../source3/smbd/vfs.c:1603
#19 0x00007f2c07078332 in smb_vfs_openat_ci (mem_ctx=0x55b7871b89b0, case_sensitive=true, conn=0x55b786f65d20, dirfsp=0x55b7870023e0, smb_fname_rel=0x55b7871b89b0, fsp=0x55b7871bbc30, how=0x7fffc39bec50) at ../../source3/smbd/files.c:929
#20 0x00007f2c0707a579 in openat_pathref_fsp_lcomp (dirfsp=0x55b7870023e0, smb_fname_rel=0x55b7871b89b0, ucf_flags=8) at ../../source3/smbd/files.c:1577
#21 0x00007f2c070aa1f8 in filename_convert_dirfsp_nosymlink (mem_ctx=0x55b7871b8490, conn=0x55b786f65d20, basedir=0x55b7870023e0, name_in=0x55b7871ba651 "", ucf_flags=8, twrp=0, _dirfsp=0x7fffc39bef50, _smb_fname=0x7fffc39bef48, _smb_fname_rel=0x7fffc39bef40, _symlink_err=0x7fffc39beeb8) at ../../source3/smbd/filename.c:766
#22 0x00007f2c070aae6f in filename_convert_dirfsp_rel (mem_ctx=0x55b7871b8490, conn=0x55b786f65d20, basedir=0x55b7870023e0, name_in=0x55b7871ba651 "", ucf_flags=8, twrp=0, _dirfsp=0x7fffc39bef50, _smb_fname=0x7fffc39bef48, _smb_fname_rel=0x7fffc39bef40) at ../../source3/smbd/filename.c:1093
#23 0x00007f2c070aca25 in fd_openat (dirfsp=0x55b7870023e0, smb_fname=0x55b7871ba520, fsp=0x55b786fc7f10, _how=0x7fffc39bf0f0) at ../../source3/smbd/open.c:558
#24 0x00007f2c070834cf in OpenDir_from_pathref (mem_ctx=0x55b7871b8490, dirfsp=0x55b786f86570, mask=0x0, attr=0, _dir_hnd=0x7fffc39bf188) at ../../source3/smbd/dir.c:1100
#25 0x00007f2c070a9107 in get_real_filename_full_scan_at (dirfsp=0x55b786f86570, name=0x55b7871b87b0 "smb2_open", mangled=false, mem_ctx=0x55b7871b8490, found_name=0x7fffc39bf330) at ../../source3/smbd/filename.c:319
#26 0x00007f2c070a94f1 in get_real_filename_at (dirfsp=0x55b786f86570, name=0x55b7871b87b0 "smb2_open", mem_ctx=0x55b7871b8490, found_name=0x7fffc39bf330) at ../../source3/smbd/filename.c:404
#27 0x00007f2c07078712 in smb_vfs_openat_ci (mem_ctx=0x55b7871b8490, case_sensitive=false, conn=0x55b786f65d20, dirfsp=0x55b786f86570, smb_fname_rel=0x7fffc39bf330, fsp=0x55b78703b910, how=0x7fffc39bf300) at ../../source3/smbd/files.c:1002
#28 0x00007f2c070793f2 in openat_pathref_fsp_nosymlink (mem_ctx=0x55b786f723f0, conn=0x55b786f65d20, in_dirfsp=0x55b786f86570, path_in=0x55b7871b86d0 "smb2_open", twrp=0, posix=false, _smb_fname=0x7fffc39bf688, _symlink_err=0x7fffc39bf680) at ../../source3/smbd/files.c:1239
#29 0x00007f2c070aa311 in filename_convert_dirfsp_nosymlink (mem_ctx=0x55b786f723f0, conn=0x55b786f65d20, basedir=0x55b786f86570, name_in=0x55b7871bac10 "smb2_open/torture_open_for_delete.txt", ucf_flags=0, twrp=0, _dirfsp=0x7fffc39bfad0, _smb_fname=0x7fffc39bfac8, _smb_fname_rel=0x7fffc39bf7e8, _symlink_err=0x7fffc39bf758) at ../../source3/smbd/filename.c:778
#30 0x00007f2c070aae6f in filename_convert_dirfsp_rel (mem_ctx=0x55b786f723f0, conn=0x55b786f65d20, basedir=0x55b786f86570, name_in=0x55b7871bac10 "smb2_open/torture_open_for_delete.txt", ucf_flags=0, twrp=0, _dirfsp=0x7fffc39bfad0, _smb_fname=0x7fffc39bfac8, _smb_fname_rel=0x7fffc39bf7e8) at ../../source3/smbd/filename.c:1093
#31 0x00007f2c070ab1fb in filename_convert_dirfsp (ctx=0x55b786f723f0, conn=0x55b786f65d20, name_in=0x55b7871bac10 "smb2_open/torture_open_for_delete.txt", ucf_flags=0, twrp=0, _dirfsp=0x7fffc39bfad0, _smb_name=0x7fffc39bfac8) at ../../source3/smbd/filename.c:1200
#32 0x00007f2c07114303 in smbd_smb2_create_send (mem_ctx=0x55b786f723f0, ev=0x55b786f74940, smb2req=0x55b786f723f0, in_oplock_level=0 '\000', in_impersonation_level=0, in_desired_access=65536, in_file_attributes=0, in_share_access=7, in_create_disposition=1, _in_create_options=4160, in_name=0x55b787036560 "smb2_open\\torture_open_for_delete.txt", in_context_blobs=...) at ../../source3/smbd/smb2_create.c:1197
#33 0x00007f2c07111857 in smbd_smb2_request_process_create (smb2req=0x55b786f723f0) at ../../source3/smbd/smb2_create.c:295
#34 0x00007f2c0710085d in smbd_smb2_request_dispatch (req=0x55b786f723f0) at ../../source3/smbd/smb2_server.c:3521
#35 0x00007f2c071067a6 in smbd_smb2_advance_incoming (xconn=0x55b786f8f0e0, n=194) at ../../source3/smbd/smb2_server.c:5144
#36 0x00007f2c07106b71 in smbd_smb2_io_handler (xconn=0x55b786f8f0e0, fde_flags=1) at ../../source3/smbd/smb2_server.c:5259
#37 0x00007f2c07106c29 in smbd_smb2_connection_handler (ev=0x55b786f74940, fde=0x55b786f76660, flags=1, private_data=0x55b786f8f0e0) at ../../source3/smbd/smb2_server.c:5289
#38 0x00007f2c06c9d4a5 in tevent_common_invoke_fd_handler () from /lib64/libtevent.so.0
#39 0x00007f2c06ca155e in epoll_event_loop_once () from /lib64/libtevent.so.0
#40 0x00007f2c06c9882b in std_event_loop_once () from /lib64/libtevent.so.0
#41 0x00007f2c06c9a368 in _tevent_loop_once () from /lib64/libtevent.so.0
#42 0x00007f2c06c9a48b in tevent_common_loop_wait () from /lib64/libtevent.so.0
#43 0x00007f2c06c9889b in std_event_loop_wait () from /lib64/libtevent.so.0
#44 0x00007f2c070e4182 in smbd_process (ev_ctx=0x55b786f74940, msg_ctx=0x55b786f5f940, sock_fd=29, interactive=true) at ../../source3/smbd/smb2_process.c:2163
#45 0x000055b785183722 in smbd_accept_connection (ev=0x55b786f74940, fde=0x55b786f884b0, flags=1, private_data=0x55b786f86050) at ../../source3/smbd/server.c:986
#46 0x00007f2c06c9d4a5 in tevent_common_invoke_fd_handler () from /lib64/libtevent.so.0
#47 0x00007f2c06ca155e in epoll_event_loop_once () from /lib64/libtevent.so.0
#48 0x00007f2c06c9882b in std_event_loop_once () from /lib64/libtevent.so.0
#49 0x00007f2c06c9a368 in _tevent_loop_once () from /lib64/libtevent.so.0
#50 0x00007f2c06c9a48b in tevent_common_loop_wait () from /lib64/libtevent.so.0
#51 0x00007f2c06c9889b in std_event_loop_wait () from /lib64/libtevent.so.0
#52 0x000055b785184720 in smbd_parent_loop (ev_ctx=0x55b786f74940, parent=0x55b786f72b00) at ../../source3/smbd/server.c:1388
#53 0x000055b785187649 in main (argc=5, argv=0x7fffc39c08b8) at ../../source3/smbd/server.c:2360
See attached file for full backtraces.
Do you want me to create and upload a full coredump (could be a large file)?
Updated by Venky Shankar about 1 year ago
@ssharon I think it's the way the test code is written that's causing the client crash. When I ran the reproducer in my dev cluster (current HEAD), I see the following in the client log:
2025-01-30T09:54:13.921+0000 7f015c62f640 3 client.74136 ll_lookupx 0x1.head /
2025-01-30T09:54:13.921+0000 7f015c62f640 20 client.74136 may_lookup 0x1.head(faked_ino=0 nref=8 ll_ref=2 cap_refs={} open={} mode=41777 size=0/0 nlink=1 btime=2025-01-28T07:45:42.268444+0000 mtime=2025-01-30T09:54:13.903464+0000 ctime=2025-01-30T09:54:13.903464+0000 change_attr=11 caps=pAsLsXsFs(0=pAsLsXsFs) has_dir_layout 0x7f013400aca0); UserPerm(uid: 1145, gid: 1145)
2025-01-30T09:54:13.921+0000 7f015c62f640 10 client.74136 _getattr mask As issued=1
2025-01-30T09:54:13.921+0000 7f015c62f640 3 client.74136 may_lookup 0x7f013400aca0 = 0
2025-01-30T09:54:13.921+0000 7f015c62f640 10 client.74136 _do_lookup on #0x1//
2025-01-30T09:54:13.921+0000 7f015c62f640 20 client.74136 choose_target_mds starting with req->inode 0x1.head(faked_ino=0 nref=9 ll_ref=2 cap_refs={} open={} mode=41777 size=0/0 nlink=1 btime=2025-01-28T07:45:42.268444+0000 mtime=2025-01-30T09:54:13.903464+0000 ctime=2025-01-30T09:54:13.903464+0000 change_attr=11 caps=pAsLsXsFs(0=pAsLsXsFs) has_dir_layout 0x7f013400aca0)
2025-01-30T09:54:13.921+0000 7f015c62f640 20 client.74136 choose_target_mds inode dir hash is 2 on / => 830207683
2025-01-30T09:54:13.921+0000 7f015c62f640 20 client.74136 choose_target_mds 0x1.head(faked_ino=0 nref=9 ll_ref=2 cap_refs={} open={} mode=41777 size=0/0 nlink=1 btime=2025-01-28T07:45:42.268444+0000 mtime=2025-01-30T09:54:13.903464+0000 ctime=2025-01-30T09:54:13.903464+0000 change_attr=11 caps=pAsLsXsFs(0=pAsLsXsFs) has_dir_layout 0x7f013400aca0) is_hash=1 hash=830207683
2025-01-30T09:54:13.921+0000 7f015c62f640 10 client.74136 choose_target_mds from caps on inode 0x1.head(faked_ino=0 nref=9 ll_ref=2 cap_refs={} open={} mode=41777 size=0/0 nlink=1 btime=2025-01-28T07:45:42.268444+0000 mtime=2025-01-30T09:54:13.903464+0000 ctime=2025-01-30T09:54:13.903464+0000 change_attr=11 caps=pAsLsXsFs(0=pAsLsXsFs) has_dir_layout 0x7f013400aca0)
2025-01-30T09:54:13.921+0000 7f015c62f640 20 client.74136 mds is 0
2025-01-30T09:54:13.921+0000 7f015c62f640 10 client.74136 send_request rebuilding request 2 for mds.0
2025-01-30T09:54:13.921+0000 7f015c62f640 20 client.74136 encode_cap_releases enter (req: 0x561138652520, mds: 0)
2025-01-30T09:54:13.921+0000 7f015c62f640 20 client.74136 send_request set sent_stamp to 2025-01-30T09:54:13.922693+0000
2025-01-30T09:54:13.921+0000 7f015c62f640 10 client.74136 send_request client_request(unknown.0:2 lookup #0x1// 2025-01-30T09:54:13.922674+0000 caller_uid=1145, caller_gid=1145{}) to mds.0
Notice this log line
2025-01-30T09:54:13.921+0000 7f015c62f640 10 client.74136 _do_lookup on #0x1//
The // is interpreted as a lookup on snapdir (.snap) by the MDS (MDCache::path_traverse())[+]:
2025-01-30T09:54:13.921+0000 7f1d11d65640 4 mds.0.server handle_client_request client_request(client.74136:2 lookup #0x1// 2025-01-30T09:54:13.922674+0000 caller_uid=1145, caller_gid=1145{})
2025-01-30T09:54:13.921+0000 7f1d11d65640 20 mds.0.105 get_session have 0x55683833ed00 client.74136 172.21.10.1:0/2979710412 state open
2025-01-30T09:54:13.921+0000 7f1d11d65640 15 mds.0.server oldest_client_tid=2
2025-01-30T09:54:13.921+0000 7f1d11d65640 7 mds.0.cache request_start request(client.74136:2 nref=2 cr=0x55683d359c00)
2025-01-30T09:54:13.921+0000 7f1d11d65640 7 mds.0.server dispatch_client_request client_request(client.74136:2 lookup #0x1// 2025-01-30T09:54:13.922674+0000 caller_uid=1145, caller_gid=1145{})
2025-01-30T09:54:13.921+0000 7f1d11d65640 10 mds.0.server rdlock_path_pin_ref request(client.74136:2 nref=2 cr=0x55683d359c00) #0x1//
2025-01-30T09:54:13.921+0000 7f1d11d65640 7 mds.0.cache traverse: opening base ino 0x1 snap head path depth 1
2025-01-30T09:54:13.921+0000 7f1d11d65640 12 mds.0.cache traverse: path seg depth 0 '' snapid head
2025-01-30T09:54:13.921+0000 7f1d11d65640 10 mds.0.cache traverse: snapdir
2025-01-30T09:54:13.921+0000 7f1d11d65640 10 mds.0.cache path_traverse finish on snapid snapdir
2025-01-30T09:54:13.921+0000 7f1d11d65640 10 mds.0.server ref is [inode 0x1 [...2,head] / auth v42 snaprealm=0x55683d218900 f(v0 m2025-01-30T09:54:13.903464+0000 6=0+6) n(v18 rc2025-01-30T09:54:13.903464+0000 b6 19=3+16)/n(v0 rc2025-01-28T07:45:42.268444+0000 1=0+1) (inest lock) caps={74136=pAsLsXsFs/-@1} | request=0 lock=0 dirfrag=1 caps=1 openingsnapparents=0 dirty=1 authpin=0 0x55683837b180]
2025-01-30T09:54:13.921+0000 7f1d11d65640 10 mds.0.cache.ino(0x1) auth_pin by 0x55683d36de00 on [inode 0x1 [...2,head] / auth v42 ap=1 snaprealm=0x55683d218900 f(v0 m2025-01-30T09:54:13.903464+0000 6=0+6) n(v18 rc2025-01-30T09:54:13.903464+0000 b6 19=3+16)/n(v0 rc2025-01-28T07:45:42.268444+0000 1=0+1) (inest lock) caps={74136=pAsLsXsFs/-@1} | request=0 lock=0 dirfrag=1 caps=1 openingsnapparents=0 dirty=1 authpin=1 0x55683837b180] now 1
2025-01-30T09:54:13.921+0000 7f1d11d65640 10 mds.0.locker acquire_locks request(client.74136:2 nref=2 cr=0x55683d359c00)
2025-01-30T09:54:13.921+0000 7f1d11d65640 20 mds.0.locker lov = []
2025-01-30T09:54:13.921+0000 7f1d11d65640 20 mds.0.locker auth_pin_nonblocking=0
2025-01-30T09:54:13.921+0000 7f1d11d65640 20 Session check_access: [inode 0x1 [...2,head] / auth v42 ap=1 snaprealm=0x55683d218900 f(v0 m2025-01-30T09:54:13.903464+0000 6=0+6) n(v18 rc2025-01-30T09:54:13.903464+0000 b6 19=3+16)/n(v0 rc2025-01-28T07:45:42.268444+0000 1=0+1) (inest lock) caps={74136=pAsLsXsFs/-@1} | request=1 lock=0 dirfrag=1 caps=1 openingsnapparents=0 dirty=1 authpin=1 0x55683837b180] caller_uid=1145 caller_gid=1145 caller_gid_list=[]
2025-01-30T09:54:13.921+0000 7f1d11d65640 20 Session check_access path
2025-01-30T09:54:13.921+0000 7f1d11d65640 10 MDSAuthCap is_capable inode(path / owner 1145:1145 mode 041777) by caller 1145:1145 mask 1 new 0:0 cap: MDSAuthCaps[allow *]
2025-01-30T09:54:13.921+0000 7f1d11d65640 10 mds.0.server reply to stat on client_request(client.74136:2 lookup #0x1// 2025-01-30T09:54:13.922674+0000 caller_uid=1145, caller_gid=1145{})
This is followed by the MDS crashing with
ceph version 19.3.0-6837-g62e07cd5965 (62e07cd5965df0903a5137efb2a6573ee245f2f3) squid (dev)
 1: /lib64/libc.so.6(+0x3e730) [0x7f1d16a3e730]
 2: /lib64/libc.so.6(+0x8bbdc) [0x7f1d16a8bbdc]
 3: raise()
 4: abort()
 5: (boost::asio::detail::thread_info_base::~thread_info_base()+0) [0x556835d4da56]
 6: /home/vshankar/ceph/build/bin/ceph-mds(+0x3344cd) [0x556835eb44cd]
 7: (Server::handle_client_getattr(boost::intrusive_ptr<MDRequestImpl> const&, bool)+0x1875) [0x556835e99793]
 8: (Server::dispatch_client_request(boost::intrusive_ptr<MDRequestImpl> const&)+0xcea) [0x556835ea7056]
 9: (Server::handle_client_request(boost::intrusive_ptr<MClientRequest const> const&)+0xd16) [0x556835ea8182]
The interesting thing is that in your environment the client crashes, but the underlying reason is an invalid inode pointer:
while (in->snapid != CEPH_NOSNAP) {
if (in->snapid == CEPH_SNAPDIR) <-- this line
in = in->snapdir_parent.get();
Also, in the full backtrace that you shared, frame 15 is
#15 ceph_ll_lookup (cmount=<optimized out>, parent=0x7f2bb800ef90, name=0x55b7871b8ae0 ".", out=0x7fffc39be908, stx=0x7fffc39be910, want=256, flags=0, perms=0x55b786fa00b0) at /usr/src/debug/ceph-19.3.0-6959.g3162e265.el9.x86_64/src/libcephfs.cc:1885
name is '.' rather than the '/' that's used in the reproducer (where I see the MDS crashing). Also, just FYI, if I comment out the line that calls:
ceph_ll_lookup(cmount, root, "/", &slash, &stx, CEPH_STATX_INO, 0, perms);
and invoke the lookup of the next path component with root inode, the test passes.
[+]: https://github.com/ceph/ceph/blob/main/src/mds/MDCache.cc#L8397
Updated by Shachar Sharon about 1 year ago
The above backtrace is from when I executed libcephfs from within smbd and ran smbtorture tests (specifically, the 'smb2.create.delete' test).
However, running only the attached reproducer (libcephfs_ll_lookup_crash.c) on my setup gave the same SIGSEGV on the client side, without any MDS crash:
root inode = 1
first lookup at / completed
Thread 1 "libcephfs_ll_lo" received signal SIGSEGV, Segmentation fault.
Client::choose_target_mds (phash_diri=<synthetic pointer>, req=<optimized out>, this=0x58c330) at /usr/src/debug/ceph-19.3.0-6959.g3162e265.el9.x86_64/src/client/Client.cc:1750
1750 while (in->snapid != CEPH_NOSNAP) {
(gdb) bt
#0 Client::choose_target_mds (phash_diri=<synthetic pointer>, req=<optimized out>, this=0x58c330) at /usr/src/debug/ceph-19.3.0-6959.g3162e265.el9.x86_64/src/client/Client.cc:1750
#1 Client::make_request (this=0x58c330, request=<optimized out>, perms=..., ptarget=0x7fffffffdcd8, pcreated=0x0, use_mds=<optimized out>, pdirbl=0x0, feature_needed=18446744073709551615)
at /usr/src/debug/ceph-19.3.0-6959.g3162e265.el9.x86_64/src/client/Client.cc:2026
#2 0x00007ffff7ec0971 in Client::_do_lookup (this=this@entry=0x58c330, dir=dir@entry=0x7fffd800d520, name="volumes", mask=<optimized out>, mask@entry=0, target=target@entry=0x7fffffffdcd8, perms=...)
at /usr/src/debug/ceph-19.3.0-6959.g3162e265.el9.x86_64/src/client/Client.cc:7316
#3 0x00007ffff7ec51bb in Client::_lookup (this=0x58c330, dir=0x7fffd800d520, dname="volumes", mask=<optimized out>, target=0x7fffffffdcd8, perms=..., alternate_name=0x0, is_rename=false)
at /usr/src/debug/ceph-19.3.0-6959.g3162e265.el9.x86_64/src/client/Client.cc:7455
#4 0x00007ffff7e5d484 in Client::ll_lookupx (perms=..., flags=0, want=256, stx=0x7fffffffddd0, out=0x7fffffffde60, name=0x40211f "volumes", parent=0x7fffd800d520, this=0x58c330)
at /usr/src/debug/ceph-19.3.0-6959.g3162e265.el9.x86_64/src/client/Client.cc:13298
#5 ceph_ll_lookup (cmount=<optimized out>, parent=0x7fffd800d520, name=0x40211f "volumes", out=0x7fffffffde60, stx=0x7fffffffddd0, want=256, flags=0, perms=0x446d58) at /usr/src/debug/ceph-19.3.0-6959.g3162e265.el9.x86_64/src/libcephfs.cc:1885
#6 0x000000000040150e in main (argc=3, argv=0x7fffffffdfa8) at libcephfs_ll_lookup_crash.c:71
(gdb) info threads
Id Target Id Frame
* 1 Thread 0x7ffff6b1c540 (LWP 2370) "libcephfs_ll_lo" Client::choose_target_mds (phash_diri=<synthetic pointer>, req=<optimized out>, this=0x58c330) at /usr/src/debug/ceph-19.3.0-6959.g3162e265.el9.x86_64/src/client/Client.cc:1750
2 Thread 0x7ffff52df640 (LWP 2373) "log" __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x4d0a94) at futex-internal.c:57
3 Thread 0x7ffff4ade640 (LWP 2374) "io_context_pool" __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x4a4fec) at futex-internal.c:57
4 Thread 0x7ffff42dd640 (LWP 2375) "io_context_pool" __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x4a4fec) at futex-internal.c:57
5 Thread 0x7ffff3adc640 (LWP 2376) "msgr-worker-0" 0x00007ffff7b0e95e in epoll_wait (epfd=3, events=0x500420, maxevents=5000, timeout=30000) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
6 Thread 0x7ffff32db640 (LWP 2377) "msgr-worker-1" 0x00007ffff7b0e95e in epoll_wait (epfd=6, events=0x52c360, maxevents=5000, timeout=30000) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
7 Thread 0x7ffff2ada640 (LWP 2378) "msgr-worker-2" 0x00007ffff7b0e95e in epoll_wait (epfd=9, events=0x5582a0, maxevents=5000, timeout=30000) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
11 Thread 0x7ffff22d9640 (LWP 2382) "service" __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x7ffff22d7000, op=137, expected=0, futex_word=0x7fffe8004858) at futex-internal.c:57
12 Thread 0x7ffff1ad8640 (LWP 2383) "ceph_timer" __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x7ffff1ad5fe0, op=137, expected=0, futex_word=0x5887dc) at futex-internal.c:57
13 Thread 0x7ffff12d7640 (LWP 2384) "fn_anonymous" __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x58fc88) at futex-internal.c:57
14 Thread 0x7ffff5d67640 (LWP 2385) "safe_timer" __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x58c3b0) at futex-internal.c:57
15 Thread 0x7ffff0ad6640 (LWP 2386) "fn_anonymous" __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x58cd78) at futex-internal.c:57
16 Thread 0x7fffdffff640 (LWP 2387) "flusher" futex_wait (private=0, expected=2, futex_word=0x58c4b0) at ../sysdeps/nptl/futex-internal.h:146
17 Thread 0x7fffdf7fe640 (LWP 2388) "ms_dispatch" __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x586504) at futex-internal.c:57
18 Thread 0x7fffdeffd640 (LWP 2389) "ms_local" __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x586680) at futex-internal.c:57
19 Thread 0x7fffde7fc640 (LWP 2390) "safe_timer" __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x7fffde7f9fb0, op=137, expected=0, futex_word=0x4f63b0) at futex-internal.c:57
20 Thread 0x7fffddffb640 (LWP 2391) "libcephfs_ll_lo" futex_wait (private=0, expected=2, futex_word=0x58c4b0) at ../sysdeps/nptl/futex-internal.h:146
(gdb)
Updated by Anoop C S about 1 year ago
The // is interpreted as a lookup on snapdir (.snap) by the MDS
I just want to add that this holds within a subvolume as well, i.e., if I chdir to a subvolume before any ceph_ll_* calls in the reproducer, the result is the same. So, at least to avoid the crash, we may have to handle this special case in some graceful (and sensible) way in libcephfs; the rest can be dealt with from the application side, I guess.
Updated by Venky Shankar about 1 year ago
- Priority changed from Normal to Urgent
Venky Shankar wrote in #note-7:
I will have a look again. Thx!
I could not get to this today. I'll try to get to it at the earliest, or might find someone to continue on this.
Updated by Venky Shankar about 1 year ago
- Status changed from New to Triaged
- Assignee set to Venky Shankar
- Backport set to reef,squid
Updated by Venky Shankar about 1 year ago
I couldn't test out the fix today, and the current HEAD is failing monitor boot-up (unrelated issue). Will get this done once that gets fixed.
Updated by Venky Shankar about 1 year ago
- Status changed from Triaged to Fix Under Review
- Pull request ID set to 61838
Updated by Venky Shankar 10 months ago
- Status changed from Fix Under Review to Closed
Resolved by another change.