Bug #73037
libcephfs assertion failure when using ceph_ll_nonblocking_readv_writev for fsync
Description
While testing Samba (vfs_ceph_new) + libcephfs low-level API + fio we hit the following assertion failure:
(gdb) bt
#0  0x00007f1424eda17f in __GI___wait4 (pid=31343, stat_loc=stat_loc@entry=0x7fffcec42528, options=options@entry=0, usage=usage@entry=0x0) at ../sysdeps/unix/sysv/linux/wait4.c:30
#1  0x00007f1424eda0fb in __GI___waitpid (pid=<optimized out>, stat_loc=stat_loc@entry=0x7fffcec42528, options=options@entry=0) at waitpid.c:38
#2  0x00007f1424e4f483 in do_system (line=<optimized out>) at ../sysdeps/posix/system.c:171
#3  0x00007f1425775415 in call_panic_action (why=0x7fffcec42950 "Signal 6: Aborted", as_root=false) at ../../source3/lib/util.c:713
#4  0x00007f1425775568 in smb_panic_s3 (why=0x7fffcec42950 "Signal 6: Aborted") at ../../source3/lib/util.c:729
#5  0x00007f142512fa84 in smb_panic (why=0x7fffcec42950 "Signal 6: Aborted") at ../../lib/util/fault.c:209
#6  0x00007f142512f52d in fault_report (sig=6) at ../../lib/util/fault.c:83
#7  0x00007f142512f542 in sig_fault (sig=6) at ../../lib/util/fault.c:94
#8  <signal handler called>
#9  __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
#10 0x00007f1424e8d0a3 in __pthread_kill_internal (signo=6, threadid=<optimized out>) at pthread_kill.c:78
#11 0x00007f1424e3fb86 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#12 0x00007f1424e29873 in __GI_abort () at abort.c:79
#13 0x00007f14117a0eb0 in ceph::__ceph_assert_fail(char const*, char const*, int, char const*) () from /usr/lib64/ceph/libceph-common.so.2
#14 0x00007f1411832efb in Striper::file_to_extents(ceph::common::CephContext*, file_layout_t const*, unsigned long, unsigned long, unsigned long, unsigned long, boost::container::small_vector<striper::LightweightObjectExtent, 4ul, void, void>*) [clone .cold] () from /usr/lib64/ceph/libceph-common.so.2
#15 0x00007f1411c6ebf8 in Striper::file_to_extents(ceph::common::CephContext*, char const*, file_layout_t const*, unsigned long, unsigned long, unsigned long, std::vector<ObjectExtent, std::allocator<ObjectExtent> >&, unsigned long) () from /usr/lib64/ceph/libceph-common.so.2
#16 0x00007f14122d7817 in Client::_flush_range(Inode*, long, unsigned long) () from /lib64/libcephfs.so.2
#17 0x00007f14123031e0 in Client::_read(Fh*, long, unsigned long, ceph::buffer::v15_2_0::list*, Context*) () from /lib64/libcephfs.so.2
#18 0x00007f141236621e in Client::_preadv_pwritev_locked(Fh*, iovec const*, int, long, bool, bool, Context*, ceph::buffer::v15_2_0::list*, bool, bool) [clone .constprop.0] () from /lib64/libcephfs.so.2
#19 0x00007f141228de26 in ceph_ll_nonblocking_readv_writev () from /lib64/libcephfs.so.2
#20 0x00007f14124a5d53 in vfs_ceph_ll_nonblocking_readv_writev (handle=0x562da99746e0, cfh=0x562da9b6e700, io_info=0x562db90bf588) at ../../source3/modules/vfs_ceph_new.c:1952
#21 0x00007f14124a8a57 in vfs_ceph_aio_submit (handle=0x562da99746e0, req=0x562db90bf380, ev=0x562da99456c0) at ../../source3/modules/vfs_ceph_new.c:2590
#22 0x00007f14124aad12 in vfs_ceph_fsync_send (handle=0x562da99746e0, mem_ctx=0x562db04a2860, ev=0x562da99456c0, fsp=0x562da9987820) at ../../source3/modules/vfs_ceph_new.c:3027
#23 0x00007f14254d0003 in smb_vfs_call_fsync_send (handle=0x562da99746e0, mem_ctx=0x562db04b2ad0, ev=0x562da99456c0, fsp=0x562da9987820) at ../../source3/smbd/vfs.c:1848
#24 0x00007f1425521ebd in smbd_smb2_flush_send (mem_ctx=0x562db04b24b0, ev=0x562da99456c0, smb2req=0x562db04b24b0, fsp=0x562da9987820) at ../../source3/smbd/smb2_flush.c:196
#25 0x00007f142552192c in smbd_smb2_request_process_flush (req=0x562db04b24b0) at ../../source3/smbd/smb2_flush.c:61
#26 0x00007f1425507450 in smbd_smb2_request_dispatch (req=0x562db04b24b0) at ../../source3/smbd/smb2_server.c:3539
#27 0x00007f142550e30a in smbd_smb2_advance_incoming (xconn=0x562da9948760, n=88) at ../../source3/smbd/smb2_server.c:5191
#28 0x00007f142550e6d5 in smbd_smb2_io_handler (xconn=0x562da9948760, fde_flags=1) at ../../source3/smbd/smb2_server.c:5306
#29 0x00007f142550e78d in smbd_smb2_connection_handler (ev=0x562da99456c0, fde=0x562da995a360, flags=1, private_data=0x562da9948760) at ../../source3/smbd/smb2_server.c:5336
#30 0x00007f1425391af3 in tevent_common_invoke_fd_handler (fde=0x562da995a360, flags=1, removed=0x0) at ../../lib/tevent/tevent_fd.c:174
#31 0x00007f142539e1a6 in epoll_event_loop (epoll_ev=0x562da9950450, tvalp=0x7fffcec44ad0) at ../../lib/tevent/tevent_epoll.c:699
#32 0x00007f142539e893 in epoll_event_loop_once (ev=0x562da99456c0, location=0x7f142560ce30 "../../source3/smbd/smb2_process.c:2178") at ../../lib/tevent/tevent_epoll.c:929
#33 0x00007f1425399ef3 in std_event_loop_once (ev=0x562da99456c0, location=0x7f142560ce30 "../../source3/smbd/smb2_process.c:2178") at ../../lib/tevent/tevent_standard.c:110
#34 0x00007f1425390492 in _tevent_loop_once (ev=0x562da99456c0, location=0x7f142560ce30 "../../source3/smbd/smb2_process.c:2178") at ../../lib/tevent/tevent.c:860
#35 0x00007f14253907d7 in tevent_common_loop_wait (ev=0x562da99456c0, location=0x7f142560ce30 "../../source3/smbd/smb2_process.c:2178") at ../../lib/tevent/tevent.c:989
#36 0x00007f1425399f98 in std_event_loop_wait (ev=0x562da99456c0, location=0x7f142560ce30 "../../source3/smbd/smb2_process.c:2178") at ../../lib/tevent/tevent_standard.c:141
#37 0x00007f142539087a in _tevent_loop_wait (ev=0x562da99456c0, location=0x7f142560ce30 "../../source3/smbd/smb2_process.c:2178") at ../../lib/tevent/tevent.c:1008
#38 0x00007f14254e9ec5 in smbd_process (ev_ctx=0x562da99456c0, msg_ctx=0x562da9930510, sock_fd=32, interactive=false, transport_type=SMB_TRANSPORT_TYPE_TCP) at ../../source3/smbd/smb2_process.c:2178
#39 0x0000562da863c106 in smbd_accept_connection (ev=0x562da99456c0, fde=0x562da9958ec0, flags=1, private_data=0x562da994b8d0) at ../../source3/smbd/server.c:1159
#40 0x00007f1425391af3 in tevent_common_invoke_fd_handler (fde=0x562da9958ec0, flags=1, removed=0x0) at ../../lib/tevent/tevent_fd.c:174
#41 0x00007f142539e1a6 in epoll_event_loop (epoll_ev=0x562da9945900, tvalp=0x7fffcec44f40) at ../../lib/tevent/tevent_epoll.c:699
#42 0x00007f142539e893 in epoll_event_loop_once (ev=0x562da99456c0, location=0x562da8642c08 "../../source3/smbd/server.c:1556") at ../../lib/tevent/tevent_epoll.c:929
#43 0x00007f1425399ef3 in std_event_loop_once (ev=0x562da99456c0, location=0x562da8642c08 "../../source3/smbd/server.c:1556") at ../../lib/tevent/tevent_standard.c:110
#44 0x00007f1425390492 in _tevent_loop_once (ev=0x562da99456c0, location=0x562da8642c08 "../../source3/smbd/server.c:1556") at ../../lib/tevent/tevent.c:860
#45 0x00007f14253907d7 in tevent_common_loop_wait (ev=0x562da99456c0, location=0x562da8642c08 "../../source3/smbd/server.c:1556") at ../../lib/tevent/tevent.c:989
#46 0x00007f1425399f98 in std_event_loop_wait (ev=0x562da99456c0, location=0x562da8642c08 "../../source3/smbd/server.c:1556") at ../../lib/tevent/tevent_standard.c:141
#47 0x00007f142539087a in _tevent_loop_wait (ev=0x562da99456c0, location=0x562da8642c08 "../../source3/smbd/server.c:1556") at ../../lib/tevent/tevent.c:1008
#48 0x0000562da863ceac in smbd_parent_loop (ev_ctx=0x562da99456c0, parent=0x562da9934b10) at ../../source3/smbd/server.c:1556
#49 0x0000562da8640264 in main (argc=3, argv=0x7fffcec45588) at ../../source3/smbd/
Currently, Samba is using ceph_ll_nonblocking_readv_writev for async-fsync operations (instead of the newly introduced ceph_ll_nonblocking_fsync), with zero length. This, in turn, causes libcephfs to crash; it should instead return some kind of error code.
Note that the same fio run works fine with older versions of libcephfs.
Updated by Anoop C S 6 months ago
To add some clarity (after an offline discussion):
It appears that the issue is reproducible with squid (19.2.3). After some thought I realized that ceph_ll_nonblocking_readv_writev() is not present in earlier versions (<= reef). Shachar couldn't reproduce with reef simply because the API was not available there, and Samba likely used the fallback, ceph_ll_fsync().
The bottom line is that the fsync issue may have been present ever since ceph_ll_nonblocking_readv_writev() was introduced in squid. It can be treated as an issue independent of the newly introduced ceph_ll_nonblocking_fsync() API.
Updated by Venky Shankar 6 months ago
- Category set to Correctness/Safety
- Status changed from New to Triaged
- Assignee set to Dhairya Parmar
- Target version set to v21.0.0
- Source set to other
- Backport set to tentacle
- Severity changed from 3 - minor to 2 - major
- Labels (FS) crash added
Dhairya - please take this one.
Updated by Venky Shankar 6 months ago
- Description updated (diff)
I think this is with the main branch and a side effect of the nonblocking fsync PR (to add tests) that is tripping things off.
Updated by Anoop C S 6 months ago
Venky Shankar wrote in #note-3:
I think this is with the main branch and a side effect of the nonblocking fsync PR (to add tests) that is tripping things off.
Maybe not, because as I indicated earlier in #note-1, the issue is also seen with squid, where the nonblocking fsync PR changes are not yet available.
Updated by Venky Shankar 6 months ago
Anoop C S wrote in #note-4:
Venky Shankar wrote in #note-3:
I think this is with the main branch and a side effect of the nonblocking fsync PR (to add tests) that is tripping things off.
Maybe not, because as I indicated earlier in #note-1, the issue is also seen with squid, where the nonblocking fsync PR changes are not yet available.
Noted. I didn't see that updated comment before posting. Thanks for clarifying.
Updated by Dhairya Parmar 6 months ago
@ssharon, so fio fails when run with zero-length buffers but passes with non-zero-length buffers?
Updated by Shachar Sharon 6 months ago
Dhairya Parmar wrote in #note-6:
@ssharon, so fio fails when run with zero-length buffers but passes with non-zero-length buffers?
Fio does not use zero-length buffers; instead, Samba's vfs_ceph_new uses ceph_ll_nonblocking_readv_writev for fsync operations (with zero length). This could be due to confusion about how to do async fsync (before the ceph_ll_nonblocking_fsync API existed).
In other words: when a user calls ceph_ll_nonblocking_readv_writev with ceph_ll_io_info.fsync = true and ceph_ll_io_info.iovcnt = 1 but with zero length, we get an assertion failure deep within libcephfs. Assuming this is not a valid setting, it should be rejected at the API level rather than causing a panic.
Updated by Venky Shankar 6 months ago
Spoke to Dhairya regarding this; the bug is a missing check for a 0-byte read length when using O_RSYNC, which ends up calling into Striper, which expects length > 0.
Updated by Dhairya Parmar 6 months ago
- Status changed from Triaged to Fix Under Review
- Pull request ID set to 65656