Bug #56288
Status: Closed
crash: Client::_readdir_cache_cb(dir_result_t*, int (*)(void*, dirent*, ceph_statx*, long, Inode*), void*, int, bool)
Description
Sanitized backtrace:
Client::_readdir_cache_cb(dir_result_t*, int (*)(void*, dirent*, ceph_statx*, long, Inode*), void*, int, bool)
Client::readdir_r_cb(dir_result_t*, int (*)(void*, dirent*, ceph_statx*, long, Inode*), void*, unsigned int, unsigned int, bool)
Crash dump sample:
{
"backtrace": [
"__kernel_rt_sigreturn()",
"(Client::_readdir_cache_cb(dir_result_t*, int (*)(void*, dirent*, ceph_statx*, long, Inode*), void*, int, bool)+0x37c) [0xaaaaca9854b4]",
"(Client::readdir_r_cb(dir_result_t*, int (*)(void*, dirent*, ceph_statx*, long, Inode*), void*, unsigned int, unsigned int, bool)+0x9f0) [0xaaaaca9865d0]",
"ceph-fuse(+0x9f37c) [0xaaaaca93737c]",
"/lib/aarch64-linux-gnu/libfuse.so.2(+0x14068) [0xffffba485068]",
"/lib/aarch64-linux-gnu/libfuse.so.2(+0x15064) [0xffffba486064]",
"/lib/aarch64-linux-gnu/libfuse.so.2(+0x12158) [0xffffba483158]",
"/lib/aarch64-linux-gnu/libpthread.so.0(+0x751c) [0xffffb997b51c]",
"/lib/aarch64-linux-gnu/libc.so.6(+0xd122c) [0xffffb96b522c]"
],
"ceph_version": "17.2.0",
"crash_id": "2022-05-06T09:38:25.862101Z_96a4e7ca-adf6-4a56-acd2-0731016b50ca",
"entity_name": "client.779720a56c617f4713d46a0389a5f0b5c78d2903",
"os_id": "ubuntu",
"os_name": "Ubuntu",
"os_version": "20.04.4 LTS (Focal Fossa)",
"os_version_id": "20.04",
"process_name": "ceph-fuse",
"stack_sig": "915ad90d349e6a333f04e0504a80cafc8ba5777680640de0597c7b0275ea1dc4",
"timestamp": "2022-05-06T09:38:25.862101Z",
"utsname_machine": "aarch64",
"utsname_release": "5.4.0-1059-raspi",
"utsname_sysname": "Linux",
"utsname_version": "#67-Ubuntu SMP PREEMPT Mon Apr 11 14:16:01 UTC 2022"
}
Updated by Telemetry Bot over 3 years ago
Updated by Venky Shankar over 3 years ago
- Category set to Correctness/Safety
- Status changed from New to Triaged
- Assignee set to Venky Shankar
- Target version set to v18.0.0
- Backport set to pacific,quincy
- Component(FS) Client added
- Labels (FS) crash added
Updated by Milind Changire over 2 years ago
Updated by Venky Shankar over 2 years ago
Milind Changire wrote:
Thanks for the logs. I'll have a look.
Updated by Venky Shankar over 2 years ago
- Backport changed from pacific,quincy to reef,quincy,pacific
Updated by Milind Changire over 2 years ago
Venky,
The upstream user has also sent across debug (level 20) logs for ceph-fuse as well as mds.
Unfortunately, the size is more than 1000 KB, so they can't be attached to the tracker.
Let me know when you need them.
Updated by Venky Shankar over 2 years ago
Milind Changire wrote:
Venky,
The upstream user has also sent across debug (level 20) logs for ceph-fuse as well as mds.
Unfortunately, the size is more than 1000 KB, so they can't be attached to the tracker.
Let me know when you need them.
Would help. Please upload using ceph-post-file.
Updated by lei liu about 2 years ago
We recently encountered a similar issue, may I ask if there is a solution?
Updated by Venky Shankar about 2 years ago
lei liu wrote:
We recently encountered a similar issue, may I ask if there is a solution?
Which version was this hit in? 17.2.*?
Updated by Venky Shankar about 2 years ago
Venky Shankar wrote:
lei liu wrote:
We recently encountered a similar issue, may I ask if there is a solution?
Which version was this hit in? 17.2.*?
Is there a coredump that can be shared?
Updated by Venky Shankar about 2 years ago
lei liu wrote:
We recently encountered a similar issue, may I ask if there is a solution?
For now, restart ceph-fuse and that should get you past this.
Updated by Venky Shankar about 2 years ago
I think this happens when there are concurrent lookups and deletes under a directory. _readdir_cache_cb() has code like
int idx = pd - dir->readdir_cache.begin();
if (dn->inode->is_dir() && cct->_conf->client_dirsize_rbytes) {
mask |= CEPH_STAT_RSTAT;
}
int r = _getattr(dn->inode, mask, dirp->perms);
if (r < 0)
return r;
// the content of readdir_cache may change after _getattr(), so pd may be invalid iterator
pd = dir->readdir_cache.begin() + idx;
if (pd >= dir->readdir_cache.end() || *pd != dn)
return -CEPHFS_EAGAIN;
... to handle the cache getting modified when the client_lock gets dropped in _getattr() (via add_update_inode()). I'll reproduce this and see what's going on.
If you have a core to share, it'll speed up debugging. Thanks!
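The save-index/revalidate pattern quoted in the comment above can be illustrated with a standalone sketch (hypothetical names, single-threaded simulation of the concurrent trim): the iterator's position is saved as a plain index before the lock-dropping call, then the iterator is rebuilt and re-checked afterwards.

```cpp
#include <cassert>
#include <vector>

// Hypothetical sketch of the revalidate pattern in _readdir_cache_cb:
// before an operation that may drop client_lock, the iterator position is
// saved as a plain index; afterwards the iterator is rebuilt and verified
// to still refer to the element we held, otherwise the caller must retry.
int revalidate(std::vector<int>& cache, int idx, int expected, bool trimmed) {
    if (trimmed)                      // simulate another thread trimming the
        cache.erase(cache.begin());   // cache while the lock was dropped
    auto pd = cache.begin() + idx;    // rebuild the iterator from the index
    if (pd >= cache.end() || *pd != expected)
        return -1;                    // stand-in for -CEPHFS_EAGAIN: retry
    return *pd;
}
```

If nothing changed while the lock was dropped, the rebuilt iterator still points at the expected element; if the cache was trimmed, the check fires and the caller falls back to a fresh readdir instead of touching a stale entry.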
Updated by lei liu about 2 years ago
Venky Shankar wrote:
lei liu wrote:
We recently encountered a similar issue, may I ask if there is a solution?
Which version was this hit in? 17.2.*?
version: ceph version 15.2.17 (8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus (stable);
For some reason, we did not save the core file. Currently, we only have some dmesg information: ganesha.nfsd[24416]: segfault at 90 ip 00007f79f8731227 sp 00007f78037f24d0 error 4 in libcephfs.so.2.0.0[7f79f86d3000+e2000]. It seems to point to https://github.com/ceph/ceph/blob/v15.2.17/src/client/Client.cc#L8171. In our environment, this issue has occurred three times in the past three weeks, but I have not yet found a way to reproduce it.
Updated by Venky Shankar about 2 years ago
lei liu wrote:
Venky Shankar wrote:
lei liu wrote:
We recently encountered a similar issue, may I ask if there is a solution?
Which version was this hit in? 17.2.*?
version: ceph version 15.2.17 (8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus (stable);
For some reason, we did not save the core file. Currently, we only have some dmesg information: ganesha.nfsd[24416]: segfault at 90 ip 00007f79f8731227 sp 00007f78037f24d0 error 4 in libcephfs.so.2.0.0[7f79f86d3000+e2000]. It seems to point to https://github.com/ceph/ceph/blob/v15.2.17/src/client/Client.cc#L8171. In our environment, this issue has occurred three times in the past three weeks, but I have not yet found a way to reproduce it.
OK, yeah. I suspected that part (mentioned in note-14).
Updated by Venky Shankar almost 2 years ago
I haven't been able to reproduce this with the main branch. If possible, please collect a ceph-mds coredump and attach it to this tracker.
Updated by Venky Shankar almost 2 years ago
So, for some reason this part of the code
if (pd >= dir->readdir_cache.end() || *pd != dn)
return -CEPHFS_EAGAIN;
especially dereferencing `pd`, is causing the crash. I don't have a core to validate that, but none of the other checks look likely to be causing this.
Updated by lei liu almost 2 years ago
Venky Shankar wrote in #note-18:
So, for some reason this part of the code
[...]
especially dereferencing `pd`, is causing the crash. I don't have a core to validate that, but none of the other checks look likely to be causing this.
thank you, I will send you the core file when it happens again.
Updated by tod chen almost 2 years ago · Edited
Maybe I found how to reproduce this case and how to fix it.
the coredump stack is:
Core was generated by `/usr/bin/ganesha.nfsd -L /jfsexports/logs/ganesha/ganesha.log -f /etc/ganesha/g'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00007f246b27a227 in Client::_readdir_cache_cb (this=<optimized out>, dirp=0x7f23f5849f80, cb=0x7f246b232ac0 <_readdir_single_dirent_cb(void*, dirent*, ceph_statx*, off_t, Inode*)>, p=0x7f24661f1730,
caps=341, getref=true) at /usr/include/c++/8/bits/stl_iterator.h:783
783 /usr/include/c++/8/bits/stl_iterator.h: No such file or directory.
[Current thread is 1 (Thread 0x7f24661f5700 (LWP 1338076))]
(gdb) bt
#0 0x00007f246b27a227 in Client::_readdir_cache_cb (this=<optimized out>, dirp=0x7f23f5849f80, cb=0x7f246b232ac0 <_readdir_single_dirent_cb(void*, dirent*, ceph_statx*, off_t, Inode*)>, p=0x7f24661f1730,
caps=341, getref=true) at /usr/include/c++/8/bits/stl_iterator.h:783
#1 0x00007f246b27bd84 in Client::readdir_r_cb (this=0x7f247694f000, d=<optimized out>, cb=0x7f246b232ac0 <_readdir_single_dirent_cb(void*, dirent*, ceph_statx*, off_t, Inode*)>, p=0x7f24661f1730,
want=<optimized out>, flags=<optimized out>, getref=true) at ./src/client/Client.cc:8315
#2 0x00007f246b27c39d in Client::readdirplus_r (this=<optimized out>, d=<optimized out>, de=de@entry=0x7f24661f1960, stx=stx@entry=0x7f24661f17d0, want=want@entry=5871, flags=flags@entry=0,
out=0x7f24661f17b8) at ./src/client/Client.cc:8488
#3 0x00007f246b229b58 in ceph_readdirplus_r (cmount=<optimized out>, dirp=<optimized out>, de=de@entry=0x7f24661f1960, stx=stx@entry=0x7f24661f17d0, want=want@entry=5871, flags=flags@entry=0,
out=0x7f24661f17b8) at ./src/libcephfs.cc:643
#4 0x00007f246b356b5f in fsal_ceph_readdirplus (dir=<optimized out>, cred=<optimized out>, out=0x7f24661f17b8, flags=0, want=5871, stx=0x7f24661f17d0, de=0x7f24661f1960, dirp=<optimized out>,
cmount=<optimized out>) at ./src/FSAL/FSAL_CEPH/statx_compat.h:315
#5 ceph_fsal_readdir (dir_pub=<optimized out>, whence=<optimized out>, dir_state=0x7f24661f1b00, cb=0x7f2477e7e090 <mdc_readdir_chunked_cb>, attrmask=9223372036906278894, eof=0x7f24661f1c1f)
at ./src/FSAL/FSAL_CEPH/handle.c:258
#6 0x00007f2477e786b9 in mdcache_populate_dir_chunk (directory=0x7f23c694b000, whence=<optimized out>, dirent=0x7f24661f1c20, prev_chunk=<optimized out>, eod_met=0x7f24661f1c1f)
at ./src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c:2685
#7 0x00007f2477e79c93 in mdcache_readdir_chunked (directory=0x7f23c694b000, whence=0, dir_state=0x7f24661f1d90, cb=0x7f2477d78d10 <populate_dirent>, attrmask=122830, eod_met=0x7f24661f1e7b)
at ./src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c:3049
#8 0x00007f2477d7a9bf in fsal_readdir (directory=directory@entry=0x7f23c694b038, cookie=<optimized out>, cookie@entry=0, nbfound=nbfound@entry=0x7f24661f1e7c, eod_met=eod_met@entry=0x7f24661f1e7b,
attrmask=122830, cb=cb@entry=0x7f2477e38190 <nfs4_readdir_callback>, opaque=0x7f24661f1f80) at ./src/FSAL/fsal_helper.c:1394
#9 0x00007f2477e39543 in nfs4_op_readdir (op=0x7f1c4d43e120, data=0x7f23e72e5f00, resp=0x7f2438b86340) at ./src/Protocols/NFS/nfs4_op_readdir.c:666
#10 0x00007f2477e20b26 in process_one_op (data=data@entry=0x7f23e72e5f00, status=status@entry=0x7f24661f214c) at ./src/Protocols/NFS/nfs4_Compound.c:918
#11 0x00007f2477e21f48 in nfs4_Compound (arg=<optimized out>, req=0x7f2438a0f000, res=<optimized out>) at ./src/Protocols/NFS/nfs4_Compound.c:1383
#12 0x00007f2477d9b005 in nfs_rpc_process_request (reqdata=0x7f2438a0f000, retry=<optimized out>) at ./src/MainNFSD/nfs_worker_thread.c:1507
#13 0x00007f2477d1d5ab in svc_request (xprt=0x7f23bbdaf400, xdrs=0x7f233a0d1a80) at ./src/svc_rqst.c:1209
#14 0x00007f2477d1a8b1 in svc_rqst_xprt_task_recv (wpe=<optimized out>) at ./src/svc_rqst.c:1183
#15 0x00007f2477d1b298 in svc_rqst_epoll_loop (wpe=0x7f2476956600) at ./src/svc_rqst.c:1571
#16 0x00007f2477d26748 in work_pool_thread (arg=0x7f24396000a0) at ./src/work_pool.c:183
#17 0x00007f24775d2fa3 in start_thread (arg=<optimized out>) at pthread_create.c:486
#18 0x00007f247732c06f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
(gdb) f 0
#0 0x00007f246b27a227 in Client::_readdir_cache_cb (this=<optimized out>, dirp=0x7f23f5849f80, cb=0x7f246b232ac0 <_readdir_single_dirent_cb(void*, dirent*, ceph_statx*, off_t, Inode*)>, p=0x7f24661f1730,
caps=341, getref=true) at /usr/include/c++/8/bits/stl_iterator.h:783
783 in /usr/include/c++/8/bits/stl_iterator.h
(gdb) p dirp->inode
$1 = {px = 0x7f1cda4fa600}
(gdb) p ((Inode *)0x7f1cda4fa600)->dir
$2 = (Dir *) 0x0
I think the process that caused the problem is as follows:
1. Thread A attempts to traverse the directory via readdir, and the subitems in this directory are already complete (I_COMPLETE|I_DIR_ORDERED). It therefore reaches _readdir_cache_cb and calls _getattr to send a request to the MDS, then releases client_lock while waiting for the response. (Under heavy load the MDS may be slow to respond.)
2. The timer thread (tick()) tries to trim_cache every second, so it may progress through
trim_cache -> trim_dentry -> unlink -> close_dir
This releases all dentries in the directory and frees the memory that dirp->inode->dir points to:
void Client::close_dir(Dir *dir)
{
Inode *in = dir->parent_inode;
ldout(cct, 15) << __func__ << " dir " << dir << " on " << in << dendl;
ceph_assert(dir->is_empty());
ceph_assert(in->dir == dir);
ceph_assert(in->dentries.size() < 2); // dirs can't be hard-linked
if (!in->dentries.empty())
in->get_first_parent()->put(); // unpin dentry
delete in->dir;
in->dir = 0;
put_inode(in); // unpin inode
}
3. So when _getattr returns in _readdir_cache_cb (thread A), the memory that dir points to has been freed.
How to fix it:
int r = _getattr(dn->inode, mask, dirp->perms);
if (r < 0)
return r;
+ if (dirp->inode->dir == NULL) {
+ ldout(cct, 0) << " dir is closed, so we should return" << dendl;
+ return -EAGAIN;
+ }
// the content of readdir_cache may change after _getattr(), so pd may be invalid iterator
pd = dir->readdir_cache.begin() + idx;
if (pd >= dir->readdir_cache.end() || *pd != dn)
return -EAGAIN;
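The guard the patch above adds can be modeled with a minimal standalone sketch (hypothetical types and names, reduced to the relevant effect): close_dir() deletes in->dir and nulls the pointer, so checking dirp->inode->dir after the lock is re-acquired catches the closed directory before the freed Dir is touched.

```cpp
#include <cassert>

// Hypothetical reduced model of the use-after-free and its guard: while
// thread A waits in _getattr() with client_lock dropped, the tick thread
// may run close_dir(), which deletes in->dir and nulls the pointer.
struct Dir { int entry_count = 0; };
struct Inode { Dir* dir = nullptr; };

constexpr int RC_EAGAIN = -11;  // stand-in for -EAGAIN

// What close_dir() does to the inode, reduced to the relevant effect.
void close_dir(Inode& in) {
    delete in.dir;
    in.dir = nullptr;
}

// The post-_getattr check from the patch: bail out with EAGAIN if the
// directory was closed while the lock was dropped, instead of touching
// the freed Dir.
int after_getattr(const Inode& in) {
    if (in.dir == nullptr)
        return RC_EAGAIN;       // caller restarts the readdir
    return in.dir->entry_count;
}
```

Returning EAGAIN here mirrors the existing `*pd != dn` check a few lines below: both fall back to re-reading the directory rather than trusting state that may have changed while client_lock was dropped.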
Updated by Venky Shankar almost 2 years ago
tod chen wrote in #note-20:
Maybe I found how to reproduce this case and how to fix it.
the coredump stack is:
[...]
I think the process that caused the problem is as follows:
1. Thread A attempts to traverse the directory via readdir, and the subitems in this directory are already complete (I_COMPLETE|I_DIR_ORDERED). It therefore reaches _readdir_cache_cb and calls _getattr to send a request to the MDS, then releases client_lock while waiting for the response. (Under heavy load the MDS may be slow to respond.)
2. The timer thread (tick()) tries to trim_cache every second, so it may progress through
trim_cache -> trim_dentry -> unlink -> close_dir
This releases all dentries in the directory and frees the memory that dirp->inode->dir points to [...]
3. So when _getattr returns in _readdir_cache_cb (thread A), the memory that dir points to has been freed.
I guess you rightly found the issue. Having a coredump to analyze speeds things up greatly. Would you be willing to send a PR?
Updated by tod chen almost 2 years ago
Venky Shankar wrote in #note-21:
tod chen wrote in #note-20:
Maybe I found how to reproduce this case and how to fix it.
the coredump stack is:
[...]
I think the process that caused the problem is as follows:
1. Thread A attempts to traverse the directory via readdir, and the subitems in this directory are already complete (I_COMPLETE|I_DIR_ORDERED). It therefore reaches _readdir_cache_cb and calls _getattr to send a request to the MDS, then releases client_lock while waiting for the response. (Under heavy load the MDS may be slow to respond.)
2. The timer thread (tick()) tries to trim_cache every second, so it may progress through
trim_cache -> trim_dentry -> unlink -> close_dir
This releases all dentries in the directory and frees the memory that dirp->inode->dir points to [...]
3. So when _getattr returns in _readdir_cache_cb (thread A), the memory that dir points to has been freed.
I guess you rightly found the issue. Having a coredump to analyze speeds things up greatly. Would you be willing to send a PR?
OK, I will send the PR soon.
Updated by tod chen over 1 year ago
Updated by Venky Shankar over 1 year ago
- Status changed from Triaged to Fix Under Review
- Target version set to v20.0.0
- Backport changed from reef,quincy,pacific to quincy,reef,squid
- Pull request ID set to 58216
tod chen wrote in #note-23:
@Venky Shankar,
hello
this is the pr: https://github.com/ceph/ceph/pull/58216
:thumbs-up:
Updated by Venky Shankar over 1 year ago
- Status changed from Fix Under Review to Pending Backport
Thanks for the fix @hit1944, much appreciated. I will take care of backporting this to releases.
Updated by Upkeep Bot over 1 year ago
- Copied to Backport #67148: reef: crash: Client::_readdir_cache_cb(dir_result_t*, int (*)(void*, dirent*, ceph_statx*, long, Inode*), void*, int, bool) added
Updated by Upkeep Bot over 1 year ago
- Copied to Backport #67149: squid: crash: Client::_readdir_cache_cb(dir_result_t*, int (*)(void*, dirent*, ceph_statx*, long, Inode*), void*, int, bool) added
Updated by Upkeep Bot over 1 year ago
- Copied to Backport #67150: quincy: crash: Client::_readdir_cache_cb(dir_result_t*, int (*)(void*, dirent*, ceph_statx*, long, Inode*), void*, int, bool) added
Updated by Upkeep Bot over 1 year ago
- Tags (freeform) set to backport_processed
Updated by Upkeep Bot 8 months ago
- Status changed from Pending Backport to Resolved
- Upkeep Timestamp set to 2025-07-09T16:45:06+00:00
Updated by Upkeep Bot 8 months ago
- Merge Commit set to 0e0128522115c212ef3f47f595ae36c85b869ef9
- Fixed In set to v19.3.0-3444-g0e01285221
- Upkeep Timestamp changed from 2025-07-09T16:45:06+00:00 to 2025-08-02T05:03:48+00:00
Updated by Upkeep Bot 5 months ago
- Released In set to v20.2.0~2490
- Upkeep Timestamp changed from 2025-08-02T05:03:48+00:00 to 2025-11-01T01:36:22+00:00