Bug #56288

closed

crash: Client::_readdir_cache_cb(dir_result_t*, int (*)(void*, dirent*, ceph_statx*, long, Inode*), void*, int, bool)

Added by Telemetry Bot over 3 years ago. Updated 5 months ago.

Status:
Resolved
Priority:
Normal
Assignee:
Venky Shankar
Category:
Correctness/Safety
Target version:
v20.0.0
% Done:

0%

Source:
Telemetry
Backport:
quincy,reef,squid
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
v17.2.0
ceph-qa-suite:
Component(FS):
Client
Labels (FS):
crash
Pull request ID:
58216
Tags (freeform):
backport_processed
Fixed In:
v19.3.0-3444-g0e01285221
Released In:
v20.2.0~2490
Upkeep Timestamp:
2025-11-01T01:36:22+00:00

Description

http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?orgId=1&var-sig_v2=aeaa2b6c5a82bba2b2f33885dde87b61d62ebb22b579a85f5fbf5c3eedab2c67

Sanitized backtrace:

    Client::_readdir_cache_cb(dir_result_t*, int (*)(void*, dirent*, ceph_statx*, long, Inode*), void*, int, bool)
    Client::readdir_r_cb(dir_result_t*, int (*)(void*, dirent*, ceph_statx*, long, Inode*), void*, unsigned int, unsigned int, bool)

Crash dump sample:
{
    "backtrace": [
        "__kernel_rt_sigreturn()",
        "(Client::_readdir_cache_cb(dir_result_t*, int (*)(void*, dirent*, ceph_statx*, long, Inode*), void*, int, bool)+0x37c) [0xaaaaca9854b4]",
        "(Client::readdir_r_cb(dir_result_t*, int (*)(void*, dirent*, ceph_statx*, long, Inode*), void*, unsigned int, unsigned int, bool)+0x9f0) [0xaaaaca9865d0]",
        "ceph-fuse(+0x9f37c) [0xaaaaca93737c]",
        "/lib/aarch64-linux-gnu/libfuse.so.2(+0x14068) [0xffffba485068]",
        "/lib/aarch64-linux-gnu/libfuse.so.2(+0x15064) [0xffffba486064]",
        "/lib/aarch64-linux-gnu/libfuse.so.2(+0x12158) [0xffffba483158]",
        "/lib/aarch64-linux-gnu/libpthread.so.0(+0x751c) [0xffffb997b51c]",
        "/lib/aarch64-linux-gnu/libc.so.6(+0xd122c) [0xffffb96b522c]" 
    ],
    "ceph_version": "17.2.0",
    "crash_id": "2022-05-06T09:38:25.862101Z_96a4e7ca-adf6-4a56-acd2-0731016b50ca",
    "entity_name": "client.779720a56c617f4713d46a0389a5f0b5c78d2903",
    "os_id": "ubuntu",
    "os_name": "Ubuntu",
    "os_version": "20.04.4 LTS (Focal Fossa)",
    "os_version_id": "20.04",
    "process_name": "ceph-fuse",
    "stack_sig": "915ad90d349e6a333f04e0504a80cafc8ba5777680640de0597c7b0275ea1dc4",
    "timestamp": "2022-05-06T09:38:25.862101Z",
    "utsname_machine": "aarch64",
    "utsname_release": "5.4.0-1059-raspi",
    "utsname_sysname": "Linux",
    "utsname_version": "#67-Ubuntu SMP PREEMPT Mon Apr 11 14:16:01 UTC 2022" 
}


Files

log.gz (219 KB) - ceph client logs from upstream user - Milind Changire, 07/17/2023 03:41 PM

Related issues 3 (0 open, 3 closed)

Copied to CephFS - Backport #67148: reef: crash: Client::_readdir_cache_cb(dir_result_t*, int (*)(void*, dirent*, ceph_statx*, long, Inode*), void*, int, bool) (Resolved, Venky Shankar)
Copied to CephFS - Backport #67149: squid: crash: Client::_readdir_cache_cb(dir_result_t*, int (*)(void*, dirent*, ceph_statx*, long, Inode*), void*, int, bool) (Resolved, Venky Shankar)
Copied to CephFS - Backport #67150: quincy: crash: Client::_readdir_cache_cb(dir_result_t*, int (*)(void*, dirent*, ceph_statx*, long, Inode*), void*, int, bool) (Resolved, Venky Shankar)
Actions #1

Updated by Telemetry Bot over 3 years ago

  • Crash signature (v1) updated (diff)
  • Crash signature (v2) updated (diff)
  • Affected Versions v17.2.0 added
Actions #2

Updated by Venky Shankar over 3 years ago

  • Category set to Correctness/Safety
  • Status changed from New to Triaged
  • Assignee set to Venky Shankar
  • Target version set to v18.0.0
  • Backport set to pacific,quincy
  • Component(FS) Client added
  • Labels (FS) crash added
Actions #4

Updated by Milind Changire over 2 years ago

Similar crash report in ceph-users mailing list (client logs attached as log.gz).

Actions #5

Updated by Venky Shankar over 2 years ago

Milind Changire wrote:

Similar crash report in ceph-users mailing list

Thanks for the logs. I'll have a look.

Actions #6

Updated by Venky Shankar over 2 years ago

  • Backport changed from pacific,quincy to reef,quincy,pacific
Actions #7

Updated by Milind Changire over 2 years ago

Venky,
The upstream user has also sent across debug (level 20) logs for ceph-fuse as well as mds.
Unfortunately, the size is more than 1000KB and so they can't be attached to the tracker.
Let me know when you need them.

Actions #8

Updated by Venky Shankar over 2 years ago

Milind Changire wrote:

Venky,
The upstream user has also sent across debug (level 20) logs for ceph-fuse as well as mds.
Unfortunately, the size is more than 1000KB and so they can't be attached to the tracker.
Let me know when you need them.

That would help. Please upload them using ceph-post-file.

Actions #9

Updated by Patrick Donnelly over 2 years ago

  • Target version deleted (v18.0.0)
Actions #10

Updated by lei liu about 2 years ago

We recently encountered a similar issue. May I ask if there is a solution?

Actions #11

Updated by Venky Shankar about 2 years ago

lei liu wrote:

We recently encountered a similar issue. May I ask if there is a solution?

Which version was this hit in? 17.2.*?

Actions #12

Updated by Venky Shankar about 2 years ago

Venky Shankar wrote:

lei liu wrote:

We recently encountered a similar issue. May I ask if there is a solution?

Which version was this hit in? 17.2.*?

Is there a coredump that can be shared?

Actions #13

Updated by Venky Shankar about 2 years ago

lei liu wrote:

We recently encountered a similar issue. May I ask if there is a solution?

For now, restart ceph-fuse and that should get you past this.

Actions #14

Updated by Venky Shankar about 2 years ago

I think this happens when there are concurrent lookups and deletes under a directory. _readdir_cache_cb() has code like

    int idx = pd - dir->readdir_cache.begin();
    if (dn->inode->is_dir() && cct->_conf->client_dirsize_rbytes) {
      mask |= CEPH_STAT_RSTAT;
    }
    int r = _getattr(dn->inode, mask, dirp->perms);
    if (r < 0)
      return r;

    // the content of readdir_cache may change after _getattr(), so pd may be invalid iterator
    pd = dir->readdir_cache.begin() + idx;
    if (pd >= dir->readdir_cache.end() || *pd != dn)
      return -CEPHFS_EAGAIN;

... to handle the cache getting modified when the client_lock gets dropped in _getattr() (via add_update_inode()). I'll try to reproduce this and see what's going on.

If you have a core to share, it'll speed up debugging. Thanks!
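
Below is a minimal standalone sketch (not Ceph code) of the save-index/revalidate pattern in the snippet above. The vector, its contents, and the erase() call are hypothetical stand-ins for readdir_cache being trimmed by another thread while client_lock is dropped:

    #include <cstdio>
    #include <string>
    #include <vector>

    int main() {
      std::vector<std::string> cache = {"a", "b", "c"};  // stand-in for readdir_cache
      auto pd = cache.begin() + 1;        // iterator into the cache
      std::string dn = *pd;               // the entry we expect at that slot

      int idx = pd - cache.begin();       // save a plain index first

      // ... client_lock dropped here; another thread trims the cache,
      // invalidating pd ...
      cache.erase(cache.begin());

      pd = cache.begin() + idx;           // rebuild the iterator from the index
      if (pd >= cache.end() || *pd != dn) {  // revalidate before dereferencing
        std::puts("cache changed under us: retry (EAGAIN)");
        return 0;
      }
      std::printf("entry still valid: %s\n", pd->c_str());
    }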

Actions #15

Updated by lei liu about 2 years ago

Venky Shankar wrote:

lei liu wrote:

We recently encountered a similar issue. May I ask if there is a solution?

Which version was this hit in? 17.2.*?

version: ceph version 15.2.17 (8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus (stable)

For some reason, we did not save the core file. Currently, we only have some dmesg information: ganesha.nfsd[24416]: segfault at 90 ip 00007f79f8731227 sp 00007f78037f24d0 error 4 in libcephfs.so.2.0.0[7f79f86d3000+e2000]. It seems to point to https://github.com/ceph/ceph/blob/v15.2.17/src/client/Client.cc#L8171. In our environment, this issue has occurred three times in the past three weeks, but I have not yet found a way to reproduce it.

Actions #16

Updated by Venky Shankar about 2 years ago

lei liu wrote:

Venky Shankar wrote:

lei liu wrote:

We recently encountered a similar issue. May I ask if there is a solution?

Which version was this hit in? 17.2.*?

version: ceph version 15.2.17 (8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus (stable)

For some reason, we did not save the core file. Currently, we only have some dmesg information: ganesha.nfsd[24416]: segfault at 90 ip 00007f79f8731227 sp 00007f78037f24d0 error 4 in libcephfs.so.2.0.0[7f79f86d3000+e2000]. It seems to point to https://github.com/ceph/ceph/blob/v15.2.17/src/client/Client.cc#L8171. In our environment, this issue has occurred three times in the past three weeks, but I have not yet found a way to reproduce it.

Ok, yeah. That's the part I suspected (mentioned in note-14).

Actions #17

Updated by Venky Shankar almost 2 years ago

I haven't been able to reproduce this with the main branch. If possible, please collect a ceph-mds coredump and attach it to this tracker.

Actions #18

Updated by Venky Shankar almost 2 years ago

So, for some reason, this part of the code

    if (pd >= dir->readdir_cache.end() || *pd != dn)
      return -CEPHFS_EAGAIN;

especially the dereference of pd, is causing the crash. I don't have a core to validate that, but none of the other checks look likely to be causing this.

Actions #19

Updated by lei liu almost 2 years ago

Venky Shankar wrote in #note-18:

So, for some reason, this part of the code

[...]

especially the dereference of pd, is causing the crash. I don't have a core to validate that, but none of the other checks look likely to be causing this.

Thank you, I will send you the core file when it happens again.

Actions #20

Updated by tod chen almost 2 years ago · Edited

Maybe I found how to reproduce this case and how to fix it.

The coredump stack is:

Core was generated by `/usr/bin/ganesha.nfsd -L /jfsexports/logs/ganesha/ganesha.log -f /etc/ganesha/g'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f246b27a227 in Client::_readdir_cache_cb (this=<optimized out>, dirp=0x7f23f5849f80, cb=0x7f246b232ac0 <_readdir_single_dirent_cb(void*, dirent*, ceph_statx*, off_t, Inode*)>, p=0x7f24661f1730,
    caps=341, getref=true) at /usr/include/c++/8/bits/stl_iterator.h:783
783        /usr/include/c++/8/bits/stl_iterator.h: No such file or directory.
[Current thread is 1 (Thread 0x7f24661f5700 (LWP 1338076))]
(gdb) bt
#0  0x00007f246b27a227 in Client::_readdir_cache_cb (this=<optimized out>, dirp=0x7f23f5849f80, cb=0x7f246b232ac0 <_readdir_single_dirent_cb(void*, dirent*, ceph_statx*, off_t, Inode*)>, p=0x7f24661f1730,
    caps=341, getref=true) at /usr/include/c++/8/bits/stl_iterator.h:783
#1  0x00007f246b27bd84 in Client::readdir_r_cb (this=0x7f247694f000, d=<optimized out>, cb=0x7f246b232ac0 <_readdir_single_dirent_cb(void*, dirent*, ceph_statx*, off_t, Inode*)>, p=0x7f24661f1730,
    want=<optimized out>, flags=<optimized out>, getref=true) at ./src/client/Client.cc:8315
#2  0x00007f246b27c39d in Client::readdirplus_r (this=<optimized out>, d=<optimized out>, de=de@entry=0x7f24661f1960, stx=stx@entry=0x7f24661f17d0, want=want@entry=5871, flags=flags@entry=0,
    out=0x7f24661f17b8) at ./src/client/Client.cc:8488
#3  0x00007f246b229b58 in ceph_readdirplus_r (cmount=<optimized out>, dirp=<optimized out>, de=de@entry=0x7f24661f1960, stx=stx@entry=0x7f24661f17d0, want=want@entry=5871, flags=flags@entry=0,
    out=0x7f24661f17b8) at ./src/libcephfs.cc:643
#4  0x00007f246b356b5f in fsal_ceph_readdirplus (dir=<optimized out>, cred=<optimized out>, out=0x7f24661f17b8, flags=0, want=5871, stx=0x7f24661f17d0, de=0x7f24661f1960, dirp=<optimized out>,
    cmount=<optimized out>) at ./src/FSAL/FSAL_CEPH/statx_compat.h:315
#5  ceph_fsal_readdir (dir_pub=<optimized out>, whence=<optimized out>, dir_state=0x7f24661f1b00, cb=0x7f2477e7e090 <mdc_readdir_chunked_cb>, attrmask=9223372036906278894, eof=0x7f24661f1c1f)
    at ./src/FSAL/FSAL_CEPH/handle.c:258
#6  0x00007f2477e786b9 in mdcache_populate_dir_chunk (directory=0x7f23c694b000, whence=<optimized out>, dirent=0x7f24661f1c20, prev_chunk=<optimized out>, eod_met=0x7f24661f1c1f)
    at ./src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c:2685
#7  0x00007f2477e79c93 in mdcache_readdir_chunked (directory=0x7f23c694b000, whence=0, dir_state=0x7f24661f1d90, cb=0x7f2477d78d10 <populate_dirent>, attrmask=122830, eod_met=0x7f24661f1e7b)
    at ./src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c:3049
#8  0x00007f2477d7a9bf in fsal_readdir (directory=directory@entry=0x7f23c694b038, cookie=<optimized out>, cookie@entry=0, nbfound=nbfound@entry=0x7f24661f1e7c, eod_met=eod_met@entry=0x7f24661f1e7b,
    attrmask=122830, cb=cb@entry=0x7f2477e38190 <nfs4_readdir_callback>, opaque=0x7f24661f1f80) at ./src/FSAL/fsal_helper.c:1394
#9  0x00007f2477e39543 in nfs4_op_readdir (op=0x7f1c4d43e120, data=0x7f23e72e5f00, resp=0x7f2438b86340) at ./src/Protocols/NFS/nfs4_op_readdir.c:666
#10 0x00007f2477e20b26 in process_one_op (data=data@entry=0x7f23e72e5f00, status=status@entry=0x7f24661f214c) at ./src/Protocols/NFS/nfs4_Compound.c:918
#11 0x00007f2477e21f48 in nfs4_Compound (arg=<optimized out>, req=0x7f2438a0f000, res=<optimized out>) at ./src/Protocols/NFS/nfs4_Compound.c:1383
#12 0x00007f2477d9b005 in nfs_rpc_process_request (reqdata=0x7f2438a0f000, retry=<optimized out>) at ./src/MainNFSD/nfs_worker_thread.c:1507
#13 0x00007f2477d1d5ab in svc_request (xprt=0x7f23bbdaf400, xdrs=0x7f233a0d1a80) at ./src/svc_rqst.c:1209
#14 0x00007f2477d1a8b1 in svc_rqst_xprt_task_recv (wpe=<optimized out>) at ./src/svc_rqst.c:1183
#15 0x00007f2477d1b298 in svc_rqst_epoll_loop (wpe=0x7f2476956600) at ./src/svc_rqst.c:1571
#16 0x00007f2477d26748 in work_pool_thread (arg=0x7f24396000a0) at ./src/work_pool.c:183
#17 0x00007f24775d2fa3 in start_thread (arg=<optimized out>) at pthread_create.c:486
#18 0x00007f247732c06f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

(gdb) f 0
#0  0x00007f246b27a227 in Client::_readdir_cache_cb (this=<optimized out>, dirp=0x7f23f5849f80, cb=0x7f246b232ac0 <_readdir_single_dirent_cb(void*, dirent*, ceph_statx*, off_t, Inode*)>, p=0x7f24661f1730,
    caps=341, getref=true) at /usr/include/c++/8/bits/stl_iterator.h:783
783    in /usr/include/c++/8/bits/stl_iterator.h
(gdb) p dirp->inode
$1 = {px = 0x7f1cda4fa600}
(gdb) p ((Inode *)0x7f1cda4fa600)->dir
$2 = (Dir *) 0x0

I think the process that caused the problem is as follows:

1. Thread A attempts to traverse the directory via readdir, and the subitems in this directory are already complete (I_COMPLETE|I_DIR_ORDERED). Therefore, it reaches _readdir_cache_cb and calls _getattr to send a request to the MDS. It then waits for the MDS's response, releasing client_lock while it waits. (Maybe the response from the MDS is slow under heavy load.)

2. The timer thread (tick()) tries to trim_cache every second, so it may progress to

trim_cache -> trim_dentry -> unlink -> close_dir

This process releases all dentries in the directory and frees the memory that dirp->inode->dir points to:

void Client::close_dir(Dir *dir)
{
  Inode *in = dir->parent_inode;
  ldout(cct, 15) << __func__ << " dir " << dir << " on " << in << dendl;
  ceph_assert(dir->is_empty());
  ceph_assert(in->dir == dir);
  ceph_assert(in->dentries.size() < 2);     // dirs can't be hard-linked
  if (!in->dentries.empty())
    in->get_first_parent()->put();   // unpin dentry

  delete in->dir;
  in->dir = 0;
  put_inode(in);               // unpin inode
}

3. So when _getattr returns in _readdir_cache_cb (Thread A), the memory that dir pointed to has already been freed.

How to fix it:

    int r = _getattr(dn->inode, mask, dirp->perms);
    if (r < 0)
      return r;

+    if (dirp->inode->dir == NULL) {
+      ldout(cct, 0) << " dir is closed, so we should return" << dendl;
+      return -EAGAIN;
+    }

    // the content of readdir_cache may change after _getattr(), so pd may be invalid iterator    
    pd = dir->readdir_cache.begin() + idx;
    if (pd >= dir->readdir_cache.end() || *pd != dn)
      return -EAGAIN;
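
For illustration, here is a minimal single-threaded sketch (not Ceph code) of the lifetime bug this guard fixes. Dir, Inode, and close_dir are toy stand-ins mirroring the snippets above, and the direct call to close_dir simulates what the timer thread does while client_lock is dropped inside _getattr:

    #include <cstdio>
    #include <string>
    #include <vector>

    struct Dir { std::vector<std::string> readdir_cache; };
    struct Inode { Dir *dir = nullptr; };

    // Toy stand-in for trim_cache -> trim_dentry -> unlink -> close_dir.
    static void close_dir(Inode *in) {
      delete in->dir;
      in->dir = nullptr;
    }

    static int readdir_cache_cb(Inode *inode) {
      Dir *dir = inode->dir;  // raw pointer cached at entry, like the real code

      // _getattr() drops client_lock here; the timer thread may run and trim
      // the whole directory. Calling close_dir directly simulates that.
      close_dir(inode);

      // The proposed fix: re-check inode->dir before touching dir again,
      // since dir may now point at freed memory.
      if (inode->dir == nullptr) {
        std::puts("dir is closed, so we should return");
        return -1;  // -EAGAIN in the real patch
      }
      return (int)dir->readdir_cache.size();
    }

    int main() {
      Inode in;
      in.dir = new Dir{{"a", "b", "c"}};
      readdir_cache_cb(&in);  // prints the EAGAIN message
    }
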
Actions #21

Updated by Venky Shankar almost 2 years ago

tod chen wrote in #note-20:

Maybe I found how to reproduce this case and how to fix it.

The coredump stack is:

[...]

I think the process that caused the problem is as follows:

1. Thread A attempts to traverse the directory via readdir, and the subitems in this directory are already complete (I_COMPLETE|I_DIR_ORDERED). Therefore, it reaches _readdir_cache_cb and calls _getattr to send a request to the MDS. It then waits for the MDS's response, releasing client_lock while it waits. (Maybe the response from the MDS is slow under heavy load.)

2. The timer thread (tick()) tries to trim_cache every second, so it may progress to

trim_cache -> trim_dentry -> unlink -> close_dir

This process releases all dentries in the directory and frees the memory that dirp->inode->dir points to:

[...]

3. So when _getattr returns in _readdir_cache_cb (Thread A), the memory that dir pointed to has already been freed.

I guess you rightly found the issue. Having a coredump to analyze speeds things up greatly. Would you be willing to send a PR?

Actions #22

Updated by tod chen almost 2 years ago

Venky Shankar wrote in #note-21:

tod chen wrote in #note-20:

Maybe I found how to reproduce this case and how to fix it.

The coredump stack is:

[...]

I think the process that caused the problem is as follows:

1. Thread A attempts to traverse the directory via readdir, and the subitems in this directory are already complete (I_COMPLETE|I_DIR_ORDERED). Therefore, it reaches _readdir_cache_cb and calls _getattr to send a request to the MDS. It then waits for the MDS's response, releasing client_lock while it waits. (Maybe the response from the MDS is slow under heavy load.)

2. The timer thread (tick()) tries to trim_cache every second, so it may progress to

trim_cache -> trim_dentry -> unlink -> close_dir

This process releases all dentries in the directory and frees the memory that dirp->inode->dir points to:

[...]

3. So when _getattr returns in _readdir_cache_cb (Thread A), the memory that dir pointed to has already been freed.

I guess you rightly found the issue. Having a coredump to analyze speeds things up greatly. Would you be willing to send a PR?

Ok, I will send the PR soon.

Actions #23

Updated by tod chen over 1 year ago

Hello @Venky Shankar,

This is the PR: https://github.com/ceph/ceph/pull/58216

Actions #24

Updated by Venky Shankar over 1 year ago

  • Status changed from Triaged to Fix Under Review
  • Target version set to v20.0.0
  • Backport changed from reef,quincy,pacific to quincy,reef,squid
  • Pull request ID set to 58216

tod chen wrote in #note-23:

Hello @Venky Shankar,

This is the PR: https://github.com/ceph/ceph/pull/58216

:thumbs-up:

Actions #25

Updated by Venky Shankar over 1 year ago

  • Status changed from Fix Under Review to Pending Backport

Thanks for the fix @hit1944, much appreciated. I will take care of backporting this to releases.

Actions #26

Updated by Upkeep Bot over 1 year ago

  • Copied to Backport #67148: reef: crash: Client::_readdir_cache_cb(dir_result_t*, int (*)(void*, dirent*, ceph_statx*, long, Inode*), void*, int, bool) added
Actions #27

Updated by Upkeep Bot over 1 year ago

  • Copied to Backport #67149: squid: crash: Client::_readdir_cache_cb(dir_result_t*, int (*)(void*, dirent*, ceph_statx*, long, Inode*), void*, int, bool) added
Actions #28

Updated by Upkeep Bot over 1 year ago

  • Copied to Backport #67150: quincy: crash: Client::_readdir_cache_cb(dir_result_t*, int (*)(void*, dirent*, ceph_statx*, long, Inode*), void*, int, bool) added
Actions #29

Updated by Upkeep Bot over 1 year ago

  • Tags (freeform) set to backport_processed
Actions #30

Updated by Upkeep Bot 8 months ago

  • Status changed from Pending Backport to Resolved
  • Upkeep Timestamp set to 2025-07-09T16:45:06+00:00
Actions #31

Updated by Upkeep Bot 8 months ago

  • Merge Commit set to 0e0128522115c212ef3f47f595ae36c85b869ef9
  • Fixed In set to v19.3.0-3444-g0e01285221
  • Upkeep Timestamp changed from 2025-07-09T16:45:06+00:00 to 2025-08-02T05:03:48+00:00
Actions #32

Updated by Upkeep Bot 5 months ago

  • Released In set to v20.2.0~2490
  • Upkeep Timestamp changed from 2025-08-02T05:03:48+00:00 to 2025-11-01T01:36:22+00:00