Bug #56288
Status: Closed
crash: Client::_readdir_cache_cb(dir_result_t*, int (*)(void*, dirent*, ceph_statx*, long, Inode*), void*, int, bool)
Description
Sanitized backtrace:
Client::_readdir_cache_cb(dir_result_t*, int (*)(void*, dirent*, ceph_statx*, long, Inode*), void*, int, bool)
Client::readdir_r_cb(dir_result_t*, int (*)(void*, dirent*, ceph_statx*, long, Inode*), void*, unsigned int, unsigned int, bool)
Crash dump sample:
{
"backtrace": [
"__kernel_rt_sigreturn()",
"(Client::_readdir_cache_cb(dir_result_t*, int (*)(void*, dirent*, ceph_statx*, long, Inode*), void*, int, bool)+0x37c) [0xaaaaca9854b4]",
"(Client::readdir_r_cb(dir_result_t*, int (*)(void*, dirent*, ceph_statx*, long, Inode*), void*, unsigned int, unsigned int, bool)+0x9f0) [0xaaaaca9865d0]",
"ceph-fuse(+0x9f37c) [0xaaaaca93737c]",
"/lib/aarch64-linux-gnu/libfuse.so.2(+0x14068) [0xffffba485068]",
"/lib/aarch64-linux-gnu/libfuse.so.2(+0x15064) [0xffffba486064]",
"/lib/aarch64-linux-gnu/libfuse.so.2(+0x12158) [0xffffba483158]",
"/lib/aarch64-linux-gnu/libpthread.so.0(+0x751c) [0xffffb997b51c]",
"/lib/aarch64-linux-gnu/libc.so.6(+0xd122c) [0xffffb96b522c]"
],
"ceph_version": "17.2.0",
"crash_id": "2022-05-06T09:38:25.862101Z_96a4e7ca-adf6-4a56-acd2-0731016b50ca",
"entity_name": "client.779720a56c617f4713d46a0389a5f0b5c78d2903",
"os_id": "ubuntu",
"os_name": "Ubuntu",
"os_version": "20.04.4 LTS (Focal Fossa)",
"os_version_id": "20.04",
"process_name": "ceph-fuse",
"stack_sig": "915ad90d349e6a333f04e0504a80cafc8ba5777680640de0597c7b0275ea1dc4",
"timestamp": "2022-05-06T09:38:25.862101Z",
"utsname_machine": "aarch64",
"utsname_release": "5.4.0-1059-raspi",
"utsname_sysname": "Linux",
"utsname_version": "#67-Ubuntu SMP PREEMPT Mon Apr 11 14:16:01 UTC 2022"
}
Updated by Telemetry Bot over 3 years ago
Updated by Venky Shankar over 3 years ago
- Category set to Correctness/Safety
- Status changed from New to Triaged
- Assignee set to Venky Shankar
- Target version set to v18.0.0
- Backport set to pacific,quincy
- Component(FS) Client added
- Labels (FS) crash added
Updated by Milind Changire over 2 years ago
Updated by Venky Shankar over 2 years ago
Milind Changire wrote:
Thanks for the logs. I'll have a look.
Updated by Venky Shankar over 2 years ago
- Backport changed from pacific,quincy to reef,quincy,pacific
Updated by Milind Changire over 2 years ago
Venky,
The upstream user has also sent across debug (level 20) logs for ceph-fuse as well as mds.
Unfortunately, the size is more than 1000 KB, so they can't be attached to the tracker.
Let me know when you need them.
Updated by Venky Shankar over 2 years ago
Milind Changire wrote:
Venky,
The upstream user has also sent across debug (level 20) logs for ceph-fuse as well as mds.
Unfortunately, the size is more than 1000 KB, so they can't be attached to the tracker.
Let me know when you need them.
Would help. Please upload using ceph-post-file.
Updated by lei liu about 2 years ago
We recently encountered a similar issue, may I ask if there is a solution?
Updated by Venky Shankar about 2 years ago
lei liu wrote:
We recently encountered a similar issue, may I ask if there is a solution?
Which version was this hit in? 17.2.*?
Updated by Venky Shankar about 2 years ago
Venky Shankar wrote:
lei liu wrote:
We recently encountered a similar issue, may I ask if there is a solution?
Which version was this hit in? 17.2.*?
Is there a coredump that can be shared?
Updated by Venky Shankar about 2 years ago
lei liu wrote:
We recently encountered a similar issue, may I ask if there is a solution?
For now, restart ceph-fuse and that should get you past this.
Updated by Venky Shankar about 2 years ago
I think this happens when there are concurrent lookups and deletes under a directory. _readdir_cache_cb() has code like
int idx = pd - dir->readdir_cache.begin();
if (dn->inode->is_dir() && cct->_conf->client_dirsize_rbytes) {
mask |= CEPH_STAT_RSTAT;
}
int r = _getattr(dn->inode, mask, dirp->perms);
if (r < 0)
return r;
// the content of readdir_cache may change after _getattr(), so pd may be invalid iterator
pd = dir->readdir_cache.begin() + idx;
if (pd >= dir->readdir_cache.end() || *pd != dn)
return -CEPHFS_EAGAIN;
... to handle the cache getting modified when the client_lock gets dropped in _getattr() (via add_update_inode()). I'll reproduce this and see what's going on.
If you have a core to share, it'll speed up debugging. Thanks!
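The save-index/revalidate pattern quoted in the comment above can be illustrated with a standalone sketch (hypothetical names, single-threaded simulation of the concurrent trim): the iterator's position is saved as a plain index before the lock-dropping call, then the iterator is rebuilt and re-checked afterwards.

```cpp
#include <cassert>
#include <vector>

// Hypothetical sketch of the revalidate pattern in _readdir_cache_cb:
// before an operation that may drop client_lock, the iterator position is
// saved as a plain index; afterwards the iterator is rebuilt and verified
// to still refer to the element we held, otherwise the caller must retry.
int revalidate(std::vector<int>& cache, int idx, int expected, bool trimmed) {
    if (trimmed)                      // simulate another thread trimming the
        cache.erase(cache.begin());   // cache while the lock was dropped
    auto pd = cache.begin() + idx;    // rebuild the iterator from the index
    if (pd >= cache.end() || *pd != expected)
        return -1;                    // stand-in for -CEPHFS_EAGAIN: retry
    return *pd;
}
```

If nothing changed while the lock was dropped, the rebuilt iterator still points at the expected element; if the cache was trimmed, the check fires and the caller falls back to a fresh readdir instead of touching a stale entry.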
Updated by lei liu about 2 years ago
Venky Shankar wrote:
lei liu wrote:
We recently encountered a similar issue, may I ask if there is a solution?
Which version was this hit in? 17.2.*?
version: ceph version 15.2.17 (8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus (stable);
For some reason, we did not save the core file. Currently, we only have some dmesg information: ganesha.nfsd[24416]: segfault at 90 ip 00007f79f8731227 sp 00007f78037f24d0 error 4 in libcephfs.so.2.0.0[7f79f86d3000+e2000]. It seems to point to https://github.com/ceph/ceph/blob/v15.2.17/src/client/Client.cc#L8171. In our environment, this issue has occurred three times in the past three weeks, but I have not yet found a way to reproduce it.
Updated by Venky Shankar about 2 years ago
lei liu wrote:
Venky Shankar wrote:
lei liu wrote:
We recently encountered a similar issue, may I ask if there is a solution?
Which version was this hit in? 17.2.*?
version: ceph version 15.2.17 (8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus (stable);
For some reason, we did not save the core file. Currently, we only have some dmesg information: ganesha.nfsd[24416]: segfault at 90 ip 00007f79f8731227 sp 00007f78037f24d0 error 4 in libcephfs.so.2.0.0[7f79f86d3000+e2000]. It seems to point to https://github.com/ceph/ceph/blob/v15.2.17/src/client/Client.cc#L8171. In our environment, this issue has occurred three times in the past three weeks, but I have not yet found a way to reproduce it.
OK, yeah. I suspected that part (mentioned in note-14).
Updated by Venky Shankar almost 2 years ago
I haven't been able to reproduce this with the main branch. If possible, please collect a ceph-mds coredump and attach it to this tracker.
Updated by Venky Shankar almost 2 years ago
So, for some reason this part of the code
if (pd >= dir->readdir_cache.end() || *pd != dn)
return -CEPHFS_EAGAIN;
especially dereferencing `pd`, is causing the crash. I don't have a core to validate that, but none of the other checks look likely to be causing this.
Updated by lei liu almost 2 years ago
Venky Shankar wrote in #note-18:
So, for some reason this part of the code
[...]
especially dereferencing `pd`, is causing the crash. I don't have a core to validate that, but none of the other checks look likely to be causing this.
thank you, I will send you the core file when it happens again.
Updated by tod chen almost 2 years ago · Edited
Maybe I found how to reproduce this case and how to fix it.
the coredump stack is:
Core was generated by `/usr/bin/ganesha.nfsd -L /jfsexports/logs/ganesha/ganesha.log -f /etc/ganesha/g'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00007f246b27a227 in Client::_readdir_cache_cb (this=<optimized out>, dirp=0x7f23f5849f80, cb=0x7f246b232ac0 <_readdir_single_dirent_cb(void*, dirent*, ceph_statx*, off_t, Inode*)>, p=0x7f24661f1730,
caps=341, getref=true) at /usr/include/c++/8/bits/stl_iterator.h:783
783 /usr/include/c++/8/bits/stl_iterator.h: No such file or directory.
[Current thread is 1 (Thread 0x7f24661f5700 (LWP 1338076))]
(gdb) bt
#0 0x00007f246b27a227 in Client::_readdir_cache_cb (this=<optimized out>, dirp=0x7f23f5849f80, cb=0x7f246b232ac0 <_readdir_single_dirent_cb(void*, dirent*, ceph_statx*, off_t, Inode*)>, p=0x7f24661f1730,
caps=341, getref=true) at /usr/include/c++/8/bits/stl_iterator.h:783
#1 0x00007f246b27bd84 in Client::readdir_r_cb (this=0x7f247694f000, d=<optimized out>, cb=0x7f246b232ac0 <_readdir_single_dirent_cb(void*, dirent*, ceph_statx*, off_t, Inode*)>, p=0x7f24661f1730,
want=<optimized out>, flags=<optimized out>, getref=true) at ./src/client/Client.cc:8315
#2 0x00007f246b27c39d in Client::readdirplus_r (this=<optimized out>, d=<optimized out>, de=de@entry=0x7f24661f1960, stx=stx@entry=0x7f24661f17d0, want=want@entry=5871, flags=flags@entry=0,
out=0x7f24661f17b8) at ./src/client/Client.cc:8488
#3 0x00007f246b229b58 in ceph_readdirplus_r (cmount=<optimized out>, dirp=<optimized out>, de=de@entry=0x7f24661f1960, stx=stx@entry=0x7f24661f17d0, want=want@entry=5871, flags=flags@entry=0,
out=0x7f24661f17b8) at ./src/libcephfs.cc:643
#4 0x00007f246b356b5f in fsal_ceph_readdirplus (dir=<optimized out>, cred=<optimized out>, out=0x7f24661f17b8, flags=0, want=5871, stx=0x7f24661f17d0, de=0x7f24661f1960, dirp=<optimized out>,
cmount=<optimized out>) at ./src/FSAL/FSAL_CEPH/statx_compat.h:315
#5 ceph_fsal_readdir (dir_pub=<optimized out>, whence=<optimized out>, dir_state=0x7f24661f1b00, cb=0x7f2477e7e090 <mdc_readdir_chunked_cb>, attrmask=9223372036906278894, eof=0x7f24661f1c1f)
at ./src/FSAL/FSAL_CEPH/handle.c:258
#6 0x00007f2477e786b9 in mdcache_populate_dir_chunk (directory=0x7f23c694b000, whence=<optimized out>, dirent=0x7f24661f1c20, prev_chunk=<optimized out>, eod_met=0x7f24661f1c1f)
at ./src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c:2685
#7 0x00007f2477e79c93 in mdcache_readdir_chunked (directory=0x7f23c694b000, whence=0, dir_state=0x7f24661f1d90, cb=0x7f2477d78d10 <populate_dirent>, attrmask=122830, eod_met=0x7f24661f1e7b)
at ./src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c:3049
#8 0x00007f2477d7a9bf in fsal_readdir (directory=directory@entry=0x7f23c694b038, cookie=<optimized out>, cookie@entry=0, nbfound=nbfound@entry=0x7f24661f1e7c, eod_met=eod_met@entry=0x7f24661f1e7b,
attrmask=122830, cb=cb@entry=0x7f2477e38190 <nfs4_readdir_callback>, opaque=0x7f24661f1f80) at ./src/FSAL/fsal_helper.c:1394
#9 0x00007f2477e39543 in nfs4_op_readdir (op=0x7f1c4d43e120, data=0x7f23e72e5f00, resp=0x7f2438b86340) at ./src/Protocols/NFS/nfs4_op_readdir.c:666
#10 0x00007f2477e20b26 in process_one_op (data=data@entry=0x7f23e72e5f00, status=status@entry=0x7f24661f214c) at ./src/Protocols/NFS/nfs4_Compound.c:918
#11 0x00007f2477e21f48 in nfs4_Compound (arg=<optimized out>, req=0x7f2438a0f000, res=<optimized out>) at ./src/Protocols/NFS/nfs4_Compound.c:1383
#12 0x00007f2477d9b005 in nfs_rpc_process_request (reqdata=0x7f2438a0f000, retry=<optimized out>) at ./src/MainNFSD/nfs_worker_thread.c:1507
#13 0x00007f2477d1d5ab in svc_request (xprt=0x7f23bbdaf400, xdrs=0x7f233a0d1a80) at ./src/svc_rqst.c:1209
#14 0x00007f2477d1a8b1 in svc_rqst_xprt_task_recv (wpe=<optimized out>) at ./src/svc_rqst.c:1183
#15 0x00007f2477d1b298 in svc_rqst_epoll_loop (wpe=0x7f2476956600) at ./src/svc_rqst.c:1571
#16 0x00007f2477d26748 in work_pool_thread (arg=0x7f24396000a0) at ./src/work_pool.c:183
#17 0x00007f24775d2fa3 in start_thread (arg=<optimized out>) at pthread_create.c:486
#18 0x00007f247732c06f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
(gdb) f 0
#0 0x00007f246b27a227 in Client::_readdir_cache_cb (this=<optimized out>, dirp=0x7f23f5849f80, cb=0x7f246b232ac0 <_readdir_single_dirent_cb(void*, dirent*, ceph_statx*, off_t, Inode*)>, p=0x7f24661f1730,
caps=341, getref=true) at /usr/include/c++/8/bits/stl_iterator.h:783
783 in /usr/include/c++/8/bits/stl_iterator.h
(gdb) p dirp->inode
$1 = {px = 0x7f1cda4fa600}
(gdb) p ((Inode *)0x7f1cda4fa600)->dir
$2 = (Dir *) 0x0
I think the process that caused the problem is as follows:
1. Thread A attempts to traverse the directory via readdir, and the subitems in this directory are already complete (I_COMPLETE|I_DIR_ORDERED). It therefore reaches _readdir_cache_cb and calls _getattr to send a request to the MDS, then releases client_lock while waiting for the response. (Under heavy load the MDS may be slow to respond.)
2. The timer thread (tick()) tries to trim_cache every second, so it may progress through
trim_cache -> trim_dentry -> unlink -> close_dir
This releases all dentries in the directory and frees the memory that dirp->inode->dir points to:
void Client::close_dir(Dir *dir)
{
Inode *in = dir->parent_inode;
ldout(cct, 15) << __func__ << " dir " << dir << " on " << in << dendl;
ceph_assert(dir->is_empty());
ceph_assert(in->dir == dir);
ceph_assert(in->dentries.size() < 2); // dirs can't be hard-linked
if (!in->dentries.empty())
in->get_first_parent()->put(); // unpin dentry
delete in->dir;
in->dir = 0;
put_inode(in); // unpin inode
}
3. So when _getattr returns in _readdir_cache_cb (thread A), the memory that dir points to has been freed.
How to fix it:
int r = _getattr(dn->inode, mask, dirp->perms);
if (r < 0)
return r;
+ if (dirp->inode->dir == NULL) {
+ ldout(cct, 0) << " dir is closed, so we should return" << dendl;
+ return -EAGAIN;
+ }
// the content of readdir_cache may change after _getattr(), so pd may be invalid iterator
pd = dir->readdir_cache.begin() + idx;
if (pd >= dir->readdir_cache.end() || *pd != dn)
return -EAGAIN;
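The guard the patch above adds can be modeled with a minimal standalone sketch (hypothetical types and names, reduced to the relevant effect): close_dir() deletes in->dir and nulls the pointer, so checking dirp->inode->dir after the lock is re-acquired catches the closed directory before the freed Dir is touched.

```cpp
#include <cassert>

// Hypothetical reduced model of the use-after-free and its guard: while
// thread A waits in _getattr() with client_lock dropped, the tick thread
// may run close_dir(), which deletes in->dir and nulls the pointer.
struct Dir { int entry_count = 0; };
struct Inode { Dir* dir = nullptr; };

constexpr int RC_EAGAIN = -11;  // stand-in for -EAGAIN

// What close_dir() does to the inode, reduced to the relevant effect.
void close_dir(Inode& in) {
    delete in.dir;
    in.dir = nullptr;
}

// The post-_getattr check from the patch: bail out with EAGAIN if the
// directory was closed while the lock was dropped, instead of touching
// the freed Dir.
int after_getattr(const Inode& in) {
    if (in.dir == nullptr)
        return RC_EAGAIN;       // caller restarts the readdir
    return in.dir->entry_count;
}
```

Returning EAGAIN here mirrors the existing `*pd != dn` check a few lines below: both fall back to re-reading the directory rather than trusting state that may have changed while client_lock was dropped.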
Updated by Venky Shankar almost 2 years ago
tod chen wrote in #note-20:
Maybe I found how to reproduce this case and how to fix it.
the coredump stack is:
[...]
I think the process that caused the problem is as follows:
1. Thread A attempts to traverse the directory via readdir, and the subitems in this directory are already complete (I_COMPLETE|I_DIR_ORDERED). It therefore reaches _readdir_cache_cb and calls _getattr to send a request to the MDS, then releases client_lock while waiting for the response. (Under heavy load the MDS may be slow to respond.)
2. The timer thread (tick()) tries to trim_cache every second, so it may progress through
trim_cache -> trim_dentry -> unlink -> close_dir
This releases all dentries in the directory and frees the memory that dirp->inode->dir points to [...]
3. So when _getattr returns in _readdir_cache_cb (thread A), the memory that dir points to has been freed.
I guess you rightly found the issue. Having a coredump to analyze speeds things up greatly. Would you be willing to send a PR?
Updated by tod chen almost 2 years ago
Venky Shankar wrote in #note-21:
tod chen wrote in #note-20:
Maybe I found how to reproduce this case and how to fix it.
the coredump stack is:
[...]
I think the process that caused the problem is as follows:
1. Thread A attempts to traverse the directory via readdir, and the subitems in this directory are already complete (I_COMPLETE|I_DIR_ORDERED). It therefore reaches _readdir_cache_cb and calls _getattr to send a request to the MDS, then releases client_lock while waiting for the response. (Under heavy load the MDS may be slow to respond.)
2. The timer thread (tick()) tries to trim_cache every second, so it may progress through
trim_cache -> trim_dentry -> unlink -> close_dir
This releases all dentries in the directory and frees the memory that dirp->inode->dir points to [...]
3. So when _getattr returns in _readdir_cache_cb (thread A), the memory that dir points to has been freed.
I guess you rightly found the issue. Having a coredump to analyze speeds things up greatly. Would you be willing to send a PR?
OK, I will send the PR soon.
Updated by tod chen over 1 year ago
Updated by Venky Shankar over 1 year ago
- Status changed from Triaged to Fix Under Review
- Target version set to v20.0.0
- Backport changed from reef,quincy,pacific to quincy,reef,squid
- Pull request ID set to 58216
tod chen wrote in #note-23:
@Venky Shankar,
hello
this is the pr: https://github.com/ceph/ceph/pull/58216
:thumbs-up:
Updated by Venky Shankar over 1 year ago
- Status changed from Fix Under Review to Pending Backport
Thanks for the fix @hit1944, much appreciated. I will take care of backporting this to releases.
Updated by Upkeep Bot over 1 year ago
- Copied to Backport #67148: reef: crash: Client::_readdir_cache_cb(dir_result_t*, int (*)(void*, dirent*, ceph_statx*, long, Inode*), void*, int, bool) added
Updated by Upkeep Bot over 1 year ago
- Copied to Backport #67149: squid: crash: Client::_readdir_cache_cb(dir_result_t*, int (*)(void*, dirent*, ceph_statx*, long, Inode*), void*, int, bool) added
Updated by Upkeep Bot over 1 year ago
- Copied to Backport #67150: quincy: crash: Client::_readdir_cache_cb(dir_result_t*, int (*)(void*, dirent*, ceph_statx*, long, Inode*), void*, int, bool) added
Updated by Upkeep Bot over 1 year ago
- Tags (freeform) set to backport_processed
Updated by Upkeep Bot 8 months ago
- Status changed from Pending Backport to Resolved
- Upkeep Timestamp set to 2025-07-09T16:45:06+00:00
Updated by Upkeep Bot 8 months ago
- Merge Commit set to 0e0128522115c212ef3f47f595ae36c85b869ef9
- Fixed In set to v19.3.0-3444-g0e01285221
- Upkeep Timestamp changed from 2025-07-09T16:45:06+00:00 to 2025-08-02T05:03:48+00:00
Updated by Upkeep Bot 5 months ago
- Released In set to v20.2.0~2490
- Upkeep Timestamp changed from 2025-08-02T05:03:48+00:00 to 2025-11-01T01:36:22+00:00