Bug #69953

open

mds: segmentation faults in recent QA

Added by Patrick Donnelly about 1 year ago. Updated about 2 months ago.

Status:
Pending Backport
Priority:
Immediate
Assignee:
Category:
Correctness/Safety
Target version:
% Done:

0%

Source:
Q/A
Backport:
tentacle,squid
Regression:
No
Severity:
1 - critical
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
MDS
Labels (FS):
crash
Pull request ID:
Tags (freeform):
temp-assign backport_processed
Fixed In:
v20.0.0-1424-g1a947b3b12
Released In:
v20.2.0~584
Upkeep Timestamp:
2025-11-01T01:00:33+00:00

Description

/a/teuthology-2025-02-01_20:24:16-fs-main-distro-default-smithi$ grep Segmentation */teu*
8108365/teuthology.log:2025-02-05T21:58:35.495 INFO:journalctl@ceph.mds.f.smithi196.stdout:Feb 05 21:58:35 smithi196 ceph-7dc4dc16-e409-11ef-bb7f-bd4984dce30f-mds-f[68126]: *** Caught signal (Segmentation fault) **
8108365/teuthology.log:2025-02-05T21:58:35.496 INFO:journalctl@ceph.mds.f.smithi196.stdout:Feb 05 21:58:35 smithi196 ceph-7dc4dc16-e409-11ef-bb7f-bd4984dce30f-mds-f[68126]: 2025-02-05T21:58:35.201+0000 7f9620957640 -1 *** Caught signal (Segmentation fault) **
8108365/teuthology.log:2025-02-05T21:58:35.497 INFO:journalctl@ceph.mds.f.smithi196.stdout:Feb 05 21:58:35 smithi196 ceph-7dc4dc16-e409-11ef-bb7f-bd4984dce30f-mds-f[68126]:      0> 2025-02-05T21:58:35.201+0000 7f9620957640 -1 *** Caught signal (Segmentation fault) **
8108365/teuthology.log:2025-02-05T21:58:35.498 INFO:journalctl@ceph.mds.f.smithi196.stdout:Feb 05 21:58:35 smithi196 ceph-7dc4dc16-e409-11ef-bb7f-bd4984dce30f-mds-f[68126]:      0> 2025-02-05T21:58:35.201+0000 7f9620957640 -1 *** Caught signal (Segmentation fault) **
8108365/teuthology.log:2025-02-05T22:01:35.418 INFO:journalctl@ceph.mds.f.smithi196.stdout:Feb 05 22:01:35 smithi196 ceph-7dc4dc16-e409-11ef-bb7f-bd4984dce30f-mds-f[71094]: *** Caught signal (Segmentation fault) **
8108365/teuthology.log:2025-02-05T22:01:35.419 INFO:journalctl@ceph.mds.f.smithi196.stdout:Feb 05 22:01:35 smithi196 ceph-7dc4dc16-e409-11ef-bb7f-bd4984dce30f-mds-f[71094]: 2025-02-05T22:01:35.064+0000 7f8a73e5b640 -1 *** Caught signal (Segmentation fault) **
8108365/teuthology.log:2025-02-05T22:01:35.420 INFO:journalctl@ceph.mds.f.smithi196.stdout:Feb 05 22:01:35 smithi196 ceph-7dc4dc16-e409-11ef-bb7f-bd4984dce30f-mds-f[71094]:      0> 2025-02-05T22:01:35.064+0000 7f8a73e5b640 -1 *** Caught signal (Segmentation fault) **
8108365/teuthology.log:2025-02-05T22:01:35.421 INFO:journalctl@ceph.mds.f.smithi196.stdout:Feb 05 22:01:35 smithi196 ceph-7dc4dc16-e409-11ef-bb7f-bd4984dce30f-mds-f[71094]:      0> 2025-02-05T22:01:35.064+0000 7f8a73e5b640 -1 *** Caught signal (Segmentation fault) **
8108365/teuthology.log:2025-02-05T22:03:59.167 INFO:journalctl@ceph.mds.f.smithi196.stdout:Feb 05 22:03:58 smithi196 ceph-7dc4dc16-e409-11ef-bb7f-bd4984dce30f-mds-f[71936]: *** Caught signal (Segmentation fault) **
8108365/teuthology.log:2025-02-05T22:03:59.168 INFO:journalctl@ceph.mds.f.smithi196.stdout:Feb 05 22:03:58 smithi196 ceph-7dc4dc16-e409-11ef-bb7f-bd4984dce30f-mds-f[71936]: 2025-02-05T22:03:58.897+0000 7f751e9eb640 -1 *** Caught signal (Segmentation fault) **
8108365/teuthology.log:2025-02-05T22:03:59.169 INFO:journalctl@ceph.mds.f.smithi196.stdout:Feb 05 22:03:58 smithi196 ceph-7dc4dc16-e409-11ef-bb7f-bd4984dce30f-mds-f[71936]:      0> 2025-02-05T22:03:58.897+0000 7f751e9eb640 -1 *** Caught signal (Segmentation fault) **
8108365/teuthology.log:2025-02-05T22:03:59.171 INFO:journalctl@ceph.mds.f.smithi196.stdout:Feb 05 22:03:58 smithi196 ceph-7dc4dc16-e409-11ef-bb7f-bd4984dce30f-mds-f[71936]:      0> 2025-02-05T22:03:58.897+0000 7f751e9eb640 -1 *** Caught signal (Segmentation fault) **
8108365/teuthology.log:2025-02-05T22:04:17.099 INFO:journalctl@ceph.mds.l.smithi196.stdout:Feb 05 22:04:16 smithi196 ceph-7dc4dc16-e409-11ef-bb7f-bd4984dce30f-mds-l[67851]: *** Caught signal (Segmentation fault) **
8108365/teuthology.log:2025-02-05T22:04:17.101 INFO:journalctl@ceph.mds.l.smithi196.stdout:Feb 05 22:04:16 smithi196 ceph-7dc4dc16-e409-11ef-bb7f-bd4984dce30f-mds-l[67851]: 2025-02-05T22:04:16.693+0000 7f18cb072640 -1 *** Caught signal (Segmentation fault) **
8108365/teuthology.log:2025-02-05T22:04:17.102 INFO:journalctl@ceph.mds.l.smithi196.stdout:Feb 05 22:04:16 smithi196 ceph-7dc4dc16-e409-11ef-bb7f-bd4984dce30f-mds-l[67851]:      0> 2025-02-05T22:04:16.693+0000 7f18cb072640 -1 *** Caught signal (Segmentation fault) **
8108365/teuthology.log:2025-02-05T22:04:17.104 INFO:journalctl@ceph.mds.l.smithi196.stdout:Feb 05 22:04:16 smithi196 ceph-7dc4dc16-e409-11ef-bb7f-bd4984dce30f-mds-l[67851]:      0> 2025-02-05T22:04:16.693+0000 7f18cb072640 -1 *** Caught signal (Segmentation fault) **
8108365/teuthology.log:2025-02-05T22:06:05.167 INFO:journalctl@ceph.mds.f.smithi196.stdout:Feb 05 22:06:04 smithi196 ceph-7dc4dc16-e409-11ef-bb7f-bd4984dce30f-mds-f[72705]: *** Caught signal (Segmentation fault) **
8108365/teuthology.log:2025-02-05T22:08:59.667 INFO:journalctl@ceph.mds.f.smithi196.stdout:Feb 05 22:08:59 smithi196 ceph-7dc4dc16-e409-11ef-bb7f-bd4984dce30f-mds-f[73220]: *** Caught signal (Segmentation fault) **
8108376/teuthology.log:2025-02-08T10:25:08.173 INFO:journalctl@ceph.mds.h.smithi022.stdout:Feb 08 10:25:07 smithi022 ceph-1c7f3ed2-e603-11ef-bb7f-bd4984dce30f-mds-h[67310]: *** Caught signal (Segmentation fault) **
8108380/teuthology.log:2025-02-08T10:20:08.770 INFO:journalctl@ceph.mds.c.smithi126.stdout:Feb 08 10:20:08 smithi126 ceph-dd7d6730-e603-11ef-bb7f-bd4984dce30f-mds-c[68624]: *** Caught signal (Segmentation fault) **
8108380/teuthology.log:2025-02-08T10:20:08.771 INFO:journalctl@ceph.mds.c.smithi126.stdout:Feb 08 10:20:08 smithi126 ceph-dd7d6730-e603-11ef-bb7f-bd4984dce30f-mds-c[68624]: 2025-02-08T10:20:08.392+0000 7f5ff818f640 -1 *** Caught signal (Segmentation fault) **
8108380/teuthology.log:2025-02-08T10:20:08.772 INFO:journalctl@ceph.mds.c.smithi126.stdout:Feb 08 10:20:08 smithi126 ceph-dd7d6730-e603-11ef-bb7f-bd4984dce30f-mds-c[68624]:      0> 2025-02-08T10:20:08.392+0000 7f5ff818f640 -1 *** Caught signal (Segmentation fault) **
8108380/teuthology.log:2025-02-08T10:20:08.773 INFO:journalctl@ceph.mds.c.smithi126.stdout:Feb 08 10:20:08 smithi126 ceph-dd7d6730-e603-11ef-bb7f-bd4984dce30f-mds-c[68624]:     -1> 2025-02-08T10:20:08.392+0000 7f5ff818f640 -1 *** Caught signal (Segmentation fault) **
8108479/teuthology.log:2025-02-08T12:00:34.246 INFO:journalctl@ceph.mds.f.smithi110.stdout:Feb 08 12:00:33 smithi110 ceph-0b4cc472-e612-11ef-bb7f-bd4984dce30f-mds-f[67785]: *** Caught signal (Segmentation fault) **
8108494/teuthology.log:2025-02-08T12:25:20.381 INFO:journalctl@ceph.mds.e.smithi137.stdout:Feb 08 12:25:20 smithi137 ceph-a6e2ee36-e615-11ef-bb7f-bd4984dce30f-mds-e[67699]: *** Caught signal (Segmentation fault) **
8108514/teuthology.log:2025-02-08T12:55:42.607 INFO:journalctl@ceph.mds.c.smithi139.stdout:Feb 08 12:55:42 smithi139 ceph-4a546862-e619-11ef-bb7f-bd4984dce30f-mds-c[68955]: *** Caught signal (Segmentation fault) **
8108514/teuthology.log:2025-02-08T13:02:47.357 INFO:journalctl@ceph.mds.l.smithi139.stdout:Feb 08 13:02:47 smithi139 ceph-4a546862-e619-11ef-bb7f-bd4984dce30f-mds-l[68399]: *** Caught signal (Segmentation fault) **
8108514/teuthology.log:2025-02-08T13:07:46.358 INFO:journalctl@ceph.mds.l.smithi139.stdout:Feb 08 13:07:45 smithi139 ceph-4a546862-e619-11ef-bb7f-bd4984dce30f-mds-l[73073]: *** Caught signal (Segmentation fault) **
8108514/teuthology.log:2025-02-08T13:07:46.359 INFO:journalctl@ceph.mds.l.smithi139.stdout:Feb 08 13:07:45 smithi139 ceph-4a546862-e619-11ef-bb7f-bd4984dce30f-mds-l[73073]: 2025-02-08T13:07:45.991+0000 7fab76d51640 -1 *** Caught signal (Segmentation fault) **
8108514/teuthology.log:2025-02-08T13:07:46.360 INFO:journalctl@ceph.mds.l.smithi139.stdout:Feb 08 13:07:46 smithi139 ceph-4a546862-e619-11ef-bb7f-bd4984dce30f-mds-l[73073]:      0> 2025-02-08T13:07:45.991+0000 7fab76d51640 -1 *** Caught signal (Segmentation fault) **
8108514/teuthology.log:2025-02-08T13:07:46.361 INFO:journalctl@ceph.mds.l.smithi139.stdout:Feb 08 13:07:46 smithi139 ceph-4a546862-e619-11ef-bb7f-bd4984dce30f-mds-l[73073]:      0> 2025-02-08T13:07:45.991+0000 7fab76d51640 -1 *** Caught signal (Segmentation fault) **
...

https://pulpito.ceph.com/teuthology-2025-02-01_20:24:16-fs-main-distro-default-smithi/

Two issues here:

- the segmentation faults obviously
- teuthology is not reporting the core dumps as the primary failure reason; we should NEVER have segmentation faults, and all other failure reasons are simply irrelevant in comparison

Whoever takes this: we need to figure out when these Segmentation faults were introduced; look at older QA runs to help bisect.


Related issues 9 (5 open, 4 closed)

Related to CephFS - Bug #68914: mds: Segmentation fault in mds_log_replay / MR_Finisher thread (Triaged, Venky Shankar)
Related to Orchestrator - Bug #70247: Non-zero exit code 1 from systemctl reset-failed ceph-47356c0e-f761-11ef-bb88-bd4984dce30f@mon.a (New)
Related to CephFS - Bug #70624: qa: assertion failure on context completion of C_MDS_RetryRequest (Resolved, Mahesh Mohan)
Related to CephFS - Bug #70761: qa: mds crash and traceback seen when running fs:workload suite (Triaged, Mahesh Mohan)
Related to CephFS - Bug #70723: qa: AddressSanitizer reports heap-use-after-free in mds-log-replay thread (Resolved, Milind Changire)
Related to CephFS - Bug #71996: cluster [WRN] Health check failed: 1 failed cephadm daemon(s) (CEPHADM_FAILED_DAEMON) (Need More Info)
Copied to CephFS - Backport #70924: reef: mds: segmentation faults in recent QA (QA Testing, Mahesh Mohan)
Copied to CephFS - Backport #70925: squid: mds: segmentation faults in recent QA (Resolved, Milind Changire)
Copied to CephFS - Backport #72653: tentacle: mds: segmentation faults in recent QA (Resolved, Milind Changire)
Actions #1

Updated by Venky Shankar about 1 year ago

Patrick Donnelly wrote:

[...]

https://pulpito.ceph.com/teuthology-2025-02-01_20:24:16-fs-main-distro-default-smithi/

Two issues here:

- the segmentation faults obviously
- teuthology is not reporting the core dumps as the primary failure reason; we should NEVER have segmentation faults, and all other failure reasons are simply irrelevant in comparison

I remember flagging this sometime last year in some forum - we obviously didn't take it seriously :/

FWIW, I always do a

find <run> -name "*core*"

for the fs suite run to avoid such mysteries (I haven't done an fs suite run since mid-January 2025 though).

Actions #2

Updated by Venky Shankar about 1 year ago

... and here is the crash backtrace

    -7> 2025-02-05T21:58:35.200+0000 7f9620957640 10 mds.0.log _replay: read_pos == write_pos
    -6> 2025-02-05T21:58:35.200+0000 7f9620957640 10 mds.0.log _replay - complete, 58099 events
    -5> 2025-02-05T21:58:35.200+0000 7f9620957640 10 mds.0.log _replay_thread kicking waiters
    -4> 2025-02-05T21:58:35.200+0000 7f9620957640 10 MDSContext::complete: 15C_MDS_BootStart
    -3> 2025-02-05T21:58:35.200+0000 7f9620957640  5 mds.0.0 Finished replaying journal as standby-replay
    -2> 2025-02-05T21:58:35.200+0000 7f9620957640 10 mds.0.0 setting replay timer
    -1> 2025-02-05T21:58:35.200+0000 7f9620957640 10 mds.0.log _replay_thread finish
     0> 2025-02-05T21:58:35.201+0000 7f9620957640 -1 *** Caught signal (Segmentation fault) **
 in thread 7f9620957640 thread_name:mds-log-replay

 ceph version 19.3.0-7232-g44b51db6 (44b51db6813fb456c78075909d800e4ec3b2679f) squid (dev)
 1: /lib64/libc.so.6(+0x3e930) [0x7f962dd40930]
 2: (tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*, unsigned int, int)+0x93) [0x7f962ec92713]
 3: (tcmalloc::ThreadCache::Cleanup()+0x48) [0x7f962ec92818]
 4: (tcmalloc::ThreadCache::DeleteCache(tcmalloc::ThreadCache*)+0x12) [0x7f962ec92b82]
 5: /lib64/libc.so.6(+0x873c1) [0x7f962dd893c1]
 6: /lib64/libc.so.6(+0x8a166) [0x7f962dd8c166]
 7: /lib64/libc.so.6(+0x10f300) [0x7f962de11300]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

I have a feeling this is somewhat related to the MR_Finisher thread crash.

Actions #3

Updated by Venky Shankar about 1 year ago

  • Related to Bug #68914: mds: Segmentation fault in mds_log_replay / MR_Finisher thread added
Actions #4

Updated by Venky Shankar about 1 year ago

Also, there aren't any core dumps generated from the crash. Now I feel like an idiot doing the find ... I mentioned in note-1.

Actions #5

Updated by Milind Changire about 1 year ago

  • Assignee set to Neeraj Pratap Singh
Actions #6

Updated by Venky Shankar about 1 year ago

  • Assignee changed from Neeraj Pratap Singh to Milind Changire
Actions #7

Updated by Afreen Misbah about 1 year ago

  • Related to Bug #69864: use same name for image size in create/update api added
Actions #8

Updated by Afreen Misbah about 1 year ago

  • Related to deleted (Bug #69864: use same name for image size in create/update api)
Actions #9

Updated by Milind Changire about 1 year ago · Edited

I've searched for Scrub error and Segmentation fault in the logs and these are possibly the first runs with Segmentation fault pointing to thread MR_Finisher or md_log_replay:

/teuthology/pdonnell-2022-08-19_22:40:41-fs:workload-wip-pdonnell-testing-20220819.203214-distro-default-smithi/6981777/teuthology.log.gz
2022-08-20T07:19:10.873 INFO:journalctl@ceph.mds.f.smithi099.stdout:Aug 20 07:19:10 smithi099 ceph-1066cfae-2056-11ed-8431-001a4aab830c-mds-f[124400]:  in thread 7f4716fc9700 thread_name:MR_Finisher
---
/teuthology/pdonnell-2022-08-22_18:53:15-fs:workload-wip-pdonnell-testing-20220822.164347-distro-default-smithi/6986012/teuthology.log.gz
2022-08-23T09:05:04.361 INFO:journalctl@ceph.mds.h.smithi055.stdout:Aug 23 09:05:03 smithi055 ceph-ecdc59ce-22c0-11ed-8431-001a4aab830c-mds-h[123274]:  in thread 7f72490d6700 thread_name:md_log_replay

oh, and also no core dumps accompanying the segfaults either.

Actions #10

Updated by Venky Shankar about 1 year ago

Milind Changire wrote in #note-9:

I've searched for Scrub error and Segmentation fault in the logs and these are possibly the first runs with Segmentation fault pointing to thread MR_Finisher or md_log_replay:

[...]

oh, and also no core dumps accompanying the segfaults either.

oh, wow - this has been happening for >2 years. So, we obviously need teuthology to fail a run when any ceph daemon crashes and ensure that the coredump survives.

As far as this issue is concerned, do we know the PRs in the batch where the issue was first seen? It's possible that the crash started to happen before that, in which case that SHA can be the bisect point.

Actions #11

Updated by Venky Shankar about 1 year ago

Also, let's link a tracker to this one to follow up with the teuthology folks about flagging run failures when any ceph daemon crashes.

Actions #12

Updated by Milind Changire about 1 year ago

Venky Shankar wrote in #note-10:

Milind Changire wrote in #note-9:

I've searched for Scrub error and Segmentation fault in the logs and these are possibly the first runs with Segmentation fault pointing to thread MR_Finisher or md_log_replay:

[...]

oh, and also no core dumps accompanying the segfaults either.

oh, wow - this has been happening for >2 years. So, we obviously need teuthology to fail a run when any ceph daemon crashes and ensure that the coredump survives.

As far as this issue is concerned, do we know the PRs in the batch where the issue was first seen? It's possible that the crash started to happen before that, in which case that SHA can be the bisect point.

I've started a git bisect with Patrick's PR as the HEAD ... but it looks like the build farm has started acting up.

Actions #13

Updated by Patrick Donnelly about 1 year ago

Venky Shankar wrote in #note-10:

Milind Changire wrote in #note-9:

I've searched for Scrub error and Segmentation fault in the logs and these are possibly the first runs with Segmentation fault pointing to thread MR_Finisher or md_log_replay:

[...]

oh, and also no core dumps accompanying the segfaults either.

oh, wow - this has been happening for >2 years. So, we obviously need teuthology to fail a run when any ceph daemon crashes and ensure that the coredump survives.

The "watchdog" Jos wrote is supposed to fail a run but tearing down a running test is messy so it often appears to fail for other reasons. No idea why this particular test did not get torn down by the watchdog however. Perhaps because it's cephadm?

Actions #14

Updated by Venky Shankar about 1 year ago

Patrick Donnelly wrote in #note-13:

Venky Shankar wrote in #note-10:

Milind Changire wrote in #note-9:

I've searched for Scrub error and Segmentation fault in the logs and these are possibly the first runs with Segmentation fault pointing to thread MR_Finisher or md_log_replay:

[...]

oh, and also no core dumps accompanying the segfaults either.

oh, wow - this has been happening for >2 years. So, we obviously need teuthology to fail a run when any ceph daemon crashes and ensure that the coredump survives.

The "watchdog" Jos wrote is supposed to fail a run but tearing down a running test is messy so it often appears to fail for other reasons. No idea why this particular test did not get torn down by the watchdog however. Perhaps because it's cephadm?

If that's the case, then the qa suite hasn't really been reporting daemon crashes since quincy, and that probably explains how long this crash has been around.

Coming to the crash itself:

 1: /lib64/libc.so.6(+0x3e930) [0x7f962dd40930]
 2: (tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*, unsigned int, int)+0x93) [0x7f962ec92713]
 3: (tcmalloc::ThreadCache::Cleanup()+0x48) [0x7f962ec92818]
 4: (tcmalloc::ThreadCache::DeleteCache(tcmalloc::ThreadCache*)+0x12) [0x7f962ec92b82]
 5: /lib64/libc.so.6(+0x873c1) [0x7f962dd893c1]
 6: /lib64/libc.so.6(+0x8a166) [0x7f962dd8c166]
 7: /lib64/libc.so.6(+0x10f300) [0x7f962de11300]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

This looks like a fault when releasing memory back to tcmalloc. I'm not sure if this is something in tcmalloc or in our code (maybe using the libc allocator and running the test might give a more understandable backtrace?).

Actions #15

Updated by Milind Changire about 1 year ago

I've raised Orchestrator Bug #70247 since the containers just don't run for me anymore.
This issue seems to have resurfaced.
I'm not sure if it's because of the cephadm commit that I have in the branch.

Actions #16

Updated by Venky Shankar about 1 year ago

  • Related to Bug #70247: Non-zero exit code 1 from systemctl reset-failed ceph-47356c0e-f761-11ef-bb88-bd4984dce30f@mon.a added
Actions #17

Updated by Milind Changire about 1 year ago

https://pulpito.ceph.com/mchangir-2025-03-03_17:42:13-fs:workload-wip-mchangir-use-libc-for-segfault-main-debug-testing-default-smithi/8167010

The above job was using a build with the libc allocator (not the tcmalloc allocator):

here's the stack trace of the crash in the mgr:

    -2> 2025-03-03T18:20:37.037+0000 7f3289ffb640 10 log_client handle_log_ack log(last 374)
    -1> 2025-03-03T18:20:37.037+0000 7f3289ffb640 10 log_client  logged 2025-03-03T18:20:35.945206+0000 mgr.x (mgr.14232) 373 : cluster [DBG] pgmap v353: 129 pgs: 129 active+clean; 651 KiB data, 388 MiB used, 1.0 TiB / 1.0 TiB avail; 4.0 KiB/s rd, 682 B/s wr, 6 op/s
     0> 2025-03-03T18:20:37.039+0000 7f3289ffb640 -1 *** Caught signal (Aborted) **
 in thread 7f3289ffb640 thread_name:ms_dispatch

 ceph version 19.3.0-7772-gcfa5ba05 (cfa5ba052b03f5b29c75de806210d0bdc7462583) squid (dev)
 1: /lib64/libc.so.6(+0x3ebf0) [0x7f32a0fc8bf0]
 2: /lib64/libc.so.6(+0x8bd4c) [0x7f32a1015d4c]
 3: raise()
 4: abort()
 5: /lib64/libc.so.6(+0x29172) [0x7f32a0fb3172]
 6: /lib64/libc.so.6(+0x95df7) [0x7f32a101fdf7]
 7: /lib64/libc.so.6(+0x97b5a) [0x7f32a1021b5a]
 8: free()
 9: /usr/lib64/ceph/libceph-common.so.2(+0x215039) [0x7f32a16c7039]
 10: /usr/lib64/ceph/libceph-common.so.2(+0x216cd6) [0x7f32a16c8cd6]
 11: (LogClient::handle_log_ack(MLogAck*)+0x62b) [0x7f32a16d241f]
 12: (MonClient::ms_dispatch(Message*)+0x556) [0x7f32a1a0f8fa]
 13: /usr/lib64/ceph/libceph-common.so.2(+0x554fa6) [0x7f32a1a06fa6]
 14: /usr/lib64/ceph/libceph-common.so.2(+0x3a8255) [0x7f32a185a255]
 15: (DispatchQueue::entry()+0x663) [0x7f32a185abfb]
 16: /usr/lib64/ceph/libceph-common.so.2(+0x4965eb) [0x7f32a19485eb]
 17: (Thread::entry_wrapper()+0x33) [0x7f32a16e4c63]
 18: (Thread::_entry_func(void*)+0xd) [0x7f32a16e4c79]
 19: /lib64/libc.so.6(+0x8a002) [0x7f32a1014002]
 20: /lib64/libc.so.6(+0x10f070) [0x7f32a1099070]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

I'll be trying out builds with:
  1. -fsanitize=address
  2. valgrind

to see if we can catch this earlier

Actions #18

Updated by Milind Changire about 1 year ago

libtool hates -fsanitize=address
the only way I can instrument the code with the Address Sanitizer is to disable the use of libtool during the build
however, many projects and dependent RPMs rely on libtool ... earlier I thought it was only the erasure-code project
I even resorted to removing the libtool RPM from the system in the %build phase ... which revealed other package dependencies on libtool

there are some docs and YAMLs in the teuthology repo under the docs/laptop/ dir
I'm going to see if a mock teuthology setup is possible on my laptop and if I can run the tests locally.

Actions #19

Updated by Milind Changire about 1 year ago

since the build on my laptop doesn't invoke libtool (AFAICT), the only way forward seems to be to build with -fsanitize=address, build the cluster manually, run the fs:workload tests, and await a Segmentation fault and eventually a core dump

Actions #20

Updated by Milind Changire about 1 year ago

So far I've been able to quiesce two use-after-free issues:
1. mgr: a dereference issue with MgrOpRequest (quiesced by adding a ref)
2. mds: a dereference issue with LogSegment (quiesced by making LogSegment a RefCountedObj and adding ref get() and put() calls) - a small sketch of that pattern follows below
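
For illustration, here is a minimal standalone sketch of the get()/put() refcounting pattern mentioned in item 2. Refcounted and FakeLogSegment are hypothetical, simplified stand-ins, not the actual Ceph RefCountedObject/LogSegment classes:

    #include <atomic>
    #include <cstdint>
    #include <functional>
    #include <iostream>
    #include <vector>

    // Simplified stand-in for a refcounted base: get() bumps the count, put()
    // drops it and deletes the object when the last reference goes away.
    struct Refcounted {
      std::atomic<int> nref{1};
      Refcounted* get() { nref.fetch_add(1); return this; }
      void put() { if (nref.fetch_sub(1) == 1) delete this; }
      virtual ~Refcounted() = default;
    };

    // Hypothetical stand-in for LogSegment.
    struct FakeLogSegment : Refcounted {
      uint64_t seq;
      explicit FakeLogSegment(uint64_t s) : seq(s) {}
      ~FakeLogSegment() override { std::cout << "segment " << seq << " freed\n"; }
    };

    int main() {
      std::vector<std::function<void()>> pending;  // models deferred expiry callbacks

      auto* seg = new FakeLogSegment(42);

      // Take a ref before handing the segment to deferred work, so the owner's
      // final put() cannot free it underneath the callback.
      seg->get();
      pending.emplace_back([seg] {
        std::cout << "expiring segment " << seg->seq << "\n";
        seg->put();  // release the callback's reference
      });

      seg->put();  // owner drops its reference; the callback's ref keeps it alive

      for (auto& cb : pending) cb();  // segment is freed only after the last put()
    }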

fyi - extensive suite-wide tests have not been exercised yet

I still have 1 use-after-free issue to address.
This is related to a continuation context being referenced after being destroyed: C_Flush_Journal

Actions #21

Updated by Milind Changire about 1 year ago · Edited

further investigation reveals ...

C_Flush_Journal::expire_segments() adds a new sub to the gather context for each expiring segment
however, the context added to the gather calls C_Flush_Journal::trim_expired_segments()
and that function further down the line calls C_Flush_Journal::complete() ... which destroys the C_Flush_Journal object
so a second invocation from the gather context of the expiring segments proceeds to invoke C_Flush_Journal::trim_expired_segments() again
... leading to a use-after-free of the C_Flush_Journal object

should I make C_Flush_Journal a RefCountedObj as well?

the above issue is seen via the "flush journal" asok command where the C_Flush_Journal object is created

there's a second instance of creation of the C_Flush_Journal object that needs to be investigated as well

Actions #22

Updated by Milind Changire about 1 year ago

for the C_Flush_Journal issue, I wonder if it would be sufficient to associate the lambda context completion with the last of the expiring segments, i.e. would it be safe to assume that the expiry happens in "order", so that we don't need to add ref counting to C_Flush_Journal

Actions #23

Updated by Venky Shankar about 1 year ago

Milind Changire wrote in #note-21:

further investigation reveals ...

C_Flush_Journal::expire_segments() adds new sub to the gather context for each expiring segment
however, the context added to the gather context calls C_Flush_Journal::trim_expired_segments()
and this function further down the line calls C_Flush_Journal::complete() ... which destroys the C_Flush_Journal object
so the second invocation of the gather context of the expiring segments proceeds to invoke C_Flush_Journal::trim_expired_segments() again
... leading to a use-after-free of the C_Flush_Journal object

should I make the C_Flush_Journal a RefCountedObj object as well ?

The gather completion would only be called when the gather context is activated and when all subs finish. In this case, C_Flush_Journal::trim_expired_segments() would only be called after expiry_gather.activate() is invoked (in C_Flush_Journal::expire_segments()) and all gather subs finish.
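
To make that lifecycle concrete, here is a minimal standalone model of the gather pattern being discussed. The Gather class below is a hypothetical simplification, not the real C_GatherBuilder: the single finisher runs exactly once, only after activate() has been called and every sub has completed.

    #include <functional>
    #include <iostream>
    #include <vector>

    // Simplified model of a gather: subs are counted, and the finisher runs
    // exactly once, after activate() and after the last sub completes.
    class Gather {
      int pending = 0;
      bool activated = false;
      std::function<void(int)> finisher;

      void maybe_finish() {
        if (activated && pending == 0 && finisher) {
          auto fin = std::move(finisher);
          finisher = nullptr;
          fin(0);  // runs once; the only safe place for a completion that frees its owner
        }
      }

    public:
      std::function<void(int)> new_sub() {
        ++pending;
        return [this](int) { --pending; maybe_finish(); };
      }
      void set_finisher(std::function<void(int)> f) { finisher = std::move(f); }
      void activate() { activated = true; maybe_finish(); }
    };

    int main() {
      Gather gather;
      std::vector<std::function<void(int)>> subs;
      for (int i = 0; i < 3; ++i)
        subs.push_back(gather.new_sub());  // one sub per expiring segment

      gather.set_finisher([](int) {
        std::cout << "all subs done: trim_expired_segments() would run here, once\n";
      });
      gather.activate();

      for (auto& sub : subs)
        sub(0);  // the finisher fires only after the last sub completes
    }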

Actions #24

Updated by Milind Changire about 1 year ago

okay, here's the update about C_Flush_Journal:

  void trim_expired_segments() {
    ceph_assert(ceph_mutex_is_locked_by_me(mds->mds_lock));
    dout(5) << __func__ << ": expiry complete, expire_pos/trim_pos is now " 
            << std::hex << mdlog->get_journaler()->get_expire_pos() << "/" 
            << mdlog->get_journaler()->get_trimmed_pos() << dendl;

    // Now everyone I'm interested in is expired
    auto* ctx = new MDSInternalContextWrapper(mds, new LambdaContext([this](int r) {
      handle_write_head(r);
    }));
    mdlog->trim_expired_segments(ctx);

    dout(5) << __func__ << ": trimming is complete; wait for journal head write. Journal expire_pos/trim_pos is now " 
            << std::hex << mdlog->get_journaler()->get_expire_pos() << "/" 
            << mdlog->get_journaler()->get_trimmed_pos() << dendl;
  }

Venky helped identify the culprit: the last dout executes after the context completion and references data members of the (by then destroyed) C_Flush_Journal object.
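
One way to sidestep that (a sketch only, not necessarily what the merged fix does) is to reorder the tail of trim_expired_segments() so the log line runs before the context hand-off:

    // Sketch of a reordering that avoids the problem: log first, because once
    // mdlog->trim_expired_segments(ctx) is called, ctx may complete and destroy
    // this C_Flush_Journal, and no data member of this object may be
    // dereferenced afterwards.
    dout(5) << __func__ << ": trimming requested; expire_pos/trim_pos is now "
            << std::hex << mdlog->get_journaler()->get_expire_pos() << "/"
            << mdlog->get_journaler()->get_trimmed_pos() << dendl;

    auto* ctx = new MDSInternalContextWrapper(mds, new LambdaContext([this](int r) {
      handle_write_head(r);
    }));
    mdlog->trim_expired_segments(ctx);
    // intentionally nothing after the hand-off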

Actions #25

Updated by Milind Changire about 1 year ago

another use-after-free event in C_Flush_Journal ...

C_Flush_Journal::flush_mdlog() creates a subtreemap event and submits it to MDLog. The event gets destroyed after being handed over to the Journal. C_Flush_Journal::flush_mdlog() then reads the sequence number of the submitted event ... which is a use-after-free violation.
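
A tiny standalone illustration of the safe ordering (hypothetical names; FakeEvent and submit_event are not the real MDLog/LogEvent API): read the sequence number before ownership of the event is handed over.

    #include <cstdint>
    #include <iostream>
    #include <memory>
    #include <vector>

    // Hypothetical stand-in for a journal event handed over to the log.
    struct FakeEvent {
      uint64_t seq = 0;
    };

    // Models submission: the "journal" takes ownership and may free the event at
    // any time afterwards, so the caller must not touch it after the hand-off.
    std::vector<std::unique_ptr<FakeEvent>> journal;

    uint64_t submit_event(std::unique_ptr<FakeEvent> ev, uint64_t next_seq) {
      ev->seq = next_seq;
      uint64_t seq = ev->seq;            // capture everything needed *before* handover
      journal.push_back(std::move(ev));  // ownership transferred; `ev` must not be used again
      return seq;
    }

    int main() {
      uint64_t seq = submit_event(std::make_unique<FakeEvent>(), 123);
      std::cout << "submitted event, seq " << seq << "\n";  // safe: seq was captured first
    }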

Actions #26

Updated by Milind Changire 12 months ago

  • Related to Bug #70624: qa: assertion failure on context completion of C_MDS_RetryRequest added
  • Related to Bug #70761: qa: mds crash and traceback seen when running fs:workload suite added
  • Related to Bug #70723: qa: AddressSanitizer reports heap-use-after-free in mds-log-replay thread added
Actions #27

Updated by Venky Shankar 12 months ago

  • Status changed from New to Fix Under Review
  • Backport set to reef,squid
  • Pull request ID set to 62553
Actions #28

Updated by Venky Shankar 11 months ago

  • Status changed from Fix Under Review to Pending Backport
Actions #29

Updated by Upkeep Bot 11 months ago

  • Copied to Backport #70924: reef: mds: segmentation faults in recent QA added
Actions #30

Updated by Upkeep Bot 11 months ago

  • Copied to Backport #70925: squid: mds: segmentation faults in recent QA added
Actions #31

Updated by Upkeep Bot 11 months ago

  • Tags (freeform) set to backport_processed
Actions #32

Updated by Venky Shankar 11 months ago

  • Status changed from Pending Backport to Fix Under Review

This is an umbrella tracker - other fixes are still under review.

Actions #33

Updated by Patrick Donnelly 9 months ago

  • Status changed from Fix Under Review to Pending Backport
  • Backport changed from reef,squid to tentacle,squid
Actions #34

Updated by Upkeep Bot 9 months ago

  • Merge Commit set to 1a947b3b1273f040cb2ef904cd9b4d02e3978120
  • Fixed In set to v20.0.0-1424-g1a947b3b127
  • Upkeep Timestamp set to 2025-07-08T18:07:30+00:00
Actions #35

Updated by Venky Shankar 8 months ago

  • Related to Bug #71996: cluster [WRN] Health check failed: 1 failed cephadm daemon(s) (CEPHADM_FAILED_DAEMON)" added
Actions #36

Updated by Upkeep Bot 8 months ago

  • Fixed In changed from v20.0.0-1424-g1a947b3b127 to v20.0.0-1424-g1a947b3b1273
  • Upkeep Timestamp changed from 2025-07-08T18:07:30+00:00 to 2025-07-14T15:21:59+00:00
Actions #37

Updated by Upkeep Bot 8 months ago

  • Fixed In changed from v20.0.0-1424-g1a947b3b1273 to v20.0.0-1424-g1a947b3b12
  • Upkeep Timestamp changed from 2025-07-14T15:21:59+00:00 to 2025-07-14T20:46:27+00:00
Actions #38

Updated by Milind Changire 7 months ago

  • Status changed from Pending Backport to New

trying to trigger backport tracker cloning for tentacle

Actions #39

Updated by Milind Changire 7 months ago

  • Status changed from New to Pending Backport
Actions #40

Updated by Venky Shankar 7 months ago

Milind Changire wrote in #note-38:

trying to trigger backport tracker cloning for tentacle

These should already be in the tentacle branch, shouldn't they?

Actions #41

Updated by Milind Changire 7 months ago

  • Status changed from Pending Backport to Fix Under Review
  • Tags (freeform) deleted (backport_processed)

trying to retrigger tracker cloning to tentacle

Actions #42

Updated by Milind Changire 7 months ago

  • Status changed from Fix Under Review to Pending Backport
Actions #43

Updated by Upkeep Bot 7 months ago

  • Copied to Backport #72653: tentacle: mds: segmentation faults in recent QA added
Actions #44

Updated by Upkeep Bot 7 months ago

  • Tags (freeform) set to backport_processed
Actions #45

Updated by Milind Changire 7 months ago

Venky Shankar wrote in #note-40:

Milind Changire wrote in #note-38:

trying to trigger backport tracker cloning for tentacle

These should already be in the tentacle branch, shouldn't they?

yes ... indeed they are
didn't check that before
sorry for the confusion

I've restored the backport_processed tag as well

Actions #46

Updated by Venky Shankar 7 months ago

Milind Changire wrote in #note-45:

Venky Shankar wrote in #note-40:

Milind Changire wrote in #note-38:

trying to trigger backport tracker cloning for tentacle

These should already be in the tentacle branch, shouldn't they?

yes ... indeed they are
didn't check that before
sorry for the confusion

I've restored the backport_processed tag as well

Yeh. But the bot created the backport tracker which should be closed now.

Actions #47

Updated by Upkeep Bot 5 months ago

  • Released In set to v20.2.0~584
  • Upkeep Timestamp changed from 2025-07-14T20:46:27+00:00 to 2025-11-01T01:00:33+00:00
Actions #48

Updated by Venky Shankar about 2 months ago

  • Assignee changed from Milind Changire to Mahesh Mohan
  • Tags (freeform) changed from backport_processed to temp-assign
Actions #49

Updated by Upkeep Bot about 2 months ago

  • Tags (freeform) changed from temp-assign to temp-assign backport_processed