Bug #67696

open

ceph-mgr segfaults upon startup

Added by Kaleb KEITHLEY over 1 year ago. Updated 7 months ago.

Status: New
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source:
Backport:
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Tags (freeform):
Merge Commit:
Fixed In:
Released In:
Upkeep Timestamp:

Description

see https://bugzilla.redhat.com/show_bug.cgi?id=2307587

In an effort to prevent a recurrence of #2255688, I set up a full test cluster running the current (just barely pre-beta) version of F41, along with a series of scripts to bulk-provision and configure a Ceph cluster.

  • Good news: most things work -- including basic cephfs and rbd tests.
  • Bad news: ceph-mgr just instantly segfaults upon starting.
  • Potentially worse news: the stack trace looks like it's crashing within the python libraries.

Reproducible: Always

Steps to Reproduce:
1. Build three new hosts (c1, c2, and c3), install F41. Install ceph, ceph-mds, ceph-mgr, ceph-mgr-dashboard, ceph-mon, ceph-osd, ceph-radosgw.
2. Follow the build documentation for ceph (https://docs.ceph.com/en/reef/install/index_manual/) to instantiate a cluster with three mons and no other services. Rough sample:

On c1:
  1. curl -s http://10.254.101.1/ceph/ceph.conf -o /etc/ceph/ceph.conf
  2. curl -s http://10.254.101.1/ceph/ceph.client.admin.keyring -o /etc/ceph/ceph.client.admin.keyring
  3. chmod 600 /etc/ceph/ceph.client.admin.keyring
  4. curl -s http://10.254.101.1/ceph/ceph.mon.keyring -o /tmp/ceph.mon.keyring
  5. chown ceph:ceph /tmp/ceph.mon.keyring
  6. monmaptool --create --add c1 10.254.101.30 --fsid eba9a362-2d80-4860-94cc-f48b00a091bc /tmp/monmap --set-min-mon-release reef
  7. sudo -u ceph ceph-mon --mkfs -i c1 --monmap /tmp/monmap --keyring /tmp/ceph.mon.keyring
  8. systemctl enable ceph-mon@c1
  9. systemctl start ceph-mon@c1
  10. rm -f /tmp/monmap /tmp/ceph.mon.keyring
  11. ceph mon enable-msgr2
  12. ceph config set mon auth_allow_insecure_global_id_reclaim false
On c2 and c3:
  1. curl -s http://10.254.101.1/ceph/ceph.conf -o /etc/ceph/ceph.conf
  2. curl -s http://10.254.101.1/ceph/ceph.client.admin.keyring -o /etc/ceph/ceph.client.admin.keyring
  3. chmod 600 /etc/ceph/ceph.client.admin.keyring
  4. ceph auth get mon. -o /tmp/ceph-mon-keyring
  5. ceph mon getmap -o /tmp/monmap
  6. sudo -u ceph ceph-mon --mkfs -i $(hostname -s) --monmap /tmp/monmap --keyring /tmp/ceph-mon-keyring
  7. systemctl enable ceph-mon@$(hostname -s)
  8. systemctl start ceph-mon@$(hostname -s)
  9. rm -f /tmp/ceph-mon-keyring /tmp/monmap
3. Check the health of the cluster and ensure that all three mons are running and in quorum:
  1. ceph -s
    cluster:
      id: eba9a362-2d80-4860-94cc-f48b00a091bc
      health: HEALTH_WARN
              no active mgr
    services:
      mon: 3 daemons, quorum c1,c2,c3 (age 12m)
      mgr: no daemons active
4. Create the keyrings for ceph-mgr and attempt to start ceph-mgr.
  1. mkdir /var/lib/ceph/mgr/ceph-$(hostname -s)
  2. chown ceph:ceph /var/lib/ceph/mgr/ceph-$(hostname -s)
  3. chmod 0755 /var/lib/ceph/mgr/ceph-$(hostname -s)
  4. ceph auth get-or-create mgr.$(hostname -s) mon 'allow profile mgr' osd 'allow *' mds 'allow *' -o /var/lib/ceph/mgr/ceph-$(hostname -s)/keyring
  5. chown ceph:ceph /var/lib/ceph/mgr/ceph-$(hostname -s)/keyring
  6. chmod 0640 /var/lib/ceph/mgr/ceph-$(hostname -s)/keyring
  7. systemctl enable ceph-mgr@$(hostname -s)
  8. systemctl start ceph-mgr@$(hostname -s)
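
Step 4 can be generated per host rather than typed by hand. The Python sketch below only assembles the command lines from step 4 for a given short hostname (the value of `hostname -s`); it does not execute anything. The function name `mgr_bootstrap_commands` is hypothetical, and the keyring path assumes the `/var/lib/ceph/mgr/ceph-<host>` directory created in the first sub-step. Hostnames are assumed to be simple names such as c1 that need no shell quoting.

```python
import shlex


def mgr_bootstrap_commands(host: str) -> list[str]:
    """Return the step-4 shell commands for one host, in order (nothing is run)."""
    mgr_dir = f"/var/lib/ceph/mgr/ceph-{host}"
    keyring = f"{mgr_dir}/keyring"
    # Capabilities exactly as given in the report.
    caps = ["mon", "allow profile mgr", "osd", "allow *", "mds", "allow *"]
    auth = ["ceph", "auth", "get-or-create", f"mgr.{host}", *caps, "-o", keyring]
    return [
        f"mkdir {mgr_dir}",
        f"chown ceph:ceph {mgr_dir}",
        f"chmod 0755 {mgr_dir}",
        shlex.join(auth),  # quotes the capability strings for the shell
        f"chown ceph:ceph {keyring}",
        f"chmod 0640 {keyring}",
        f"systemctl enable ceph-mgr@{host}",
        f"systemctl start ceph-mgr@{host}",
    ]


if __name__ == "__main__":
    for cmd in mgr_bootstrap_commands("c1"):
        print(cmd)
```

Printing the commands instead of running them keeps the sketch safe to try on any machine; the output can be reviewed and then pasted into a root shell on the target host.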
Actual Results:
  1. systemctl start ceph-mgr@c1
  2. systemctl status ceph-mgr@c1
    - Ceph cluster manager daemon
    Loaded: loaded (/usr/lib/systemd/system/ceph-mgr@.service; enabled; preset: disabled)
    Drop-In: /usr/lib/systemd/system/service.d
    └─10-timeout-abort.conf
    Active: inactive (dead) since Fri 2024-08-23 08:46:41 MDT; 10min ago
    Duration: 457ms
    Invocation: a84e6d3cb0da4ab4b4943bdaca694212
    Process: 7847 ExecStart=/usr/bin/ceph-mgr -f --cluster ${CLUSTER} --id c1 --setuser ceph --setgroup ceph (code=dumped, signal=SEGV)
    Main PID: 7847 (code=dumped, signal=SEGV)

Aug 23 08:46:41 c1 systemd1: : Scheduled restart job, restart counter is at 3.
Aug 23 08:46:41 c1 systemd1: : Start request repeated too quickly.
Aug 23 08:46:41 c1 systemd1: : Failed with result 'core-dump'.
Aug 23 08:46:41 c1 systemd1: Failed to start - Ceph cluster manager daemon.

  1. journalctl -lfn400 -u ceph-mgr@c1
    Aug 23 08:46:20 c1 systemd1: Started - Ceph cluster manager daemon.
    Aug 23 08:46:20 c1 ceph-mgr7811: *** Caught signal (Segmentation fault) **
    Aug 23 08:46:20 c1 ceph-mgr7811: in thread 7f94759e4180 thread_name:ceph-mgr
    Aug 23 08:46:20 c1 ceph-mgr7811: ceph version 19.1.0 (9025b9024baf597d63005552b5ee004013630404) squid (rc)
    Aug 23 08:46:20 c1 ceph-mgr7811: 1: /lib64/libc.so.6(+0x19dc0) [0x7f9476627dc0]
    Aug 23 08:46:20 c1 ceph-mgr7811: 2: /lib64/libpython3.13.so.1.0(+0x28a6fb) [0x7f947808a6fb]
    Aug 23 08:46:20 c1 ceph-mgr7811: 3: /lib64/libpython3.13.so.1.0(+0x2d042) [0x7f9477e2d042]
    Aug 23 08:46:20 c1 ceph-mgr7811: 4: /lib64/libpython3.13.so.1.0(+0x254fba) [0x7f9478054fba]
    Aug 23 08:46:20 c1 ceph-mgr7811: 5: /lib64/libpython3.13.so.1.0(+0x1739cb) [0x7f9477f739cb]
    Aug 23 08:46:20 c1 ceph-mgr7811: 6: _PyEval_EvalFrameDefault()
    Aug 23 08:46:20 c1 ceph-mgr7811: 7: /lib64/libpython3.13.so.1.0(+0x1700d8) [0x7f9477f700d8]
    Aug 23 08:46:20 c1 ceph-mgr7811: 8: PyObject_CallMethodObjArgs()
    Aug 23 08:46:20 c1 ceph-mgr7811: 9: PyImport_ImportModuleLevelObject()
    Aug 23 08:46:20 c1 ceph-mgr7811: 10: _PyEval_EvalFrameDefault()
    Aug 23 08:46:20 c1 ceph-mgr7811: 11: PyEval_EvalCode()
    Aug 23 08:46:20 c1 ceph-mgr7811: 12: /lib64/libpython3.13.so.1.0(+0x236cb4) [0x7f9478036cb4]
    Aug 23 08:46:20 c1 ceph-mgr7811: 13: /lib64/libpython3.13.so.1.0(+0x168c47) [0x7f9477f68c47]
    Aug 23 08:46:20 c1 ceph-mgr7811: 14: _PyEval_EvalFrameDefault()
    Aug 23 08:46:20 c1 ceph-mgr7811: 15: /lib64/libpython3.13.so.1.0(+0x1700d8) [0x7f9477f700d8]
    Aug 23 08:46:20 c1 ceph-mgr7811: 16: PyObject_CallMethodObjArgs()
    Aug 23 08:46:20 c1 ceph-mgr7811: 17: PyImport_ImportModuleLevelObject()
    Aug 23 08:46:20 c1 ceph-mgr7811: 18: _PyEval_EvalFrameDefault()
    Aug 23 08:46:20 c1 ceph-mgr7811: 19: PyEval_EvalCode()
    Aug 23 08:46:20 c1 ceph-mgr7811: 20: /lib64/libpython3.13.so.1.0(+0x236cb4) [0x7f9478036cb4]
    Aug 23 08:46:20 c1 ceph-mgr7811: 21: /lib64/libpython3.13.so.1.0(+0x168c47) [0x7f9477f68c47]
    Aug 23 08:46:20 c1 ceph-mgr7811: 22: _PyEval_EvalFrameDefault()
    Aug 23 08:46:20 c1 ceph-mgr7811: 23: /lib64/libpython3.13.so.1.0(+0x1700d8) [0x7f9477f700d8]
    Aug 23 08:46:20 c1 ceph-mgr7811: 24: PyObject_CallMethodObjArgs()
    Aug 23 08:46:20 c1 ceph-mgr7811: 25: PyImport_ImportModuleLevelObject()
    Aug 23 08:46:20 c1 ceph-mgr7811: 26: _PyEval_EvalFrameDefault()
    Aug 23 08:46:20 c1 ceph-mgr7811: 27: PyEval_EvalCode()
    Aug 23 08:46:20 c1 ceph-mgr7811: 28: /lib64/libpython3.13.so.1.0(+0x236cb4) [0x7f9478036cb4]
    Aug 23 08:46:20 c1 ceph-mgr7811: 29: /lib64/libpython3.13.so.1.0(+0x168c47) [0x7f9477f68c47]
    Aug 23 08:46:20 c1 ceph-mgr7811: 30: _PyEval_EvalFrameDefault()
    Aug 23 08:46:20 c1 ceph-mgr7811: 31: /lib64/libpython3.13.so.1.0(+0x1700d8) [0x7f9477f700d8]
    Aug 23 08:46:20 c1 ceph-mgr7811: 2024-08-23T08:46:20.785-0600 7f94759e4180 -1 *** Caught signal (Segmentation fault) **
    Aug 23 08:46:20 c1 ceph-mgr7811: in thread 7f94759e4180 thread_name:ceph-mgr
    Aug 23 08:46:20 c1 ceph-mgr7811: ceph version 19.1.0 (9025b9024baf597d63005552b5ee004013630404) squid (rc)
    Aug 23 08:46:20 c1 ceph-mgr7811: 1: /lib64/libc.so.6(+0x19dc0) [0x7f9476627dc0]
    Aug 23 08:46:20 c1 ceph-mgr7811: 2: /lib64/libpython3.13.so.1.0(+0x28a6fb) [0x7f947808a6fb]
    Aug 23 08:46:20 c1 ceph-mgr7811: 3: /lib64/libpython3.13.so.1.0(+0x2d042) [0x7f9477e2d042]
    Aug 23 08:46:20 c1 ceph-mgr7811: 4: /lib64/libpython3.13.so.1.0(+0x254fba) [0x7f9478054fba]
    Aug 23 08:46:20 c1 ceph-mgr7811: 5: /lib64/libpython3.13.so.1.0(+0x1739cb) [0x7f9477f739cb]
    Aug 23 08:46:20 c1 ceph-mgr7811: 6: _PyEval_EvalFrameDefault()
    Aug 23 08:46:20 c1 ceph-mgr7811: 7: /lib64/libpython3.13.so.1.0(+0x1700d8) [0x7f9477f700d8]
    Aug 23 08:46:20 c1 ceph-mgr7811: 8: PyObject_CallMethodObjArgs()
    Aug 23 08:46:20 c1 ceph-mgr7811: 9: PyImport_ImportModuleLevelObject()
    Aug 23 08:46:20 c1 ceph-mgr7811: 10: _PyEval_EvalFrameDefault()
    Aug 23 08:46:20 c1 ceph-mgr7811: 11: PyEval_EvalCode()
    Aug 23 08:46:20 c1 ceph-mgr7811: 12: /lib64/libpython3.13.so.1.0(+0x236cb4) [0x7f9478036cb4]
    Aug 23 08:46:20 c1 ceph-mgr7811: 13: /lib64/libpython3.13.so.1.0(+0x168c47) [0x7f9477f68c47]
    Aug 23 08:46:20 c1 ceph-mgr7811: 14: _PyEval_EvalFrameDefault()
    Aug 23 08:46:20 c1 ceph-mgr7811: 15: /lib64/libpython3.13.so.1.0(+0x1700d8) [0x7f9477f700d8]
    Aug 23 08:46:20 c1 ceph-mgr7811: 16: PyObject_CallMethodObjArgs()
    Aug 23 08:46:20 c1 ceph-mgr7811: 17: PyImport_ImportModuleLevelObject()
    Aug 23 08:46:20 c1 ceph-mgr7811: 18: _PyEval_EvalFrameDefault()
    Aug 23 08:46:20 c1 ceph-mgr7811: 19: PyEval_EvalCode()
    Aug 23 08:46:20 c1 ceph-mgr7811: 20: /lib64/libpython3.13.so.1.0(+0x236cb4) [0x7f9478036cb4]
    Aug 23 08:46:20 c1 ceph-mgr7811: 21: /lib64/libpython3.13.so.1.0(+0x168c47) [0x7f9477f68c47]
    Aug 23 08:46:20 c1 ceph-mgr7811: 22: _PyEval_EvalFrameDefault()
    Aug 23 08:46:20 c1 ceph-mgr7811: 23: /lib64/libpython3.13.so.1.0(+0x1700d8) [0x7f9477f700d8]
    Aug 23 08:46:20 c1 ceph-mgr7811: 24: PyObject_CallMethodObjArgs()
    Aug 23 08:46:20 c1 ceph-mgr7811: 25: PyImport_ImportModuleLevelObject()
    Aug 23 08:46:20 c1 ceph-mgr7811: 26: _PyEval_EvalFrameDefault()
    Aug 23 08:46:20 c1 ceph-mgr7811: 27: PyEval_EvalCode()
    Aug 23 08:46:20 c1 ceph-mgr7811: 28: /lib64/libpython3.13.so.1.0(+0x236cb4) [0x7f9478036cb4]
    Aug 23 08:46:20 c1 ceph-mgr7811: 29: /lib64/libpython3.13.so.1.0(+0x168c47) [0x7f9477f68c47]
    Aug 23 08:46:20 c1 ceph-mgr7811: 30: _PyEval_EvalFrameDefault()
    Aug 23 08:46:20 c1 ceph-mgr7811: 31: /lib64/libpython3.13.so.1.0(+0x1700d8) [0x7f9477f700d8]
    Aug 23 08:46:20 c1 ceph-mgr7811: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
    Aug 23 08:46:20 c1 ceph-mgr7811: 0> 2024-08-23T08:46:20.785-0600 7f94759e4180 -1 *** Caught signal (Segmentation fault) **
    Aug 23 08:46:20 c1 ceph-mgr7811: in thread 7f94759e4180 thread_name:ceph-mgr
    Aug 23 08:46:20 c1 ceph-mgr7811: ceph version 19.1.0 (9025b9024baf597d63005552b5ee004013630404) squid (rc)
    Aug 23 08:46:20 c1 ceph-mgr7811: 1: /lib64/libc.so.6(+0x19dc0) [0x7f9476627dc0]
    Aug 23 08:46:20 c1 ceph-mgr7811: 2: /lib64/libpython3.13.so.1.0(+0x28a6fb) [0x7f947808a6fb]
    Aug 23 08:46:20 c1 ceph-mgr7811: 3: /lib64/libpython3.13.so.1.0(+0x2d042) [0x7f9477e2d042]
    Aug 23 08:46:20 c1 ceph-mgr7811: 4: /lib64/libpython3.13.so.1.0(+0x254fba) [0x7f9478054fba]
    Aug 23 08:46:20 c1 ceph-mgr7811: 5: /lib64/libpython3.13.so.1.0(+0x1739cb) [0x7f9477f739cb]
    Aug 23 08:46:20 c1 ceph-mgr7811: 6: _PyEval_EvalFrameDefault()
    Aug 23 08:46:20 c1 ceph-mgr7811: 7: /lib64/libpython3.13.so.1.0(+0x1700d8) [0x7f9477f700d8]
    Aug 23 08:46:20 c1 ceph-mgr7811: 8: PyObject_CallMethodObjArgs()
    Aug 23 08:46:20 c1 ceph-mgr7811: 9: PyImport_ImportModuleLevelObject()
    Aug 23 08:46:20 c1 ceph-mgr7811: 10: _PyEval_EvalFrameDefault()
    Aug 23 08:46:20 c1 ceph-mgr7811: 11: PyEval_EvalCode()
    Aug 23 08:46:20 c1 ceph-mgr7811: 12: /lib64/libpython3.13.so.1.0(+0x236cb4) [0x7f9478036cb4]
    Aug 23 08:46:20 c1 ceph-mgr7811: 13: /lib64/libpython3.13.so.1.0(+0x168c47) [0x7f9477f68c47]
    Aug 23 08:46:20 c1 ceph-mgr7811: 14: _PyEval_EvalFrameDefault()
    Aug 23 08:46:20 c1 ceph-mgr7811: 15: /lib64/libpython3.13.so.1.0(+0x1700d8) [0x7f9477f700d8]
    Aug 23 08:46:20 c1 ceph-mgr7811: 16: PyObject_CallMethodObjArgs()
    Aug 23 08:46:20 c1 ceph-mgr7811: 17: PyImport_ImportModuleLevelObject()
    Aug 23 08:46:20 c1 ceph-mgr7811: 18: _PyEval_EvalFrameDefault()
    Aug 23 08:46:20 c1 ceph-mgr7811: 19: PyEval_EvalCode()
    Aug 23 08:46:20 c1 ceph-mgr7811: 20: /lib64/libpython3.13.so.1.0(+0x236cb4) [0x7f9478036cb4]
    Aug 23 08:46:20 c1 ceph-mgr7811: 21: /lib64/libpython3.13.so.1.0(+0x168c47) [0x7f9477f68c47]
    Aug 23 08:46:20 c1 ceph-mgr7811: 22: _PyEval_EvalFrameDefault()
    Aug 23 08:46:20 c1 ceph-mgr7811: 23: /lib64/libpython3.13.so.1.0(+0x1700d8) [0x7f9477f700d8]
    Aug 23 08:46:20 c1 ceph-mgr7811: 24: PyObject_CallMethodObjArgs()
    Aug 23 08:46:20 c1 ceph-mgr7811: 25: PyImport_ImportModuleLevelObject()
    Aug 23 08:46:20 c1 ceph-mgr7811: 26: _PyEval_EvalFrameDefault()
    Aug 23 08:46:20 c1 ceph-mgr7811: 27: PyEval_EvalCode()
    Aug 23 08:46:20 c1 ceph-mgr7811: 28: /lib64/libpython3.13.so.1.0(+0x236cb4) [0x7f9478036cb4]
    Aug 23 08:46:20 c1 ceph-mgr7811: 29: /lib64/libpython3.13.so.1.0(+0x168c47) [0x7f9477f68c47]
    Aug 23 08:46:20 c1 ceph-mgr7811: 30: _PyEval_EvalFrameDefault()
    Aug 23 08:46:20 c1 ceph-mgr7811: 31: /lib64/libpython3.13.so.1.0(+0x1700d8) [0x7f9477f700d8]
    Aug 23 08:46:20 c1 ceph-mgr7811: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
    Aug 23 08:46:20 c1 ceph-mgr7811: 0> 2024-08-23T08:46:20.785-0600 7f94759e4180 -1 *** Caught signal (Segmentation fault) **
    Aug 23 08:46:20 c1 ceph-mgr7811: in thread 7f94759e4180 thread_name:ceph-mgr
    Aug 23 08:46:20 c1 ceph-mgr7811: ceph version 19.1.0 (9025b9024baf597d63005552b5ee004013630404) squid (rc)
    Aug 23 08:46:20 c1 ceph-mgr7811: 1: /lib64/libc.so.6(+0x19dc0) [0x7f9476627dc0]
    Aug 23 08:46:20 c1 ceph-mgr7811: 2: /lib64/libpython3.13.so.1.0(+0x28a6fb) [0x7f947808a6fb]
    Aug 23 08:46:20 c1 ceph-mgr7811: 3: /lib64/libpython3.13.so.1.0(+0x2d042) [0x7f9477e2d042]
    Aug 23 08:46:20 c1 ceph-mgr7811: 4: /lib64/libpython3.13.so.1.0(+0x254fba) [0x7f9478054fba]
    Aug 23 08:46:20 c1 ceph-mgr7811: 5: /lib64/libpython3.13.so.1.0(+0x1739cb) [0x7f9477f739cb]
    Aug 23 08:46:20 c1 ceph-mgr7811: 6: _PyEval_EvalFrameDefault()
    Aug 23 08:46:20 c1 ceph-mgr7811: 7: /lib64/libpython3.13.so.1.0(+0x1700d8) [0x7f9477f700d8]
    Aug 23 08:46:20 c1 ceph-mgr7811: 8: PyObject_CallMethodObjArgs()
    Aug 23 08:46:20 c1 ceph-mgr7811: 9: PyImport_ImportModuleLevelObject()
    Aug 23 08:46:20 c1 ceph-mgr7811: 10: _PyEval_EvalFrameDefault()
    Aug 23 08:46:20 c1 ceph-mgr7811: 11: PyEval_EvalCode()
    Aug 23 08:46:20 c1 ceph-mgr7811: 12: /lib64/libpython3.13.so.1.0(+0x236cb4) [0x7f9478036cb4]
    Aug 23 08:46:20 c1 ceph-mgr7811: 13: /lib64/libpython3.13.so.1.0(+0x168c47) [0x7f9477f68c47]
    Aug 23 08:46:20 c1 ceph-mgr7811: 14: _PyEval_EvalFrameDefault()
    Aug 23 08:46:20 c1 ceph-mgr7811: 15: /lib64/libpython3.13.so.1.0(+0x1700d8) [0x7f9477f700d8]
    Aug 23 08:46:20 c1 ceph-mgr7811: 16: PyObject_CallMethodObjArgs()
    Aug 23 08:46:20 c1 ceph-mgr7811: 17: PyImport_ImportModuleLevelObject()
    Aug 23 08:46:20 c1 ceph-mgr7811: 18: _PyEval_EvalFrameDefault()
    Aug 23 08:46:20 c1 ceph-mgr7811: 19: PyEval_EvalCode()
    Aug 23 08:46:20 c1 ceph-mgr7811: 20: /lib64/libpython3.13.so.1.0(+0x236cb4) [0x7f9478036cb4]
    Aug 23 08:46:20 c1 ceph-mgr7811: 21: /lib64/libpython3.13.so.1.0(+0x168c47) [0x7f9477f68c47]
    Aug 23 08:46:20 c1 ceph-mgr7811: 22: _PyEval_EvalFrameDefault()
    Aug 23 08:46:20 c1 ceph-mgr7811: 23: /lib64/libpython3.13.so.1.0(+0x1700d8) [0x7f9477f700d8]
    Aug 23 08:46:20 c1 ceph-mgr7811: 24: PyObject_CallMethodObjArgs()
    Aug 23 08:46:20 c1 ceph-mgr7811: 25: PyImport_ImportModuleLevelObject()
    Aug 23 08:46:20 c1 ceph-mgr7811: 26: _PyEval_EvalFrameDefault()
    Aug 23 08:46:20 c1 ceph-mgr7811: 27: PyEval_EvalCode()
    Aug 23 08:46:20 c1 ceph-mgr7811: 28: /lib64/libpython3.13.so.1.0(+0x236cb4) [0x7f9478036cb4]
    Aug 23 08:46:20 c1 ceph-mgr7811: 29: /lib64/libpython3.13.so.1.0(+0x168c47) [0x7f9477f68c47]
    Aug 23 08:46:20 c1 ceph-mgr7811: 30: _PyEval_EvalFrameDefault()
    Aug 23 08:46:20 c1 ceph-mgr7811: 31: /lib64/libpython3.13.so.1.0(+0x1700d8) [0x7f9477f700d8]
    Aug 23 08:46:20 c1 ceph-mgr7811: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
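
Most frames in the journal output above are unsymbolized `library(+offset) [address]` pairs. As a rough aid, the Python sketch below extracts each distinct library and offset from such lines and prints an `addr2line` command for it. The helper name `addr2line_commands` is hypothetical. The sketch assumes that matching debuginfo for libc and libpython is installed on the host and that the `+0x...` offsets can be passed to `addr2line` directly; neither has been verified against this crash.

```python
import re

# Matches e.g. "/lib64/libpython3.13.so.1.0(+0x28a6fb) [0x7f947808a6fb]"
FRAME_RE = re.compile(r"(?P<lib>/\S+)\(\+(?P<off>0x[0-9a-f]+)\) \[0x[0-9a-f]+\]")


def addr2line_commands(lines):
    """Return one addr2line command per distinct unsymbolized frame, in first-seen order."""
    seen = set()
    cmds = []
    for line in lines:
        m = FRAME_RE.search(line)
        if m is None:
            continue  # already symbolized, or not a frame line
        key = (m["lib"], m["off"])
        if key in seen:
            continue  # the same frame recurs in each nested import
        seen.add(key)
        cmds.append(f"addr2line -f -C -e {m['lib']} {m['off']}")
    return cmds


if __name__ == "__main__":
    import sys
    # Usage: journalctl -u ceph-mgr@c1 | python3 this_script.py
    for cmd in addr2line_commands(sys.stdin):
        print(cmd)
```

Deduplicating on the (library, offset) pair keeps the output short, because the same handful of libpython offsets repeat at every level of the nested imports.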
  1. cat /var/lib/ceph/crash/2024-08-23T14\:46\:31.292387Z_840d6b97-feec-467c-9311-fa076d10d498/meta
    {
    "crash_id": "2024-08-23T14:46:31.292387Z_840d6b97-feec-467c-9311-fa076d10d498",
    "timestamp": "2024-08-23T14:46:31.292387Z",
    "process_name": "ceph-mgr",
    "entity_name": "mgr.c1",
    "ceph_version": "19.1.0",
    "utsname_hostname": "c1",
    "utsname_sysname": "Linux",
    "utsname_release": "6.11.0-0.rc3.30.fc41.x86_64",
    "utsname_version": "#1 SMP PREEMPT_DYNAMIC Mon Aug 12 14:18:21 UTC 2024",
    "utsname_machine": "x86_64",
    "os_name": "Fedora Linux",
    "os_id": "fedora",
    "os_version_id": "41",
    "os_version": "41 (Forty One Prerelease)",
    "backtrace": [
    "/lib64/libc.so.6(+0x19dc0) [0x7ff204627dc0]",
    "/lib64/libpython3.13.so.1.0(+0x28a6fb) [0x7ff204a8a6fb]",
    "/lib64/libpython3.13.so.1.0(+0x2d042) [0x7ff20482d042]",
    "/lib64/libpython3.13.so.1.0(+0x254fba) [0x7ff204a54fba]",
    "/lib64/libpython3.13.so.1.0(+0x1739cb) [0x7ff2049739cb]",
    "_PyEval_EvalFrameDefault()",
    "/lib64/libpython3.13.so.1.0(+0x1700d8) [0x7ff2049700d8]",
    "PyObject_CallMethodObjArgs()",
    "PyImport_ImportModuleLevelObject()",
    "_PyEval_EvalFrameDefault()",
    "PyEval_EvalCode()",
    "/lib64/libpython3.13.so.1.0(+0x236cb4) [0x7ff204a36cb4]",
    "/lib64/libpython3.13.so.1.0(+0x168c47) [0x7ff204968c47]",
    "_PyEval_EvalFrameDefault()",
    "/lib64/libpython3.13.so.1.0(+0x1700d8) [0x7ff2049700d8]",
    "PyObject_CallMethodObjArgs()",
    "PyImport_ImportModuleLevelObject()",
    "_PyEval_EvalFrameDefault()",
    "PyEval_EvalCode()",
    "/lib64/libpython3.13.so.1.0(+0x236cb4) [0x7ff204a36cb4]",
    "/lib64/libpython3.13.so.1.0(+0x168c47) [0x7ff204968c47]",
    "_PyEval_EvalFrameDefault()",
    "/lib64/libpython3.13.so.1.0(+0x1700d8) [0x7ff2049700d8]",
    "PyObject_CallMethodObjArgs()",
    "PyImport_ImportModuleLevelObject()",
    "_PyEval_EvalFrameDefault()",
    "PyEval_EvalCode()",
    "/lib64/libpython3.13.so.1.0(+0x236cb4) [0x7ff204a36cb4]",
    "/lib64/libpython3.13.so.1.0(+0x168c47) [0x7ff204968c47]",
    "_PyEval_EvalFrameDefault()",
    "/lib64/libpython3.13.so.1.0(+0x1700d8) [0x7ff2049700d8]"
    ]
    }
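
The crash `meta` file is JSON, so it can be summarized mechanically when comparing crashes across hosts. The Python sketch below reads the fields shown above (`entity_name`, `ceph_version`, `backtrace`) and counts how many `PyImport_ImportModuleLevelObject()` frames the backtrace contains, which gives a rough idea of how deeply nested the failing import was. The function name `summarize_crash` and the keys of the returned dictionary are hypothetical.

```python
import json
from collections import Counter


def summarize_crash(meta_text: str) -> dict:
    """Summarize a Ceph crash meta document given as JSON text."""
    meta = json.loads(meta_text)
    frames = meta.get("backtrace", [])
    # "PyEval_EvalCode()" -> "PyEval_EvalCode";
    # "/lib64/libc.so.6(+0x19dc0) [...]" -> "/lib64/libc.so.6"
    kinds = Counter(frame.split("(", 1)[0] for frame in frames)
    return {
        "entity": meta.get("entity_name"),
        "ceph_version": meta.get("ceph_version"),
        "frame_count": len(frames),
        "import_depth": kinds["PyImport_ImportModuleLevelObject"],
        "top_frame": frames[0] if frames else None,
    }


if __name__ == "__main__":
    import sys
    # Usage: python3 this_script.py /var/lib/ceph/crash/<crash_id>/meta
    with open(sys.argv[1]) as fh:
        print(summarize_crash(fh.read()))
```

For the backtrace listed above, the import frame appears three times, consistent with the crash happening while a module imported by another imported module was itself being loaded.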
  1. cat /var/lib/ceph/crash/2024-08-23T14\:46\:31.292387Z_840d6b97-feec-467c-9311-fa076d10d498/log
    --- begin dump of recent events ---
    -195> 2024-08-23T08:46:31.236-0600 7ff20250d180 5 asok(0x5646a2efa000) register_command assert hook 0x5646a2e287c0
    -194> 2024-08-23T08:46:31.236-0600 7ff20250d180 5 asok(0x5646a2efa000) register_command abort hook 0x5646a2e287c0
    -193> 2024-08-23T08:46:31.236-0600 7ff20250d180 5 asok(0x5646a2efa000) register_command leak_some_memory hook 0x5646a2e287c0
    -192> 2024-08-23T08:46:31.236-0600 7ff20250d180 5 asok(0x5646a2efa000) register_command perfcounters_dump hook 0x5646a2e287c0
    -191> 2024-08-23T08:46:31.236-0600 7ff20250d180 5 asok(0x5646a2efa000) register_command 1 hook 0x5646a2e287c0
    -190> 2024-08-23T08:46:31.236-0600 7ff20250d180 5 asok(0x5646a2efa000) register_command perf dump hook 0x5646a2e287c0
    -189> 2024-08-23T08:46:31.236-0600 7ff20250d180 5 asok(0x5646a2efa000) register_command perfcounters_schema hook 0x5646a2e287c0
    -188> 2024-08-23T08:46:31.236-0600 7ff20250d180 5 asok(0x5646a2efa000) register_command perf histogram dump hook 0x5646a2e287c0
    -187> 2024-08-23T08:46:31.236-0600 7ff20250d180 5 asok(0x5646a2efa000) register_command 2 hook 0x5646a2e287c0
    -186> 2024-08-23T08:46:31.236-0600 7ff20250d180 5 asok(0x5646a2efa000) register_command perf schema hook 0x5646a2e287c0
    -185> 2024-08-23T08:46:31.236-0600 7ff20250d180 5 asok(0x5646a2efa000) register_command counter dump hook 0x5646a2e287c0
    -184> 2024-08-23T08:46:31.236-0600 7ff20250d180 5 asok(0x5646a2efa000) register_command counter schema hook 0x5646a2e287c0
    -183> 2024-08-23T08:46:31.236-0600 7ff20250d180 5 asok(0x5646a2efa000) register_command perf histogram schema hook 0x5646a2e287c0
    -182> 2024-08-23T08:46:31.236-0600 7ff20250d180 5 asok(0x5646a2efa000) register_command perf reset hook 0x5646a2e287c0
    -181> 2024-08-23T08:46:31.236-0600 7ff20250d180 5 asok(0x5646a2efa000) register_command config show hook 0x5646a2e287c0
    -180> 2024-08-23T08:46:31.236-0600 7ff20250d180 5 asok(0x5646a2efa000) register_command config help hook 0x5646a2e287c0
    -179> 2024-08-23T08:46:31.236-0600 7ff20250d180 5 asok(0x5646a2efa000) register_command config set hook 0x5646a2e287c0
    -178> 2024-08-23T08:46:31.236-0600 7ff20250d180 5 asok(0x5646a2efa000) register_command config unset hook 0x5646a2e287c0
    -177> 2024-08-23T08:46:31.236-0600 7ff20250d180 5 asok(0x5646a2efa000) register_command config get hook 0x5646a2e287c0
    -176> 2024-08-23T08:46:31.236-0600 7ff20250d180 5 asok(0x5646a2efa000) register_command config diff hook 0x5646a2e287c0
    -175> 2024-08-23T08:46:31.236-0600 7ff20250d180 5 asok(0x5646a2efa000) register_command config diff get hook 0x5646a2e287c0
    -174> 2024-08-23T08:46:31.236-0600 7ff20250d180 5 asok(0x5646a2efa000) register_command injectargs hook 0x5646a2e287c0
    -173> 2024-08-23T08:46:31.236-0600 7ff20250d180 5 asok(0x5646a2efa000) register_command log flush hook 0x5646a2e287c0
    -172> 2024-08-23T08:46:31.236-0600 7ff20250d180 5 asok(0x5646a2efa000) register_command log dump hook 0x5646a2e287c0
    -171> 2024-08-23T08:46:31.236-0600 7ff20250d180 5 asok(0x5646a2efa000) register_command log reopen hook 0x5646a2e287c0
    -170> 2024-08-23T08:46:31.236-0600 7ff20250d180 5 asok(0x5646a2efa000) register_command dump_mempools hook 0x5646a3bc6068
    -169> 2024-08-23T08:46:31.239-0600 7ff20250d180 10 monclient: get_monmap_and_config
    -168> 2024-08-23T08:46:31.239-0600 7ff20250d180 10 monclient: build_initial_monmap
    -167> 2024-08-23T08:46:31.239-0600 7ff20250d180 1 build_initial for_mkfs: 0
    -166> 2024-08-23T08:46:31.239-0600 7ff20250d180 10 monclient: monmap:
    epoch 0
    fsid eba9a362-2d80-4860-94cc-f48b00a091bc
    last_changed 2024-08-23T08:46:31.239296-0600
    created 2024-08-23T08:46:31.239296-0600
    min_mon_release 0 (unknown)
    election_strategy: 1
    0: v1:10.254.101.30:6789/0 mon.noname-a-legacy
    1: v2:10.254.101.30:3300/0 mon.noname-a

    -165> 2024-08-23T08:46:31.239-0600 7ff20250d180 5 AuthRegistry(0x5646a2e5b340) adding auth protocol: cephx
    -164> 2024-08-23T08:46:31.239-0600 7ff20250d180 5 AuthRegistry(0x5646a2e5b340) adding auth protocol: cephx
    -163> 2024-08-23T08:46:31.239-0600 7ff20250d180 5 AuthRegistry(0x5646a2e5b340) adding auth protocol: cephx
    -162> 2024-08-23T08:46:31.239-0600 7ff20250d180 5 AuthRegistry(0x5646a2e5b340) adding con mode: secure
    -161> 2024-08-23T08:46:31.239-0600 7ff20250d180 5 AuthRegistry(0x5646a2e5b340) adding con mode: crc
    -160> 2024-08-23T08:46:31.239-0600 7ff20250d180 5 AuthRegistry(0x5646a2e5b340) adding con mode: secure
    -159> 2024-08-23T08:46:31.239-0600 7ff20250d180 5 AuthRegistry(0x5646a2e5b340) adding con mode: crc
    -158> 2024-08-23T08:46:31.239-0600 7ff20250d180 5 AuthRegistry(0x5646a2e5b340) adding con mode: secure
    -157> 2024-08-23T08:46:31.239-0600 7ff20250d180 5 AuthRegistry(0x5646a2e5b340) adding con mode: crc
    -156> 2024-08-23T08:46:31.239-0600 7ff20250d180 5 AuthRegistry(0x5646a2e5b340) adding con mode: crc
    -155> 2024-08-23T08:46:31.239-0600 7ff20250d180 5 AuthRegistry(0x5646a2e5b340) adding con mode: secure
    -154> 2024-08-23T08:46:31.239-0600 7ff20250d180 5 AuthRegistry(0x5646a2e5b340) adding con mode: crc
    -153> 2024-08-23T08:46:31.239-0600 7ff20250d180 5 AuthRegistry(0x5646a2e5b340) adding con mode: secure
    -152> 2024-08-23T08:46:31.239-0600 7ff20250d180 5 AuthRegistry(0x5646a2e5b340) adding con mode: crc
    -151> 2024-08-23T08:46:31.239-0600 7ff20250d180 5 AuthRegistry(0x5646a2e5b340) adding con mode: secure
    -150> 2024-08-23T08:46:31.239-0600 7ff20250d180 2 auth: KeyRing::load: loaded key file /var/lib/ceph/mgr/ceph-c1/keyring
    -149> 2024-08-23T08:46:31.239-0600 7ff20250d180 10 monclient: init
    -148> 2024-08-23T08:46:31.239-0600 7ff20250d180 5 AuthRegistry(0x7ffe8c3c8b20) adding auth protocol: cephx
    -147> 2024-08-23T08:46:31.239-0600 7ff20250d180 5 AuthRegistry(0x7ffe8c3c8b20) adding auth protocol: cephx
    -146> 2024-08-23T08:46:31.239-0600 7ff20250d180 5 AuthRegistry(0x7ffe8c3c8b20) adding auth protocol: cephx
    -145> 2024-08-23T08:46:31.239-0600 7ff20250d180 5 AuthRegistry(0x7ffe8c3c8b20) adding con mode: secure
    -144> 2024-08-23T08:46:31.239-0600 7ff20250d180 5 AuthRegistry(0x7ffe8c3c8b20) adding con mode: crc
    -143> 2024-08-23T08:46:31.239-0600 7ff20250d180 5 AuthRegistry(0x7ffe8c3c8b20) adding con mode: secure
    -142> 2024-08-23T08:46:31.239-0600 7ff20250d180 5 AuthRegistry(0x7ffe8c3c8b20) adding con mode: crc
    -141> 2024-08-23T08:46:31.239-0600 7ff20250d180 5 AuthRegistry(0x7ffe8c3c8b20) adding con mode: secure
    -140> 2024-08-23T08:46:31.239-0600 7ff20250d180 5 AuthRegistry(0x7ffe8c3c8b20) adding con mode: crc
    -139> 2024-08-23T08:46:31.239-0600 7ff20250d180 5 AuthRegistry(0x7ffe8c3c8b20) adding con mode: crc
    -138> 2024-08-23T08:46:31.239-0600 7ff20250d180 5 AuthRegistry(0x7ffe8c3c8b20) adding con mode: secure
    -137> 2024-08-23T08:46:31.239-0600 7ff20250d180 5 AuthRegistry(0x7ffe8c3c8b20) adding con mode: crc
    -136> 2024-08-23T08:46:31.239-0600 7ff20250d180 5 AuthRegistry(0x7ffe8c3c8b20) adding con mode: secure
    -135> 2024-08-23T08:46:31.239-0600 7ff20250d180 5 AuthRegistry(0x7ffe8c3c8b20) adding con mode: crc
    -134> 2024-08-23T08:46:31.239-0600 7ff20250d180 5 AuthRegistry(0x7ffe8c3c8b20) adding con mode: secure
    -133> 2024-08-23T08:46:31.239-0600 7ff20250d180 2 auth: KeyRing::load: loaded key file /var/lib/ceph/mgr/ceph-c1/keyring
    -132> 2024-08-23T08:46:31.239-0600 7ff20250d180 2 auth: KeyRing::load: loaded key file /var/lib/ceph/mgr/ceph-c1/keyring
    -131> 2024-08-23T08:46:31.239-0600 7ff20250d180 5 asok(0x5646a2efa000) register_command rotate-key hook 0x7ffe8c3c8c68
    -130> 2024-08-23T08:46:31.239-0600 7ff20250d180 10 monclient: _reopen_session rank -1
    -129> 2024-08-23T08:46:31.239-0600 7ff20250d180 10 monclient: _add_conns ranks=[1,0]
    -128> 2024-08-23T08:46:31.239-0600 7ff20250d180 10 monclient(hunting): picked mon.noname-a con 0x5646a2e35000 addr v2:10.254.101.30:3300/0
    -127> 2024-08-23T08:46:31.239-0600 7ff20250d180 10 monclient(hunting): picked mon.noname-a-legacy con 0x5646a2e35400 addr v1:10.254.101.30:6789/0
    -126> 2024-08-23T08:46:31.239-0600 7ff20250d180 10 monclient(hunting): start opening mon connection
    -125> 2024-08-23T08:46:31.239-0600 7ff20250d180 10 monclient(hunting): _renew_subs
    -124> 2024-08-23T08:46:31.239-0600 7ff20250d180 10 monclient(hunting): authenticate will time out at 594.069698s
    -123> 2024-08-23T08:46:31.239-0600 7ff2006006c0 10 monclient(hunting): get_auth_request con 0x5646a2e35000 auth_method 0
    -122> 2024-08-23T08:46:31.239-0600 7ff2006006c0 10 monclient(hunting): get_auth_request method 2 preferred_modes [2,1]
    -121> 2024-08-23T08:46:31.239-0600 7ff2006006c0 10 monclient(hunting): _init_auth method 2
    -120> 2024-08-23T08:46:31.239-0600 7ff2006006c0 10 monclient(hunting): _init_auth creating new auth
    -119> 2024-08-23T08:46:31.239-0600 7ff2006006c0 10 monclient(hunting): handle_auth_reply_more payload 9
    -118> 2024-08-23T08:46:31.239-0600 7ff2006006c0 10 monclient(hunting): handle_auth_reply_more payload_len 9
    -117> 2024-08-23T08:46:31.239-0600 7ff2006006c0 10 monclient(hunting): handle_auth_reply_more responding with 36 bytes
    -116> 2024-08-23T08:46:31.240-0600 7ff2006006c0 10 monclient(hunting): handle_auth_done global_id 14571 payload 274
    -115> 2024-08-23T08:46:31.240-0600 7ff2006006c0 10 monclient: _finish_hunting 0
    -114> 2024-08-23T08:46:31.240-0600 7ff2006006c0 1 monclient: found mon.noname-a
    -113> 2024-08-23T08:46:31.240-0600 7ff2006006c0 10 monclient: _send_mon_message to mon.noname-a at v2:10.254.101.30:3300/0
    -112> 2024-08-23T08:46:31.240-0600 7ff1ff2006c0 10 monclient: discarding stray monitor message auth_reply(proto 2 0 (0) Success)
    -111> 2024-08-23T08:46:31.241-0600 7ff1ff2006c0 10 monclient: handle_monmap mon_map magic: 0
    -110> 2024-08-23T08:46:31.241-0600 7ff1ff2006c0 10 monclient: got monmap 4 from mon.noname-a (according to old e4)
    -109> 2024-08-23T08:46:31.241-0600 7ff1ff2006c0 10 monclient: dump:
    epoch 4
    fsid eba9a362-2d80-4860-94cc-f48b00a091bc
    last_changed 2024-08-23T08:42:20.033508-0600
    created 2024-08-23T08:42:10.072058-0600
    min_mon_release 19 (squid)
    election_strategy: 1
    0: [v2:10.254.101.30:3300/0,v1:10.254.101.30:6789/0] mon.c1
    1: [v2:10.254.101.31:3300/0,v1:10.254.101.31:6789/0] mon.c2
    2: [v2:10.254.101.32:3300/0,v1:10.254.101.32:6789/0] mon.c3

    -108> 2024-08-23T08:46:31.241-0600 7ff1ff2006c0 10 monclient: _finish_auth 0
    -107> 2024-08-23T08:46:31.241-0600 7ff1ff2006c0 10 monclient: _check_auth_tickets
    -106> 2024-08-23T08:46:31.241-0600 7ff1ff2006c0 10 monclient: _check_auth_rotating renewing rotating keys (they expired before 2024-08-23T08:46:01.241590-0600)
    -105> 2024-08-23T08:46:31.241-0600 7ff1ff2006c0 10 monclient: _send_mon_message to mon.c1 at v2:10.254.101.30:3300/0
    -104> 2024-08-23T08:46:31.241-0600 7ff1ff2006c0 10 monclient: handle_config config(0 keys)
    -103> 2024-08-23T08:46:31.241-0600 7ff1ff2006c0 10 monclient: handle_monmap mon_map magic: 0
    -102> 2024-08-23T08:46:31.241-0600 7ff1ff2006c0 10 monclient: got monmap 4 from mon.c1 (according to old e4)
    -101> 2024-08-23T08:46:31.241-0600 7ff1ff2006c0 10 monclient: dump:
    epoch 4
    fsid eba9a362-2d80-4860-94cc-f48b00a091bc
    last_changed 2024-08-23T08:42:20.033508-0600
    created 2024-08-23T08:42:10.072058-0600
    min_mon_release 19 (squid)
    election_strategy: 1
    0: [v2:10.254.101.30:3300/0,v1:10.254.101.30:6789/0] mon.c1
    1: [v2:10.254.101.31:3300/0,v1:10.254.101.31:6789/0] mon.c2
    2: [v2:10.254.101.32:3300/0,v1:10.254.101.32:6789/0] mon.c3
    -100> 2024-08-23T08:46:31.241-0600 7ff20250d180  5 monclient: authenticate success, global_id 14571
    -99> 2024-08-23T08:46:31.241-0600 7ff20250d180 10 monclient: get_monmap_and_config success
    -98> 2024-08-23T08:46:31.241-0600 7ff20250d180 4 set_mon_vals no callback set
    -97> 2024-08-23T08:46:31.241-0600 7ff1ff2006c0 10 monclient: _finish_auth 0
    -96> 2024-08-23T08:46:31.241-0600 7ff1ff2006c0 10 monclient: _check_auth_tickets
    -95> 2024-08-23T08:46:31.241-0600 7ff1ff2006c0 10 monclient: _check_auth_rotating have uptodate secrets (they expire after 2024-08-23T08:46:01.241691-0600)
    -94> 2024-08-23T08:46:31.241-0600 7ff20250d180 10 monclient: shutdown
    -93> 2024-08-23T08:46:31.241-0600 7ff20250d180 5 asok(0x5646a2efa000) unregister_commands rotate-key
    -92> 2024-08-23T08:46:31.241-0600 7ff20250d180 0 set uid:gid to 167:167 (ceph:ceph)
    -91> 2024-08-23T08:46:31.242-0600 7ff20250d180 0 ceph version 19.1.0 (9025b9024baf597d63005552b5ee004013630404) squid (rc), process ceph-mgr, pid 7847
    -90> 2024-08-23T08:46:31.242-0600 7ff20250d180 0 pidfile_write: ignore empty --pid-file
    -89> 2024-08-23T08:46:31.242-0600 7ff20250d180 5 asok(0x5646a2efa000) init /var/run/ceph/ceph-mgr.c1.asok
    -88> 2024-08-23T08:46:31.242-0600 7ff20250d180 5 asok(0x5646a2efa000) bind_and_listen /var/run/ceph/ceph-mgr.c1.asok
    -87> 2024-08-23T08:46:31.242-0600 7ff20250d180 5 asok(0x5646a2efa000) register_command 0 hook 0x5646a2ead758
    -86> 2024-08-23T08:46:31.242-0600 7ff20250d180 5 asok(0x5646a2efa000) register_command version hook 0x5646a2ead758
    -85> 2024-08-23T08:46:31.242-0600 7ff20250d180 5 asok(0x5646a2efa000) register_command git_version hook 0x5646a2ead758
    -84> 2024-08-23T08:46:31.242-0600 7ff20250d180 5 asok(0x5646a2efa000) register_command help hook 0x5646a2e28bb0
    -83> 2024-08-23T08:46:31.242-0600 7ff20250d180 5 asok(0x5646a2efa000) register_command get_command_descriptions hook 0x5646a2e28c20
    -82> 2024-08-23T08:46:31.242-0600 7ff20250d180 5 asok(0x5646a2efa000) register_command raise hook 0x5646a3bd8390
    -81> 2024-08-23T08:46:31.242-0600 7ff1ff2006c0 5 asok(0x5646a2efa000) entry start
    -80> 2024-08-23T08:46:31.242-0600 7ff20250d180 5 AuthRegistry(0x5646a2e5b340) adding auth protocol: cephx
    -79> 2024-08-23T08:46:31.242-0600 7ff20250d180 5 AuthRegistry(0x5646a2e5b340) adding auth protocol: cephx
    -78> 2024-08-23T08:46:31.242-0600 7ff20250d180 5 AuthRegistry(0x5646a2e5b340) adding auth protocol: cephx
    -77> 2024-08-23T08:46:31.242-0600 7ff20250d180 5 AuthRegistry(0x5646a2e5b340) adding con mode: secure
    -76> 2024-08-23T08:46:31.242-0600 7ff20250d180 5 AuthRegistry(0x5646a2e5b340) adding con mode: crc
    -75> 2024-08-23T08:46:31.242-0600 7ff20250d180 5 AuthRegistry(0x5646a2e5b340) adding con mode: secure
    -74> 2024-08-23T08:46:31.242-0600 7ff20250d180 5 AuthRegistry(0x5646a2e5b340) adding con mode: crc
    -73> 2024-08-23T08:46:31.242-0600 7ff20250d180 5 AuthRegistry(0x5646a2e5b340) adding con mode: secure
    -72> 2024-08-23T08:46:31.242-0600 7ff20250d180 5 AuthRegistry(0x5646a2e5b340) adding con mode: crc
    -71> 2024-08-23T08:46:31.242-0600 7ff20250d180 5 AuthRegistry(0x5646a2e5b340) adding con mode: crc
    -70> 2024-08-23T08:46:31.242-0600 7ff20250d180 5 AuthRegistry(0x5646a2e5b340) adding con mode: secure
    -69> 2024-08-23T08:46:31.242-0600 7ff20250d180 5 AuthRegistry(0x5646a2e5b340) adding con mode: crc
    -68> 2024-08-23T08:46:31.242-0600 7ff20250d180 5 AuthRegistry(0x5646a2e5b340) adding con mode: secure
    -67> 2024-08-23T08:46:31.242-0600 7ff20250d180 5 AuthRegistry(0x5646a2e5b340) adding con mode: crc
    -66> 2024-08-23T08:46:31.242-0600 7ff20250d180 5 AuthRegistry(0x5646a2e5b340) adding con mode: secure
    -65> 2024-08-23T08:46:31.242-0600 7ff20250d180 2 auth: KeyRing::load: loaded key file /var/lib/ceph/mgr/ceph-c1/keyring
    -64> 2024-08-23T08:46:31.242-0600 7ff20250d180 10 monclient: build_initial_monmap
    -63> 2024-08-23T08:46:31.242-0600 7ff20250d180 1 build_initial for_mkfs: 0
    -62> 2024-08-23T08:46:31.242-0600 7ff20250d180 10 monclient: monmap:
    epoch 0
    fsid 00000000-0000-0000-0000-000000000000
    last_changed 0.000000
    created 0.000000
    min_mon_release 0 (unknown)
    election_strategy: 1
    0: [v2:10.254.101.30:3300/0,v1:10.254.101.30:6789/0] mon.noname-a
    1: [v2:10.254.101.31:3300/0,v1:10.254.101.31:6789/0] mon.noname-b
    2: [v2:10.254.101.32:3300/0,v1:10.254.101.32:6789/0] mon.noname-c
    -61> 2024-08-23T08:46:31.242-0600 7ff20250d180  4 mgr init Registered monc callback
    -60> 2024-08-23T08:46:31.242-0600 7ff20250d180 10 monclient: init
    -59> 2024-08-23T08:46:31.242-0600 7ff20250d180 5 AuthRegistry(0x7ffe8c3ca3d8) adding auth protocol: cephx
    -58> 2024-08-23T08:46:31.242-0600 7ff20250d180 5 AuthRegistry(0x7ffe8c3ca3d8) adding auth protocol: cephx
    -57> 2024-08-23T08:46:31.242-0600 7ff20250d180 5 AuthRegistry(0x7ffe8c3ca3d8) adding auth protocol: cephx
    -56> 2024-08-23T08:46:31.242-0600 7ff20250d180 5 AuthRegistry(0x7ffe8c3ca3d8) adding con mode: secure
    -55> 2024-08-23T08:46:31.242-0600 7ff20250d180 5 AuthRegistry(0x7ffe8c3ca3d8) adding con mode: crc
    -54> 2024-08-23T08:46:31.242-0600 7ff20250d180 5 AuthRegistry(0x7ffe8c3ca3d8) adding con mode: secure
    -53> 2024-08-23T08:46:31.242-0600 7ff20250d180 5 AuthRegistry(0x7ffe8c3ca3d8) adding con mode: crc
    -52> 2024-08-23T08:46:31.242-0600 7ff20250d180 5 AuthRegistry(0x7ffe8c3ca3d8) adding con mode: secure
    -51> 2024-08-23T08:46:31.242-0600 7ff20250d180 5 AuthRegistry(0x7ffe8c3ca3d8) adding con mode: crc
    -50> 2024-08-23T08:46:31.242-0600 7ff20250d180 5 AuthRegistry(0x7ffe8c3ca3d8) adding con mode: crc
    -49> 2024-08-23T08:46:31.242-0600 7ff20250d180 5 AuthRegistry(0x7ffe8c3ca3d8) adding con mode: secure
    -48> 2024-08-23T08:46:31.242-0600 7ff20250d180 5 AuthRegistry(0x7ffe8c3ca3d8) adding con mode: crc
    -47> 2024-08-23T08:46:31.242-0600 7ff20250d180 5 AuthRegistry(0x7ffe8c3ca3d8) adding con mode: secure
    -46> 2024-08-23T08:46:31.242-0600 7ff20250d180 5 AuthRegistry(0x7ffe8c3ca3d8) adding con mode: crc
    -45> 2024-08-23T08:46:31.242-0600 7ff20250d180 5 AuthRegistry(0x7ffe8c3ca3d8) adding con mode: secure
    -44> 2024-08-23T08:46:31.242-0600 7ff20250d180 2 auth: KeyRing::load: loaded key file /var/lib/ceph/mgr/ceph-c1/keyring
    -43> 2024-08-23T08:46:31.242-0600 7ff20250d180 2 auth: KeyRing::load: loaded key file /var/lib/ceph/mgr/ceph-c1/keyring
    -42> 2024-08-23T08:46:31.242-0600 7ff20250d180 5 asok(0x5646a2efa000) register_command rotate-key hook 0x7ffe8c3ca520
    -41> 2024-08-23T08:46:31.242-0600 7ff20250d180 10 monclient: _reopen_session rank -1
    -40> 2024-08-23T08:46:31.242-0600 7ff20250d180 10 monclient: _add_conns ranks=[1,0,2]
    -39> 2024-08-23T08:46:31.242-0600 7ff20250d180 10 monclient(hunting): picked mon.noname-b con 0x5646a2e35000 addr [v2:10.254.101.31:3300/0,v1:10.254.101.31:6789/0]
    -38> 2024-08-23T08:46:31.242-0600 7ff20250d180 10 monclient(hunting): picked mon.noname-a con 0x5646a2e35400 addr [v2:10.254.101.30:3300/0,v1:10.254.101.30:6789/0]
    -37> 2024-08-23T08:46:31.242-0600 7ff20250d180 10 monclient(hunting): picked mon.noname-c con 0x5646a3c36c00 addr [v2:10.254.101.32:3300/0,v1:10.254.101.32:6789/0]
    -36> 2024-08-23T08:46:31.242-0600 7ff20250d180 10 monclient(hunting): start opening mon connection
    -35> 2024-08-23T08:46:31.242-0600 7ff20250d180 10 monclient(hunting): start opening mon connection
    -34> 2024-08-23T08:46:31.242-0600 7ff20250d180 10 monclient(hunting): start opening mon connection
    -33> 2024-08-23T08:46:31.242-0600 7ff20250d180 10 monclient(hunting): _renew_subs
    -32> 2024-08-23T08:46:31.242-0600 7ff1ffc006c0 10 monclient(hunting): get_auth_request con 0x5646a2e35400 auth_method 0
    -31> 2024-08-23T08:46:31.242-0600 7ff1ffc006c0 10 monclient(hunting): get_auth_request method 2 preferred_modes [2,1]
    -30> 2024-08-23T08:46:31.242-0600 7ff1ffc006c0 10 monclient(hunting): _init_auth method 2
    -29> 2024-08-23T08:46:31.242-0600 7ff1ffc006c0 10 monclient(hunting): _init_auth creating new auth
    -28> 2024-08-23T08:46:31.242-0600 7ff1ffc006c0 10 monclient(hunting): handle_auth_reply_more payload 9
    -27> 2024-08-23T08:46:31.242-0600 7ff1ffc006c0 10 monclient(hunting): handle_auth_reply_more payload_len 9
    -26> 2024-08-23T08:46:31.242-0600 7ff1ffc006c0 10 monclient(hunting): handle_auth_reply_more responding with 36 bytes
    -25> 2024-08-23T08:46:31.242-0600 7ff1ffc006c0 10 monclient(hunting): handle_auth_done global_id 14577 payload 995
    -24> 2024-08-23T08:46:31.243-0600 7ff1ffc006c0 10 monclient: _finish_hunting 0
    -23> 2024-08-23T08:46:31.243-0600 7ff1ffc006c0 1 monclient: found mon.noname-a
    -22> 2024-08-23T08:46:31.243-0600 7ff1ffc006c0 10 monclient: _send_mon_message to mon.noname-a at v2:10.254.101.30:3300/0
    -21> 2024-08-23T08:46:31.243-0600 7ff2010006c0 10 monclient: get_auth_request con 0x5646a3c36c00 auth_method 0
    -20> 2024-08-23T08:46:31.243-0600 7ff1fc0006c0 10 monclient: handle_monmap mon_map magic: 0
    -19> 2024-08-23T08:46:31.243-0600 7ff1fc0006c0 10 monclient: got monmap 4 from mon.noname-a (according to old e4)
    -18> 2024-08-23T08:46:31.243-0600 7ff1fc0006c0 10 monclient: dump:
    epoch 4
    fsid eba9a362-2d80-4860-94cc-f48b00a091bc
    last_changed 2024-08-23T08:42:20.033508-0600
    created 2024-08-23T08:42:10.072058-0600
    min_mon_release 19 (squid)
    election_strategy: 1
    0: [v2:10.254.101.30:3300/0,v1:10.254.101.30:6789/0] mon.c1
    1: [v2:10.254.101.31:3300/0,v1:10.254.101.31:6789/0] mon.c2
    2: [v2:10.254.101.32:3300/0,v1:10.254.101.32:6789/0] mon.c3
    -17> 2024-08-23T08:46:31.243-0600 7ff1fc0006c0 10 monclient: _finish_auth 0
    -16> 2024-08-23T08:46:31.243-0600 7ff1fc0006c0 10 monclient: _check_auth_tickets
    -15> 2024-08-23T08:46:31.243-0600 7ff1fc0006c0 10 monclient: _check_auth_rotating renewing rotating keys (they expired before 2024-08-23T08:46:01.243664-0600)
    -14> 2024-08-23T08:46:31.243-0600 7ff1fc0006c0 10 monclient: _send_mon_message to mon.c1 at v2:10.254.101.30:3300/0
    -13> 2024-08-23T08:46:31.243-0600 7ff1fc0006c0 10 monclient: handle_config config(0 keys)
    -12> 2024-08-23T08:46:31.243-0600 7ff20250d180 5 monclient: authenticate success, global_id 14577
    -11> 2024-08-23T08:46:31.243-0600 7ff20250d180 10 log_channel(cluster) update_config to_monitors: true to_syslog: false syslog_facility: prio: info to_graylog: false graylog_host: 127.0.0.1 graylog_port: 12201)
    -10> 2024-08-23T08:46:31.243-0600 7ff20250d180 10 log_channel(audit) update_config to_monitors: true to_syslog: false syslog_facility: prio: info to_graylog: false graylog_host: 127.0.0.1 graylog_port: 12201)
    -9> 2024-08-23T08:46:31.243-0600 7ff20250d180 5 asok(0x5646a2efa000) register_command objecter_requests hook 0x5646a3c3b410
    -8> 2024-08-23T08:46:31.243-0600 7ff20250d180 10 monclient: _renew_subs
    -7> 2024-08-23T08:46:31.243-0600 7ff20250d180 10 monclient: _send_mon_message to mon.c1 at v2:10.254.101.30:3300/0
    -6> 2024-08-23T08:46:31.243-0600 7ff20250d180 5 asok(0x5646a2efa000) register_command mds_requests hook 0x7ffe8c3cbb90
    -5> 2024-08-23T08:46:31.243-0600 7ff20250d180 5 asok(0x5646a2efa000) register_command mds_sessions hook 0x7ffe8c3cbb90
    -4> 2024-08-23T08:46:31.243-0600 7ff20250d180 5 asok(0x5646a2efa000) register_command dump_cache hook 0x7ffe8c3cbb90
    -3> 2024-08-23T08:46:31.243-0600 7ff20250d180 5 asok(0x5646a2efa000) register_command kick_stale_sessions hook 0x7ffe8c3cbb90
    -2> 2024-08-23T08:46:31.243-0600 7ff20250d180 5 asok(0x5646a2efa000) register_command status hook 0x7ffe8c3cbb90
    -1> 2024-08-23T08:46:31.250-0600 7ff20250d180 1 mgr[py] Loading python module 'alerts'
    0> 2024-08-23T08:46:31.292-0600 7ff20250d180 -1 *** Caught signal (Segmentation fault) **
    in thread 7ff20250d180 thread_name:ceph-mgr

    ceph version 19.1.0 (9025b9024baf597d63005552b5ee004013630404) squid (rc)
    1: /lib64/libc.so.6(+0x19dc0) [0x7ff204627dc0]
    2: /lib64/libpython3.13.so.1.0(+0x28a6fb) [0x7ff204a8a6fb]
    3: /lib64/libpython3.13.so.1.0(+0x2d042) [0x7ff20482d042]
    4: /lib64/libpython3.13.so.1.0(+0x254fba) [0x7ff204a54fba]
    5: /lib64/libpython3.13.so.1.0(+0x1739cb) [0x7ff2049739cb]
    6: _PyEval_EvalFrameDefault()
    7: /lib64/libpython3.13.so.1.0(+0x1700d8) [0x7ff2049700d8]
    8: PyObject_CallMethodObjArgs()
    9: PyImport_ImportModuleLevelObject()
    10: _PyEval_EvalFrameDefault()
    11: PyEval_EvalCode()
    12: /lib64/libpython3.13.so.1.0(+0x236cb4) [0x7ff204a36cb4]
    13: /lib64/libpython3.13.so.1.0(+0x168c47) [0x7ff204968c47]
    14: _PyEval_EvalFrameDefault()
    15: /lib64/libpython3.13.so.1.0(+0x1700d8) [0x7ff2049700d8]
    16: PyObject_CallMethodObjArgs()
    17: PyImport_ImportModuleLevelObject()
    18: _PyEval_EvalFrameDefault()
    19: PyEval_EvalCode()
    20: /lib64/libpython3.13.so.1.0(+0x236cb4) [0x7ff204a36cb4]
    21: /lib64/libpython3.13.so.1.0(+0x168c47) [0x7ff204968c47]
    22: _PyEval_EvalFrameDefault()
    23: /lib64/libpython3.13.so.1.0(+0x1700d8) [0x7ff2049700d8]
    24: PyObject_CallMethodObjArgs()
    25: PyImport_ImportModuleLevelObject()
    26: _PyEval_EvalFrameDefault()
    27: PyEval_EvalCode()
    28: /lib64/libpython3.13.so.1.0(+0x236cb4) [0x7ff204a36cb4]
    29: /lib64/libpython3.13.so.1.0(+0x168c47) [0x7ff204968c47]
    30: _PyEval_EvalFrameDefault()
    31: /lib64/libpython3.13.so.1.0(+0x1700d8) [0x7ff2049700d8]
    NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- logging levels ---
0/ 5 none
0/ 1 lockdep
0/ 1 context
1/ 1 crush
1/ 5 mds
1/ 5 mds_balancer
1/ 5 mds_locker
1/ 5 mds_log
1/ 5 mds_log_expire
1/ 5 mds_migrator
3/ 5 mds_quiesce
0/ 1 buffer
0/ 1 timer
0/ 1 filer
0/ 1 striper
0/ 1 objecter
0/ 5 rados
0/ 5 rbd
0/ 5 rbd_mirror
0/ 5 rbd_replay
0/ 5 rbd_pwl
0/ 5 journaler
0/ 5 objectcacher
0/ 5 immutable_obj_cache
0/ 5 client
1/ 5 osd
0/ 5 optracker
0/ 5 objclass
1/ 3 filestore
1/ 3 journal
0/ 0 ms
1/ 5 mon
0/10 monc
1/ 5 paxos
0/ 5 tp
1/ 5 auth
1/ 5 crypto
1/ 1 finisher
1/ 1 reserver
1/ 5 heartbeatmap
1/ 5 perfcounter
1/ 5 rgw
1/ 5 rgw_sync
1/ 5 rgw_datacache
1/ 5 rgw_access
1/ 5 rgw_dbstore
1/ 5 rgw_flight
1/ 5 rgw_lifecycle
1/ 5 javaclient
1/ 5 asok
1/ 1 throttle
0/ 0 refs
1/ 5 compressor
1/ 5 bluestore
1/ 5 bluefs
1/ 3 bdev
1/ 5 kstore
4/ 5 rocksdb
1/ 5 fuse
2/ 5 mgr
1/ 5 mgrc
1/ 5 dpdk
1/ 5 eventtrace
1/ 5 prioritycache
0/ 5 test
0/ 5 cephfs_mirror
0/ 5 cephsqlite
0/ 5 crimson_interrupt
0/ 5 seastore
0/ 5 seastore_onode
0/ 5 seastore_odata
0/ 5 seastore_omap
0/ 5 seastore_tm
0/ 5 seastore_t
0/ 5 seastore_cleaner
0/ 5 seastore_epm
0/ 5 seastore_lba
0/ 5 seastore_fixedkv_tree
0/ 5 seastore_cache
0/ 5 seastore_journal
0/ 5 seastore_device
0/ 5 seastore_backref
0/ 5 alienstore
1/ 5 mclock
0/ 5 cyanstore
1/ 5 ceph_exporter
1/ 5 memstore
1/ 5 trace
2/-2 (syslog threshold)
-1/-1 (stderr threshold)
--- pthread ID / name mapping for recent threads ---
7ff1fc0006c0 / ms_dispatch
7ff1ff2006c0 / admin_socket
7ff1ffc006c0 / msgr-worker-2
7ff2006006c0 / msgr-worker-1
7ff2010006c0 / msgr-worker-0
7ff20250d180 / ceph-mgr
max_recent 10000
max_new 1000
log_file /var/lib/ceph/crash/2024-08-23T14:46:31.292387Z_840d6b97-feec-467c-9311-fa076d10d498/log
--- end dump of recent events ---
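
As a reading aid for the backtrace in the dump above, here is a minimal Python sketch that tallies frames per shared object. It is an editorial addition, not part of the original report: the `summarize` helper and both regular expressions are illustrative names, the frame format is assumed to match the dump, and only the first nine frames are copied in for brevity.

```python
import re
from collections import Counter

# First nine frames copied from the crash dump above.
BACKTRACE = """\
1: /lib64/libc.so.6(+0x19dc0) [0x7ff204627dc0]
2: /lib64/libpython3.13.so.1.0(+0x28a6fb) [0x7ff204a8a6fb]
3: /lib64/libpython3.13.so.1.0(+0x2d042) [0x7ff20482d042]
4: /lib64/libpython3.13.so.1.0(+0x254fba) [0x7ff204a54fba]
5: /lib64/libpython3.13.so.1.0(+0x1739cb) [0x7ff2049739cb]
6: _PyEval_EvalFrameDefault()
7: /lib64/libpython3.13.so.1.0(+0x1700d8) [0x7ff2049700d8]
8: PyObject_CallMethodObjArgs()
9: PyImport_ImportModuleLevelObject()
"""

# "N: /path/lib.so(+0xOFFSET) [0xADDR]": frame known only by library and offset.
UNRESOLVED = re.compile(r"^\s*(\d+): (\S+?)\(\+0x[0-9a-f]+\) \[0x[0-9a-f]+\]$")
# "N: symbol()": frame that was resolved to a symbol name.
RESOLVED = re.compile(r"^\s*(\d+): (\w+)\(\)$")


def summarize(trace):
    """Count unresolved frames per library and resolved frames per symbol."""
    libs = Counter()
    symbols = Counter()
    for line in trace.splitlines():
        match = UNRESOLVED.match(line)
        if match:
            libs[match.group(2)] += 1
            continue
        match = RESOLVED.match(line)
        if match:
            symbols[match.group(2)] += 1
    return libs, symbols


libs, symbols = summarize(BACKTRACE)
for name, count in libs.most_common():
    print(f"{count:2d} unresolved frame(s) in {name}")
for name, count in symbols.most_common():
    print(f"{count:2d} resolved frame(s): {name}")
```

On this excerpt, most unresolved frames sit in `libpython3.13.so.1.0`, and the resolved `PyImport_ImportModuleLevelObject` frame indicates that a Python import was in progress when the signal arrived.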

Expected Results:
ceph-mgr should start managing the cluster instead of crashing.

    # rpm -qa | grep ceph-mgr
    ceph-mgr-modules-core-19.1.0-0.6.fc41.noarch
    ceph-mgr-19.1.0-0.6.fc41.x86_64
    ceph-mgr-dashboard-19.1.0-0.6.fc41.noarch
    ceph-mgr-cephadm-19.1.0-0.6.fc41.noarch
    ceph-mgr-diskprediction-local-19.1.0-0.6.fc41.noarch
    ceph-mgr-k8sevents-19.1.0-0.6.fc41.noarch
    ceph-mgr-rook-19.1.0-0.6.fc41.noarch
    [root@c1 mgr]# rpm -qa | grep ceph
    libcephfs2-19.1.0-0.6.fc41.x86_64
    libcephsqlite-19.1.0-0.6.fc41.x86_64
    python3-ceph-common-19.1.0-0.6.fc41.x86_64
    python3-ceph-argparse-19.1.0-0.6.fc41.x86_64
    python3-cephfs-19.1.0-0.6.fc41.x86_64
    cephadm-19.1.0-0.6.fc41.noarch
    ceph-mgr-modules-core-19.1.0-0.6.fc41.noarch
    ceph-common-19.1.0-0.6.fc41.x86_64
    ceph-base-19.1.0-0.6.fc41.x86_64
    ceph-selinux-19.1.0-0.6.fc41.x86_64
    ceph-mgr-19.1.0-0.6.fc41.x86_64
    ceph-osd-19.1.0-0.6.fc41.x86_64
    ceph-mds-19.1.0-0.6.fc41.x86_64
    ceph-mon-19.1.0-0.6.fc41.x86_64
    ceph-prometheus-alerts-19.1.0-0.6.fc41.noarch
    ceph-grafana-dashboards-19.1.0-0.6.fc41.noarch
    ceph-mgr-dashboard-19.1.0-0.6.fc41.noarch
    ceph-19.1.0-0.6.fc41.x86_64
    ceph-volume-19.1.0-0.6.fc41.noarch
    ceph-mgr-cephadm-19.1.0-0.6.fc41.noarch
    ceph-mgr-diskprediction-local-19.1.0-0.6.fc41.noarch
    ceph-mgr-k8sevents-19.1.0-0.6.fc41.noarch
    ceph-mgr-rook-19.1.0-0.6.fc41.noarch
    ceph-radosgw-19.1.0-0.6.fc41.x86_64

For reference, the config files pulled down to configure ceph:
(note: all keys are test keys that are not in use anywhere aside from this test environment)

    # curl -s http://10.254.101.1/ceph/ceph.conf
    [global]
    fsid = eba9a362-2d80-4860-94cc-f48b00a091bc
    mon_initial_members = c1
    mon_host = 10.254.101.30
    auth_cluster_required = cephx
    auth_service_required = cephx
    auth_client_required = cephx
    osd_pool_default_min_size = 1
    public_network = 10.254.101.0/24
    # curl -s http://10.254.101.1/ceph/ceph.client.admin.keyring
    [client.admin]
    key = AQD/B8hm4tU+NxAA7rIUofLOYhAtLCPnqyxUXw==
    caps mds = "allow *"
    caps mgr = "allow *"
    caps mon = "allow *"
    caps osd = "allow *"
    # curl -s http://10.254.101.1/ceph/ceph.mon.keyring
    [mon.]
    key = AQDxB8hmG8CbChAAcJpQx7ZmPeCNpkLr0A/UbA==
    caps mon = "allow *"
    [client.admin]
    key = AQD/B8hm4tU+NxAA7rIUofLOYhAtLCPnqyxUXw==
    caps mds = "allow *"
    caps mgr = "allow *"
    caps mon = "allow *"
    caps osd = "allow *"
    [client.bootstrap-osd]
    key = AQAGCMhmeVr0OBAAjnKCR8zrh7yH66kA8+L4/w==
    caps mgr = "allow r"
    caps mon = "profile bootstrap-osd"
Actions #1

Updated by Hector Martin 11 months ago

This has something to do with `ceph-mgr-diskprediction-local`. Removing that package allows ceph-mgr to boot.

```
2025-04-23T19:54:26.649+0900 fffff68e4040 -1 mgr[py] Module not found: 'diskprediction_local'

Thread 1 "ceph-mgr" received signal SIGSEGV, Segmentation fault.
Py_XINCREF (op=0x34) at /usr/include/python3.13/object.h:1034
warning: Source file is more recent than executable.
1034 }
(gdb) bt
#0 Py_XINCREF (op=0x34) at /usr/include/python3.13/object.h:1034
#1 PyArray_Item_INCREF (data=data@entry=0xffffea10fd00 "4", descr=descr@entry=0xffffdbc55c18 <OBJECT_Descr>) at ../numpy/core/src/multiarray/refcount.c:132
#2 0x0000ffffdba03a38 [PAC] in PyArray_FromScalar (scalar=<optimized out>, outcode=0x0) at ../numpy/core/src/multiarray/scalarapi.c:279
#3 0x0000ffffdb933cdc [PAC] in gentype_nonzero_number (m1=<optimized out>) at ../numpy/core/src/multiarray/scalartypes.c.src:269
#4 0x0000fffff78cd4d0 [PAC] in PyObject_IsTrue (v=<optimized out>) at /usr/src/debug/python3.13-3.13.2-1.fc41.aarch64/Objects/object.c:1901
#5 0x0000fffff78a34d0 [PAC] in _PyEval_EvalFrameDefault (tstate=0xaaaaac80f850, frame=0xfffff7d8c3a8, throwflag=0) at /usr/src/debug/python3.13-3.13.2-1.fc41.aarch64/Python/generated_cases.c.h:5879
#6 0x0000fffff78de590 [PAC] in _PyObject_VectorcallDictTstate (tstate=0xaaaaac80f850, callable=0xffffea3b9da0, args=0xffffffffa2b0, nargsf=4, kwargs=<optimized out>) at /usr/src/debug/python3.13-3.13.2-1.fc41.aarch64/Objects/call.c:146
#7 _PyObject_Call_Prepend (tstate=0xaaaaac80f850, callable=0xffffea3b9da0, obj=0xffffea0fdbe0, args=<optimized out>, kwargs=<optimized out>) at /usr/src/debug/python3.13-3.13.2-1.fc41.aarch64/Objects/call.c:504
#8 slot_tp_init (self=0xffffea0fdbe0, args=<optimized out>, kwds=<optimized out>) at /usr/src/debug/python3.13-3.13.2-1.fc41.aarch64/Objects/typeobject.c:9785
#9 0x0000fffff788656c [PAC] in type_call (self=0xaaaaaca13010, args=0xffffdbcef180, kwds=0xffffea74c800) at /usr/src/debug/python3.13-3.13.2-1.fc41.aarch64/Objects/typeobject.c:1999
#10 _PyObject_MakeTpCall (tstate=0xaaaaac80f850, callable=0xaaaaaca13010, args=<optimized out>, nargs=<optimized out>, keywords=0xffffea372fc0) at /usr/src/debug/python3.13-3.13.2-1.fc41.aarch64/Objects/call.c:242
#11 0x0000fffff78a8100 [PAC] in _PyEval_EvalFrameDefault (tstate=0xaaaaac80f850, frame=0xfffff7d8c020, throwflag=0) at /usr/src/debug/python3.13-3.13.2-1.fc41.aarch64/Python/generated_cases.c.h:1502
#12 0x0000fffff78814a4 [PAC] in _PyObject_VectorcallTstate (tstate=0xaaaaac80f850, callable=0xffffea3b8180, args=<optimized out>, nargsf=<optimized out>, kwnames=<optimized out>)
at /usr/src/debug/python3.13-3.13.2-1.fc41.aarch64/Include/internal/pycore_call.h:168
#13 0x0000fffff7880f90 [PAC] in _PyObject_CallFunctionVa (tstate=0xaaaaac80f850, callable=0xffffea3b8180, format=<optimized out>, va=...) at /usr/src/debug/python3.13-3.13.2-1.fc41.aarch64/Objects/call.c:546
#14 PyObject_CallFunction (callable=0xffffea3b8180, format=<optimized out>) at /usr/src/debug/python3.13-3.13.2-1.fc41.aarch64/Objects/call.c:574
#15 0x0000aaaaaad1e114 [PAC] in boost::python::call<boost::python::api::object, boost::python::handle<_object>, boost::python::handle<_object>, boost::python::handle<_object> > (a0=<synthetic pointer>..., a1=<synthetic pointer>...,
a2=<synthetic pointer>..., callable=0xffffea3b8180) at /usr/include/boost/python/call.hpp:62
#16 boost::python::api::object_operators<boost::python::api::object>::operator()<boost::python::handle<_object>, boost::python::handle<_object>, boost::python::handle<_object> > (this=0xffffffffa7c0, a0=<synthetic pointer>...,
a1=<synthetic pointer>..., a2=<synthetic pointer>...) at /usr/include/boost/python/object_call.hpp:19
#17 handle_pyerror (crash_dump=crash_dump@entry=true, module="diskprediction_local", caller="") at /usr/src/debug/ceph-19.2.2-1.fc41.aarch64/src/mgr/PyModule.cc:85
#18 0x0000aaaaaad22480 [PAC] in PyModule::load_subclass_of (this=this@entry=0xaaaaac668130, base_class=base_class@entry=0xaaaaaaec96c0 "MgrModule", py_class=py_class@entry=0xaaaaac668230)
at /usr/src/debug/ceph-19.2.2-1.fc41.aarch64/src/mgr/PyModule.cc:738
#19 0x0000aaaaaad22b48 [PAC] in PyModule::load (this=this@entry=0xaaaaac668130, pMainThreadState=<optimized out>) at /usr/src/debug/ceph-19.2.2-1.fc41.aarch64/src/mgr/PyModule.cc:388
#20 0x0000aaaaaad2bff0 [PAC] in PyModuleRegistry::init (this=<optimized out>) at /usr/src/debug/ceph-19.2.2-1.fc41.aarch64/src/mgr/PyModuleRegistry.cc:80
#21 0x0000aaaaaad01d14 [PAC] in MgrStandby::init (this=this@entry=0xffffffffc610) at /usr/src/debug/ceph-19.2.2-1.fc41.aarch64/src/mgr/MgrStandby.cc:204
#22 0x0000aaaaaab81e38 [PAC] in main (argc=8191999, argv=0xfffffffff268) at /usr/src/debug/ceph-19.2.2-1.fc41.aarch64/src/ceph_mgr.cc:69
```

Actions #2

Updated by Hector Martin 7 months ago

I'm pretty sure this is subinterpreters again. Poking around in gdb, it crashes while trying to format a Python exception. The exception itself is:

(gdb) p val
$1 = SystemError('/builddir/build/BUILD/python3.13-3.13.5-build/Python-3.13.5/Objects/structseq.c:693: bad argument to internal function',)

Which already suggests corrupted interpreter state. I managed to dig out a backtrace with gdb:

Traceback (most recent call first):
  <built-in method create_dynamic of module object at remote 0xffffcc0fc8b0>
  File "/usr/lib64/python3.13/site-packages/numpy/core/overrides.py", line 8, in <module>
    from numpy.core._multiarray_umath import (
  File "/usr/lib64/python3.13/site-packages/numpy/core/multiarray.py", line 10, in <module>
    from . import overrides
  File "/usr/lib64/python3.13/site-packages/numpy/core/__init__.py", line 24, in <module>
    from . import multiarray
  File "/usr/lib64/python3.13/site-packages/numpy/__config__.py", line 4, in <module>
    from numpy.core._multiarray_umath import (
  File "/usr/lib64/python3.13/site-packages/numpy/__init__.py", line 130, in <module>
    from numpy.__config__ import show as show_config
  File "/usr/lib64/python3.13/site-packages/scipy/__init__.py", line 50, in <module>
    from numpy import show_config as show_numpy_config
  File "/usr/share/ceph/mgr/diskprediction_local/module.py", line 16, in <module>
    import scipy # noqa: ignore=F401
  File "/usr/share/ceph/mgr/diskprediction_local/__init__.py", line 2, in <module>
    from .module import Module
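
The traceback above lists the innermost frame first. As a reading aid, here is a small Python sketch that reverses it to show the import chain from the mgr module down to numpy. It is an editorial addition: `import_chain` is a hypothetical helper name, and only the `File` lines of the traceback are copied in.

```python
import re

# "File" lines copied from the gdb traceback above (most recent call first).
TRACEBACK = '''\
File "/usr/lib64/python3.13/site-packages/numpy/core/overrides.py", line 8, in <module>
File "/usr/lib64/python3.13/site-packages/numpy/core/multiarray.py", line 10, in <module>
File "/usr/lib64/python3.13/site-packages/numpy/core/__init__.py", line 24, in <module>
File "/usr/lib64/python3.13/site-packages/numpy/__config__.py", line 4, in <module>
File "/usr/lib64/python3.13/site-packages/numpy/__init__.py", line 130, in <module>
File "/usr/lib64/python3.13/site-packages/scipy/__init__.py", line 50, in <module>
File "/usr/share/ceph/mgr/diskprediction_local/module.py", line 16, in <module>
File "/usr/share/ceph/mgr/diskprediction_local/__init__.py", line 2, in <module>
'''

FRAME = re.compile(r'File "([^"]+)", line (\d+), in (\S+)')


def import_chain(text):
    """Return (path, line) pairs ordered from outermost to innermost frame."""
    frames = [(m.group(1), int(m.group(2))) for m in FRAME.finditer(text)]
    frames.reverse()  # the input is innermost-first, so flip it
    return frames


chain = import_chain(TRACEBACK)
for depth, (path, line) in enumerate(chain):
    print(f"{'  ' * depth}{path}:{line}")
```

Read top to bottom, the output starts at `diskprediction_local/__init__.py`, passes through `scipy` and `numpy`, and ends at the frame that triggers `create_dynamic`, the built-in that loads a C extension module.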

Googling around for that we get:

https://github.com/pybind/pybind11/issues/3112

Which then leads to:

https://github.com/numpy/numpy/issues/24755

The erroring function is PyStructSequence_InitType2, which is being called twice on the same static type. This function, used by numpy, is not compatible with subinterpreters: https://bugs.python.org/issue45113
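
To make the "called twice on the same static type" failure concrete, here is a toy model in plain Python. It is an editorial sketch only: `StaticType`, `init_type` and `simulate` are hypothetical names, nothing here calls the real C API, and it does not explain why the second call happens. It only shows that an init-once routine operating on process-wide state fails for the second caller.

```python
class StaticType:
    """Stands in for a statically allocated C type shared by the whole process."""

    def __init__(self, name):
        self.name = name
        self.initialized = False


def init_type(static_type):
    """Hypothetical analogue of an init-once routine: refuses a second call."""
    if static_type.initialized:
        # Mirrors the message seen in the SystemError above; the real check
        # lives in CPython's structseq.c and is not reproduced here.
        raise SystemError("bad argument to internal function")
    static_type.initialized = True


def simulate(caller_count):
    """Run the init routine once per caller against one shared object."""
    shared = StaticType("example_struct_sequence")
    results = []
    for caller in range(caller_count):
        try:
            init_type(shared)
            results.append((caller, "ok"))
        except SystemError as exc:
            results.append((caller, f"SystemError: {exc}"))
    return results


results = simulate(2)
for caller, outcome in results:
    print(caller, outcome)
```

The first caller succeeds and the second one gets the error, because the "already initialized" flag lives on the shared object, not on the caller.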

What I don't understand yet is why this is happening. You'd think that if only the diskprediction subinterpreter is using numpy, then it wouldn't matter that other subinterpreters exist, as long as no other one uses numpy. gdb Python integration is extremely slow, so I'm waiting for several minutes to get a backtrace now... if I'm lucky I'll find something that allows for a workaround.

Actions #3

Updated by Hector Martin 7 months ago

Turns out this is a Python 3.13 regression caused by a deliberate change to how extension imports work. Filed at https://github.com/python/cpython/issues/138045
