Bug #72874

open

rados/thrash-old-clients Stuck doing _try_send injecting socket failure and then nothing for 8 hours

Added by Lee Sanders 7 months ago.

Status: New
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Tags (freeform):
Merge Commit:
Fixed In:
Released In:
Upkeep Timestamp:

Description

Based on main.

/a/yuriw-2025-08-19_14:49:40-rados-wip-yuri-testing-2025-08-18-1127-distro-default-smithi/8451624

https://pulpito.ceph.com/yuriw-2025-08-19_14:49:40-rados-wip-yuri-testing-2025-08-18-1127-distro-default-smithi/8451624/

2025-08-19T22:26:40.460 INFO::Aug 19 22:26:40 smithi084 ceph-mon32408: from='mgr.14152 172.21.15.84:0/3250360625' entity='mgr.y'
2025-08-19T22:26:40.460 INFO::Aug 19 22:26:40 smithi084 ceph-mon32408: from='mgr.14152 172.21.15.84:0/3250360625' entity='mgr.y' cmd={"prefix": "config rm", "who": "osd/host:smithi084", "name": "osd_memory_target"} : dispatch
2025-08-19T22:26:40.487 INFO:tasks.cephadm:Distributing conf and client.admin keyring to all hosts + 0755
2025-08-19T22:26:40.487 DEBUG:teuthology.orchestra.run.smithi084:> sudo /home/ubuntu/cephtest/cephadm --image quay-quay-quay.apps.os.sepia.ceph.com/ceph-ci/ceph:33e05392b5ca8b60cc8542ac609af93108f3821b shell c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring --fsid 4076afd4-7d4b-11f0-8741-adfe0268badd - ceph orch client-keyring set client.admin '*' --mode 0755
2025-08-19T22:26:40.722 INFO:teuthology.orchestra.run.smithi084.stderr:Inferring config /var/lib/ceph/4076afd4-7d4b-11f0-8741-adfe0268badd/mon.a/config
2025-08-19T22:26:41.468 INFO::Aug 19 22:26:41 smithi084 ceph-mon32408: mgrmap e13: y(active, since 12s)
2025-08-19T22:27:14.719 INFO::Aug 19 22:27:14 smithi084 ceph-mon32408: Manager daemon y is unresponsive. No standby daemons available.
2025-08-19T22:27:14.720 INFO::Aug 19 22:27:14 smithi084 ceph-mon32408: osdmap e4: 0 total, 0 up, 0 in
2025-08-19T22:27:14.720 INFO::Aug 19 22:27:14 smithi084 ceph-mon32408: mgrmap e14: no daemons active (since 0.00719851s)
2025-08-19T23:02:54.433 INFO:teuthology.orchestra.run.smithi084.stderr:2025-08-19T23:02:54.424+0000 7f098e575640 0 -- 172.21.15.84:0/2445799563 >> [v2:172.21.15.84:3300/0,v1:172.21.15.84:6789/0] conn(0x7f0958060df0 msgr2=0x7f09580811d0 unknown :-1 s=STATE_CONNECTION_ESTABLISHED l=1)._try_send injecting socket failure
2025-08-19T23:08:41.471 INFO:teuthology.orchestra.run.smithi084.stderr:2025-08-19T23:08:41.466+0000 7f098e575640 0 -- 172.21.15.84:0/2445799563 >> [v2:172.21.15.84:3300/0,v1:172.21.15.84:6789/0] conn(0x7f0958060df0 msgr2=0x7f09580811d0 secure :-1 s=STATE_CONNECTION_ESTABLISHED l=1).read_until injecting socket failure
2025-08-20T00:25:34.991 INFO:teuthology.orchestra.run.smithi084.stderr:2025-08-20T00:25:34.984+0000 7f098e575640 0 -- 172.21.15.84:0/2445799563 >> [v2:172.21.15.84:3300/0,v1:172.21.15.84:6789/0] conn(0x7f099010c5e0 msgr2=0x7f09580a19a0 secure :-1 s=STATE_CONNECTION_ESTABLISHED l=1).read_until injecting socket failure
2025-08-20T01:12:45.315 INFO:teuthology.orchestra.run.smithi084.stderr:2025-08-20T01:12:45.308+0000 7f098e575640 0 -- 172.21.15.84:0/2445799563 >> [v2:172.21.15.84:3300/0,v1:172.21.15.84:6789/0] conn(0x7f0958060df0 msgr2=0x7f09580a22a0 secure :-1 s=STATE_CONNECTION_ESTABLISHED l=1)._try_send injecting socket failure
2025-08-20T04:31:16.711 INFO:teuthology.orchestra.run.smithi084.stderr:2025-08-20T04:31:16.706+0000 7f098e575640 0 -- 172.21.15.84:0/2445799563 >> [v2:172.21.15.84:3300/0,v1:172.21.15.84:6789/0] conn(0x7f0958060df0 msgr2=0x7f09580a3560 secure :-1 s=STATE_CONNECTION_ESTABLISHED l=1).read_until injecting socket failure
2025-08-20T04:33:56.728 INFO:teuthology.orchestra.run.smithi084.stderr:2025-08-20T04:33:56.723+0000 7f098e575640 0 -- 172.21.15.84:0/2445799563 >> [v2:172.21.15.84:3300/0,v1:172.21.15.84:6789/0] conn(0x7f099010c5e0 msgr2=0x7f09580a3020 secure :-1 s=STATE_CONNECTION_ESTABLISHED l=1)._try_send injecting socket failure
2025-08-20T06:14:11.765 DEBUG:teuthology.exit:Got signal 15; running 1 handler...
2025-08-20T06:14:11.783 DEBUG:teuthology.task.console_log:Killing console logger for smithi084
2025-08-20T06:14:11.785 DEBUG:teuthology.task.console_log:Killing console logger for smithi088
2025-08-20T06:14:11.785 DEBUG:teuthology.task.console_log:Killing console logger for smithi102
2025-08-20T06:14:11.786 DEBUG:teuthology.exit:Finished running handlers
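
To put numbers on the quiet periods in the excerpt above, here is a minimal sketch that measures the gaps between the leading teuthology timestamps. The timestamps are copied from the log lines above, starting at the "mgrmap e14: no daemons active" line and ending at the "Got signal 15" line; the helper names (`parse`, `gaps`, `total_span`) are made up for illustration and are not part of teuthology.

```python
from datetime import datetime, timedelta

# Leading teuthology timestamps copied from the excerpt above
# (message text omitted). First entry: "mgrmap e14: no daemons active";
# last entry: "Got signal 15".
TIMESTAMPS = [
    "2025-08-19T22:27:14.720",
    "2025-08-19T23:02:54.433",
    "2025-08-19T23:08:41.471",
    "2025-08-20T00:25:34.991",
    "2025-08-20T01:12:45.315",
    "2025-08-20T04:31:16.711",
    "2025-08-20T04:33:56.728",
    "2025-08-20T06:14:11.765",
]


def parse(ts: str) -> datetime:
    """Parse a teuthology-style timestamp with millisecond precision."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%f")


def gaps(stamps):
    """Return (start, end, delta) for each pair of consecutive timestamps."""
    times = [parse(s) for s in stamps]
    return [(a, b, b - a) for a, b in zip(times, times[1:])]


def total_span(stamps) -> timedelta:
    """Return the time between the first and last timestamp."""
    return parse(stamps[-1]) - parse(stamps[0])


if __name__ == "__main__":
    for start, end, delta in gaps(TIMESTAMPS):
        print(f"{start.isoformat()} -> {end.isoformat()}  gap {delta}")
    print("total:", total_span(TIMESTAMPS))
```

On these values the span from the mgrmap e14 line to the signal 15 line comes to a little under eight hours, and the longest single silence is the one between the 01:12:45 and 04:31:16 lines.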


Related issues 1 (1 open, 0 closed)

Is duplicate of Orchestrator - Bug #70247: Non-zero exit code 1 from systemctl reset-failed ceph-47356c0e-f761-11ef-bb88-bd4984dce30f@mon.a (New)

#1

Updated by Nitzan Mordechai 6 months ago

  • Is duplicate of Bug #70247: Non-zero exit code 1 from systemctl reset-failed ceph-47356c0e-f761-11ef-bb88-bd4984dce30f@mon.a added