Bug #72874
Open
rados/thrash-old-clients: Stuck doing _try_send injecting socket failure and then nothing for 8 hours
Description
Based on main:
/a/yuriw-2025-08-19_14:49:40-rados-wip-yuri-testing-2025-08-18-1127-distro-default-smithi/8451624
2025-08-19T22:26:40.460 INFO:journalctl@ceph.mon.a.smithi084.stdout:Aug 19 22:26:40 smithi084 ceph-mon32408: from='mgr.14152 172.21.15.84:0/3250360625' entity='mgr.y'
2025-08-19T22:26:40.460 INFO:journalctl@ceph.mon.a.smithi084.stdout:Aug 19 22:26:40 smithi084 ceph-mon32408: from='mgr.14152 172.21.15.84:0/3250360625' entity='mgr.y' cmd={"prefix": "config rm", "who": "osd/host:smithi084", "name": "osd_memory_target"} : dispatch
2025-08-19T22:26:40.487 INFO:tasks.cephadm:Distributing conf and client.admin keyring to all hosts + 0755
2025-08-19T22:26:40.487 DEBUG:teuthology.orchestra.run.smithi084:> sudo /home/ubuntu/cephtest/cephadm --image quay-quay-quay.apps.os.sepia.ceph.com/ceph-ci/ceph:33e05392b5ca8b60cc8542ac609af93108f3821b shell -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring --fsid 4076afd4-7d4b-11f0-8741-adfe0268badd -- ceph orch client-keyring set client.admin '*' --mode 0755
2025-08-19T22:26:40.722 INFO:teuthology.orchestra.run.smithi084.stderr:Inferring config /var/lib/ceph/4076afd4-7d4b-11f0-8741-adfe0268badd/mon.a/config
2025-08-19T22:26:41.468 INFO:journalctl@ceph.mon.a.smithi084.stdout:Aug 19 22:26:41 smithi084 ceph-mon32408: mgrmap e13: y(active, since 12s)
2025-08-19T22:27:14.719 INFO:journalctl@ceph.mon.a.smithi084.stdout:Aug 19 22:27:14 smithi084 ceph-mon32408: Manager daemon y is unresponsive. No standby daemons available.
2025-08-19T22:27:14.720 INFO:journalctl@ceph.mon.a.smithi084.stdout:Aug 19 22:27:14 smithi084 ceph-mon32408: osdmap e4: 0 total, 0 up, 0 in
2025-08-19T22:27:14.720 INFO:journalctl@ceph.mon.a.smithi084.stdout:Aug 19 22:27:14 smithi084 ceph-mon32408: mgrmap e14: no daemons active (since 0.00719851s)
2025-08-19T23:02:54.433 INFO:teuthology.orchestra.run.smithi084.stderr:2025-08-19T23:02:54.424+0000 7f098e575640 0 -- 172.21.15.84:0/2445799563 >> [v2:172.21.15.84:3300/0,v1:172.21.15.84:6789/0] conn(0x7f0958060df0 msgr2=0x7f09580811d0 unknown :-1 s=STATE_CONNECTION_ESTABLISHED l=1)._try_send injecting socket failure
2025-08-19T23:08:41.471 INFO:teuthology.orchestra.run.smithi084.stderr:2025-08-19T23:08:41.466+0000 7f098e575640 0 -- 172.21.15.84:0/2445799563 >> [v2:172.21.15.84:3300/0,v1:172.21.15.84:6789/0] conn(0x7f0958060df0 msgr2=0x7f09580811d0 secure :-1 s=STATE_CONNECTION_ESTABLISHED l=1).read_until injecting socket failure
2025-08-20T00:25:34.991 INFO:teuthology.orchestra.run.smithi084.stderr:2025-08-20T00:25:34.984+0000 7f098e575640 0 -- 172.21.15.84:0/2445799563 >> [v2:172.21.15.84:3300/0,v1:172.21.15.84:6789/0] conn(0x7f099010c5e0 msgr2=0x7f09580a19a0 secure :-1 s=STATE_CONNECTION_ESTABLISHED l=1).read_until injecting socket failure
2025-08-20T01:12:45.315 INFO:teuthology.orchestra.run.smithi084.stderr:2025-08-20T01:12:45.308+0000 7f098e575640 0 -- 172.21.15.84:0/2445799563 >> [v2:172.21.15.84:3300/0,v1:172.21.15.84:6789/0] conn(0x7f0958060df0 msgr2=0x7f09580a22a0 secure :-1 s=STATE_CONNECTION_ESTABLISHED l=1)._try_send injecting socket failure
2025-08-20T04:31:16.711 INFO:teuthology.orchestra.run.smithi084.stderr:2025-08-20T04:31:16.706+0000 7f098e575640 0 -- 172.21.15.84:0/2445799563 >> [v2:172.21.15.84:3300/0,v1:172.21.15.84:6789/0] conn(0x7f0958060df0 msgr2=0x7f09580a3560 secure :-1 s=STATE_CONNECTION_ESTABLISHED l=1).read_until injecting socket failure
2025-08-20T04:33:56.728 INFO:teuthology.orchestra.run.smithi084.stderr:2025-08-20T04:33:56.723+0000 7f098e575640 0 -- 172.21.15.84:0/2445799563 >> [v2:172.21.15.84:3300/0,v1:172.21.15.84:6789/0] conn(0x7f099010c5e0 msgr2=0x7f09580a3020 secure :-1 s=STATE_CONNECTION_ESTABLISHED l=1)._try_send injecting socket failure
2025-08-20T06:14:11.765 DEBUG:teuthology.exit:Got signal 15; running 1 handler...
2025-08-20T06:14:11.783 DEBUG:teuthology.task.console_log:Killing console logger for smithi084
2025-08-20T06:14:11.785 DEBUG:teuthology.task.console_log:Killing console logger for smithi088
2025-08-20T06:14:11.785 DEBUG:teuthology.task.console_log:Killing console logger for smithi102
2025-08-20T06:14:11.786 DEBUG:teuthology.exit:Finished running handlers
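
For reference, the length of the stall can be worked out from the teuthology timestamps in the log above. The following is only a rough sketch: the `EVENTS` list and the helper names `gaps` and `total_stall` are made up for illustration and are not part of teuthology. It takes the last mgrmap update as the start of the stall and the signal 15 as the end, which is an assumption about where "nothing happening" begins.

```python
from datetime import datetime

# Teuthology timestamps copied from the log excerpt above.
# Choosing these four lines as milestones is an assumption made for illustration.
EVENTS = [
    ("2025-08-19T22:27:14.720", "mgrmap e14: no daemons active"),
    ("2025-08-19T23:02:54.433", "first '_try_send injecting socket failure'"),
    ("2025-08-20T04:33:56.728", "last '_try_send injecting socket failure'"),
    ("2025-08-20T06:14:11.765", "teuthology.exit: Got signal 15"),
]


def gaps(events):
    """Return (label, timedelta since the previous event) for each event after the first."""
    parsed = [(datetime.fromisoformat(ts), label) for ts, label in events]
    return [(cur[1], cur[0] - prev[0]) for prev, cur in zip(parsed, parsed[1:])]


def total_stall(events):
    """Return the time between the first and the last listed event."""
    first = datetime.fromisoformat(events[0][0])
    last = datetime.fromisoformat(events[-1][0])
    return last - first


if __name__ == "__main__":
    for label, delta in gaps(EVENTS):
        print(f"{delta} elapsed before: {label}")
    print("total:", total_stall(EVENTS))
```

With these timestamps the total comes to a little under 8 hours, which matches the title: the job produced only the occasional socket failure injection message between the mgr becoming unresponsive and the job being killed.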
Updated by Nitzan Mordechai 6 months ago
- Is duplicate of Bug #70247: Non-zero exit code 1 from systemctl reset-failed ceph-47356c0e-f761-11ef-bb88-bd4984dce30f@mon.a added