Bug #66698
daemon_watchdog:BARK! but thrasher will continue looping until timeout
% Done: 0%
Source:
Backport: squid,reef,quincy
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Tags (freeform): backport_processed
Merge Commit:
Fixed In: v19.3.0-3972-g58a668dba4
Released In: v20.2.0~2301
Upkeep Timestamp: 2025-11-01T01:03:16+00:00
Description
/a/nmordech-2024-06-26_11:08:57-rados:thrash-erasure-code-isa-main-distro-default-smithi/7773254
2024-06-26T11:34:12.906 INFO:tasks.daemonwatchdog.daemon_watchdog:OSDThrasher failed
2024-06-26T11:34:12.906 INFO:tasks.daemonwatchdog.daemon_watchdog:BARK! unmounting mounts and killing all daemons
.
.
2024-06-26T11:34:14.925 INFO:tasks.ceph.osd.2.smithi081.stderr:2024-06-26T11:34:14.938+0000 7f62d9efb640 -1 received signal: Hangup from /usr/bin/python3 /usr/bin/daemon-helper kill ceph-osd -f --cluster ceph -i 2 (PID: 25394) UID: 0
2024-06-26T11:34:15.025 INFO:tasks.ceph.osd.0:Sent signal 15
2024-06-26T11:34:15.026 INFO:tasks.ceph.osd.4:Sent signal 15
2024-06-26T11:34:15.026 INFO:tasks.ceph.osd.8:Sent signal 15
2024-06-26T11:34:15.027 INFO:tasks.ceph.osd.12:Sent signal 15
2024-06-26T11:34:15.027 INFO:tasks.ceph.osd.1:Sent signal 15
2024-06-26T11:34:15.027 INFO:tasks.ceph.osd.5:Sent signal 15
2024-06-26T11:34:15.027 INFO:tasks.ceph.osd.9:Sent signal 15
2024-06-26T11:34:15.028 INFO:tasks.ceph.osd.13:Sent signal 15
2024-06-26T11:34:15.028 INFO:tasks.ceph.osd.2:Sent signal 15
2024-06-26T11:34:15.028 INFO:tasks.ceph.osd.6:Sent signal 15
2024-06-26T11:34:15.028 INFO:tasks.ceph.osd.10:Sent signal 15
2024-06-26T11:34:15.028 INFO:tasks.ceph.osd.14:Sent signal 15
2024-06-26T11:34:15.029 INFO:tasks.ceph.osd.3:Sent signal 15
2024-06-26T11:34:15.029 INFO:tasks.ceph.osd.7:Sent signal 15
2024-06-26T11:34:15.029 INFO:tasks.ceph.osd.11:Sent signal 15
2024-06-26T11:34:15.029 INFO:tasks.ceph.osd.15:Sent signal 15
2024-06-26T11:34:15.030 INFO:tasks.ceph.mon.a:Sent signal 15
2024-06-26T11:34:15.030 INFO:tasks.ceph.mon.b:Sent signal 15
2024-06-26T11:34:15.030 INFO:tasks.ceph.mon.c:Sent signal 15
2024-06-26T11:34:15.030 INFO:tasks.ceph.mgr.y:Sent signal 15
2024-06-26T11:34:15.031 INFO:tasks.ceph.mgr.x:Sent signal 15
All daemons are terminated, but the thrasher then continues looping and trying to get the op history:
2024-06-26T11:34:20.665 DEBUG:teuthology.orchestra.run.smithi016:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 30 ceph --cluster ceph --admin-daemon /var/run/ceph/ceph-osd.8.asok dump_ops_in_flight
2024-06-26T11:34:20.685 ERROR:teuthology.orchestra.daemon.state:Failed to send signal 1: None
Traceback (most recent call last):
  File "/home/teuthworker/src/git.ceph.com_teuthology_544fecbcd55f3d2b6f478823823ce40cbefef1d4/teuthology/orchestra/daemon/state.py", line 108, in signal
    self.proc.stdin.write(struct.pack('!b', sig))
  File "/home/teuthworker/src/git.ceph.com_teuthology_544fecbcd55f3d2b6f478823823ce40cbefef1d4/virtualenv/lib/python3.10/site-packages/paramiko/file.py", line 389, in write
    self._write_all(data)
  File "/home/teuthworker/src/git.ceph.com_teuthology_544fecbcd55f3d2b6f478823823ce40cbefef1d4/virtualenv/lib/python3.10/site-packages/paramiko/file.py", line 507, in _write_all
    count = self._write(data)
  File "/home/teuthworker/src/git.ceph.com_teuthology_544fecbcd55f3d2b6f478823823ce40cbefef1d4/virtualenv/lib/python3.10/site-packages/paramiko/channel.py", line 1362, in _write
    self.channel.sendall(data)
  File "/home/teuthworker/src/git.ceph.com_teuthology_544fecbcd55f3d2b6f478823823ce40cbefef1d4/virtualenv/lib/python3.10/site-packages/paramiko/channel.py", line 844, in sendall
    sent = self.send(s)
  File "/home/teuthworker/src/git.ceph.com_teuthology_544fecbcd55f3d2b6f478823823ce40cbefef1d4/virtualenv/lib/python3.10/site-packages/paramiko/channel.py", line 799, in send
    return self._send(s, m)
  File "/home/teuthworker/src/git.ceph.com_teuthology_544fecbcd55f3d2b6f478823823ce40cbefef1d4/virtualenv/lib/python3.10/site-packages/paramiko/channel.py", line 1196, in _send
    raise socket.error("Socket is closed")
OSError: Socket is closed
2024-06-26T11:34:20.787 ERROR:teuthology.orchestra.daemon.state:Failed to send signal 1: None
Traceback (most recent call last):
  File "/home/teuthworker/src/git.ceph.com_teuthology_544fecbcd55f3d2b6f478823823ce40cbefef1d4/teuthology/orchestra/daemon/state.py", line 108, in signal
    self.proc.stdin.write(struct.pack('!b', sig))
  File "/home/teuthworker/src/git.ceph.com_teuthology_544fecbcd55f3d2b6f478823823ce40cbefef1d4/virtualenv/lib/python3.10/site-packages/paramiko/file.py", line 389, in write
    self._write_all(data)
  File "/home/teuthworker/src/git.ceph.com_teuthology_544fecbcd55f3d2b6f478823823ce40cbefef1d4/virtualenv/lib/python3.10/site-packages/paramiko/file.py", line 507, in _write_all
    count = self._write(data)
  File "/home/teuthworker/src/git.ceph.com_teuthology_544fecbcd55f3d2b6f478823823ce40cbefef1d4/virtualenv/lib/python3.10/site-packages/paramiko/channel.py", line 1362, in _write
    self.channel.sendall(data)
  File "/home/teuthworker/src/git.ceph.com_teuthology_544fecbcd55f3d2b6f478823823ce40cbefef1d4/virtualenv/lib/python3.10/site-packages/paramiko/channel.py", line 844, in sendall
    sent = self.send(s)
  File "/home/teuthworker/src/git.ceph.com_teuthology_544fecbcd55f3d2b6f478823823ce40cbefef1d4/virtualenv/lib/python3.10/site-packages/paramiko/channel.py", line 799, in send
    return self._send(s, m)
  File "/home/teuthworker/src/git.ceph.com_teuthology_544fecbcd55f3d2b6f478823823ce40cbefef1d4/virtualenv/lib/python3.10/site-packages/paramiko/channel.py", line 1196, in _send
    raise socket.error("Socket is closed")
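The behavior described above (the watchdog kills every daemon, yet the thrasher loop keeps sending signals over closed channels) can be illustrated with a small sketch. This is not the change made in the linked pull request and not teuthology code: FakeDaemon, thrash, bark, and the on_round hook are hypothetical names invented for illustration. It only assumes what the traceback shows, namely that a signal number is packed with struct.pack('!b', sig) and that writing to a dead daemon's channel raises OSError. The idea is that the watchdog sets a shared stop event before killing daemons, and the loop checks that event at the top of every round.

```python
import struct
import threading


class FakeDaemon:
    """Hypothetical stand-in for a remote daemon handle (not a teuthology class)."""

    def __init__(self, name):
        self.name = name
        self.running = True
        self.sent = []

    def signal(self, sig):
        # As in the traceback, the signal number travels as one signed byte.
        if not self.running:
            raise OSError("Socket is closed")
        self.sent.append(struct.pack('!b', sig))

    def stop(self):
        self.running = False


def bark(daemons, stopping):
    """Hypothetical watchdog action: announce the stop first, then kill daemons."""
    stopping.set()
    for daemon in daemons:
        daemon.stop()


def thrash(daemons, stopping, max_rounds=100, on_round=None):
    """Send signal 1 (SIGHUP) to every daemon each round until `stopping` is set.

    Returns the number of completed rounds. `on_round` is a test hook that is
    called with the round count after each round.
    """
    rounds = 0
    for _ in range(max_rounds):
        if stopping.is_set():
            # The watchdog has barked; do not touch the dead daemons again.
            break
        for daemon in daemons:
            daemon.signal(1)
        rounds += 1
        if on_round is not None:
            on_round(rounds)
    return rounds


daemons = [FakeDaemon("osd.0"), FakeDaemon("osd.1")]
stopping = threading.Event()

# Simulate the watchdog barking right after the third round.
completed = thrash(
    daemons,
    stopping,
    on_round=lambda n: bark(daemons, stopping) if n == 3 else None,
)
print(completed)
```

Without the stopping.is_set() check, the fourth round would call signal() on a stopped daemon and raise the same OSError seen in the log, and the loop would only end when max_rounds (standing in for the timeout) is reached.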
Updated by Nitzan Mordechai over 1 year ago
- Status changed from In Progress to Fix Under Review
- Pull request ID set to 58282
Updated by Radoslaw Zarzynski over 1 year ago
scrub note: approved and in the QA.
Updated by Radoslaw Zarzynski over 1 year ago
scrub note: under test -- see https://tracker.ceph.com/issues/66822.
Updated by Laura Flores over 1 year ago
Needs to resolve conflicts; otherwise was approved to merge.
Updated by Laura Flores over 1 year ago
- Status changed from Fix Under Review to Pending Backport
Updated by Upkeep Bot over 1 year ago
- Copied to Backport #67502: reef: daemon_watchdog:BARK! but thrasher will continue looping until timeout added
Updated by Upkeep Bot over 1 year ago
- Copied to Backport #67503: quincy: daemon_watchdog:BARK! but thrasher will continue looping until timeout added
Updated by Upkeep Bot over 1 year ago
- Copied to Backport #67504: squid: daemon_watchdog:BARK! but thrasher will continue looping until timeout added
Updated by Upkeep Bot over 1 year ago
- Tags (freeform) set to backport_processed
Updated by Upkeep Bot 9 months ago
- Merge Commit set to 58a668dba4f84a87593e375f79b813e3978deec6
- Fixed In set to v19.3.0-3972-g58a668dba4f
- Upkeep Timestamp set to 2025-07-08T22:38:11+00:00
Updated by Upkeep Bot 8 months ago
- Fixed In changed from v19.3.0-3972-g58a668dba4f to v19.3.0-3972-g58a668dba4f8
- Upkeep Timestamp changed from 2025-07-08T22:38:11+00:00 to 2025-07-14T15:46:44+00:00
Updated by Upkeep Bot 8 months ago
- Fixed In changed from v19.3.0-3972-g58a668dba4f8 to v19.3.0-3972-g58a668dba4
- Upkeep Timestamp changed from 2025-07-14T15:46:44+00:00 to 2025-07-14T21:10:45+00:00
Updated by Upkeep Bot 5 months ago
- Released In set to v20.2.0~2301
- Upkeep Timestamp changed from 2025-07-14T21:10:45+00:00 to 2025-11-01T01:03:16+00:00