Bug #72891

open

rados/thrash-erasure-code-overwrites ceph pg dump hung for 2 minutes

Added by Lee Sanders 7 months ago.

Status: New
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Tags (freeform):
Merge Commit:
Fixed In:
Released In:
Upkeep Timestamp:

Description

The ceph pg dump command hung for 2 minutes, until its timeout 120 wrapper expired, and the command then failed with exit status 124:

2025-08-19T21:13:21.909 INFO:tasks.ceph.ceph_manager.ceph:need seq 90194313221 got 90194313218 for osd.14
2025-08-19T21:13:22.009 DEBUG:teuthology.orchestra.run.smithi059:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph osd last-stat-seq osd.1
2025-08-19T21:13:22.062 INFO:teuthology.orchestra.run.smithi059.stdout:0
2025-08-19T21:13:22.081 INFO:tasks.ceph.ceph_manager.ceph:need seq 94489280519 got 0 for osd.15
2025-08-19T21:13:22.127 INFO:teuthology.orchestra.run.smithi059.stderr:2025-08-19T21:13:22.114+0000 7f1c790b8640 -1 WARNING: all dangerous and experimental features are enabled.
2025-08-19T21:13:22.135 INFO:teuthology.orchestra.run.smithi059.stderr:2025-08-19T21:13:22.126+0000 7f1c790b8640 -1 WARNING: all dangerous and experimental features are enabled.
2025-08-19T21:13:22.323 INFO:teuthology.orchestra.run.smithi059.stdout:90194313218
2025-08-19T21:13:22.341 INFO:tasks.ceph.ceph_manager.ceph:need seq 90194313221 got 90194313218 for osd.1
2025-08-19T21:15:18.554 DEBUG:teuthology.orchestra.run:got remote process result: 124
2025-08-19T21:15:18.557 INFO:tasks.ceph.ceph_manager.ceph:waiting for clean
2025-08-19T21:15:18.557 DEBUG:teuthology.orchestra.run.smithi059:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph pg dump --format=json
2025-08-19T21:15:18.689 INFO:teuthology.orchestra.run.smithi059.stderr:2025-08-19T21:15:18.676+0000 7f3423e57640 -1 WARNING: all dangerous and experimental features are enabled.
2025-08-19T21:15:18.697 INFO:teuthology.orchestra.run.smithi059.stderr:2025-08-19T21:15:18.688+0000 7f3423e57640 -1 WARNING: all dangerous and experimental features are enabled.
2025-08-19T21:17:18.582 DEBUG:teuthology.orchestra.run:got remote process result: 124
2025-08-19T21:17:18.585 ERROR:teuthology.contextutil:Saw exception from nested tasks
Traceback (most recent call last):
  File "/home/teuthworker/src/github.com_ceph_ceph-c_33e05392b5ca8b60cc8542ac609af93108f3821b/qa/tasks/ceph.py", line 1999, in task
    healthy(ctx=ctx, config=dict(cluster=config['cluster']))
  File "/home/teuthworker/src/github.com_ceph_ceph-c_33e05392b5ca8b60cc8542ac609af93108f3821b/qa/tasks/ceph.py", line 1554, in healthy
    manager.wait_for_clean()
  File "/home/teuthworker/src/github.com_ceph_ceph-c_33e05392b5ca8b60cc8542ac609af93108f3821b/qa/tasks/ceph_manager.py", line 2925, in wait_for_clean
    num_active_clean = self.get_num_active_clean()
  File "/home/teuthworker/src/github.com_ceph_ceph-c_33e05392b5ca8b60cc8542ac609af93108f3821b/qa/tasks/ceph_manager.py", line 2704, in get_num_active_clean
    pgs = self.get_pg_stats()
  File "/home/teuthworker/src/github.com_ceph_ceph-c_33e05392b5ca8b60cc8542ac609af93108f3821b/qa/tasks/ceph_manager.py", line 2470, in get_pg_stats
    out = self.raw_cluster_cmd('pg', 'dump', '--format=json')
  File "/home/teuthworker/src/github.com_ceph_ceph-c_33e05392b5ca8b60cc8542ac609af93108f3821b/qa/tasks/ceph_manager.py", line 1702, in raw_cluster_cmd
    return self.run_cluster_cmd(**kwargs).stdout.getvalue()
  File "/home/teuthworker/src/github.com_ceph_ceph-c_33e05392b5ca8b60cc8542ac609af93108f3821b/qa/tasks/ceph_manager.py", line 1693, in run_cluster_cmd
    return self.controller.run(**kwargs)
  File "/home/teuthworker/src/git.ceph.com_teuthology_c9986ad400a1e25fcd55bdfe499ed247e51ac465/teuthology/orchestra/remote.py", line 575, in run
    r = self._runner(client=self.ssh, name=self.shortname, **kwargs)
  File "/home/teuthworker/src/git.ceph.com_teuthology_c9986ad400a1e25fcd55bdfe499ed247e51ac465/teuthology/orchestra/run.py", line 461, in run
    r.wait()
  File "/home/teuthworker/src/git.ceph.com_teuthology_c9986ad400a1e25fcd55bdfe499ed247e51ac465/teuthology/orchestra/run.py", line 161, in wait
    self._raise_for_status()
  File "/home/teuthworker/src/git.ceph.com_teuthology_c9986ad400a1e25fcd55bdfe499ed247e51ac465/teuthology/orchestra/run.py", line 181, in _raise_for_status
    raise CommandFailedError(
teuthology.exceptions.CommandFailedError: Command failed on smithi059 with status 124: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph pg dump --format=json'
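
For anyone triaging similar runs, here is a rough sketch of how to check that the failure is the timeout wrapper expiring and not the command failing by itself. It uses two lines copied from the log above, the pg dump launch and its result. The helper names (log_time, elapsed_seconds, exit_status) are made up for illustration and are not part of teuthology. GNU timeout exits with status 124 when the wrapped command runs past its limit, so an elapsed time of about 120 seconds together with status 124 points to a hang.

```python
from datetime import datetime

# Two lines copied verbatim from the teuthology log in this report:
# the launch of "ceph pg dump" and the result of that remote process.
START = (
    "2025-08-19T21:15:18.557 DEBUG:teuthology.orchestra.run.smithi059:> "
    "sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage "
    "timeout 120 ceph --cluster ceph pg dump --format=json"
)
END = "2025-08-19T21:17:18.582 DEBUG:teuthology.orchestra.run:got remote process result: 124"


def log_time(line: str) -> datetime:
    """Parse the leading timestamp (the first space-separated token)."""
    return datetime.strptime(line.split(" ", 1)[0], "%Y-%m-%dT%H:%M:%S.%f")


def elapsed_seconds(start_line: str, end_line: str) -> float:
    """Seconds between two log lines, based on their timestamps."""
    return (log_time(end_line) - log_time(start_line)).total_seconds()


def exit_status(result_line: str) -> int:
    """Extract the status from a 'got remote process result: N' line."""
    return int(result_line.rsplit(":", 1)[1])


elapsed = elapsed_seconds(START, END)
status = exit_status(END)
print(f"elapsed={elapsed:.3f}s status={status}")
```

For this run the elapsed time works out to just over 120 seconds with status 124, which matches the timeout 120 wrapper expiring while ceph pg dump was still waiting for a reply.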

/a/yuriw-2025-08-19_14:49:40-rados-wip-yuri-testing-2025-08-18-1127-distro-default-smithi/8451510
https://pulpito.ceph.com/yuriw-2025-08-19_14:49:40-rados-wip-yuri-testing-2025-08-18-1127-distro-default-smithi/8451510

Found on main
