Bug #72873

rados/singleton-nomsgr - test_health_warnings.sh - PG 1.5 is not active+clean

Added by Lee Sanders 7 months ago. Updated 7 months ago.

Status:
Duplicate
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Tags (freeform):
Merge Commit:
Fixed In:
Released In:
Upkeep Timestamp:

Description

2025-08-19T23:02:37.502 INFO:tasks.workunit.client.0.smithi038.stderr:+ ceph health
2025-08-19T23:02:37.502 INFO:tasks.workunit.client.0.smithi038.stderr:+ grep down
2025-08-19T23:02:38.019 INFO:tasks.workunit.client.0.smithi038.stdout:HEALTH_WARN noout flag(s) set; 1 osds down; Degraded data redundancy: 1 pg undersized
2025-08-19T23:02:38.019 INFO:tasks.workunit.client.0.smithi038.stderr:+ sleep 1
2025-08-19T23:02:39.021 INFO:tasks.workunit.client.0.smithi038.stderr:+ ceph health
2025-08-19T23:02:39.021 INFO:tasks.workunit.client.0.smithi038.stderr:+ grep down
2025-08-19T23:02:39.553 INFO:tasks.workunit.client.0.smithi038.stdout:HEALTH_WARN noout flag(s) set; 1 osds down; Degraded data redundancy: 1 pg undersized
2025-08-19T23:02:39.553 INFO:tasks.workunit.client.0.smithi038.stderr:+ sleep 1
2025-08-19T23:02:40.555 INFO:tasks.workunit.client.0.smithi038.stderr:+ ceph health
2025-08-19T23:02:40.555 INFO:tasks.workunit.client.0.smithi038.stderr:+ grep down
2025-08-19T23:02:41.101 INFO:tasks.workunit.client.0.smithi038.stdout:HEALTH_WARN noout flag(s) set; 1 osds down; Degraded data redundancy: 1 pg undersized
2025-08-19T23:02:41.101 INFO:tasks.workunit.client.0.smithi038.stderr:+ sleep 1
2025-08-19T23:02:42.103 INFO:tasks.workunit.client.0.smithi038.stderr:+ ceph health
2025-08-19T23:02:42.103 INFO:tasks.workunit.client.0.smithi038.stderr:+ grep down
2025-08-19T23:02:42.568 DEBUG:teuthology.orchestra.run:got remote process result: 124
2025-08-19T23:02:42.571 INFO:tasks.workunit:Stopping ['rados/test_health_warnings.sh'] on client.0...
2025-08-19T23:02:42.571 DEBUG:teuthology.orchestra.run.smithi038:> sudo rm -rf -- /home/ubuntu/cephtest/workunits.list.client.0 /home/ubuntu/cephtest/clone.client.0
2025-08-19T23:02:42.890 ERROR:teuthology.run_tasks:Saw exception from tasks.
Traceback (most recent call last):
File "/home/teuthworker/src/git.ceph.com_teuthology_c9986ad400a1e25fcd55bdfe499ed247e51ac465/teuthology/run_tasks.py", line 105, in run_tasks
manager = run_one_task(taskname, ctx=ctx, config=config)
File "/home/teuthworker/src/git.ceph.com_teuthology_c9986ad400a1e25fcd55bdfe499ed247e51ac465/teuthology/run_tasks.py", line 83, in run_one_task
return task(**kwargs)
File "/home/teuthworker/src/github.com_ceph_ceph-c_33e05392b5ca8b60cc8542ac609af93108f3821b/qa/tasks/workunit.py", line 145, in task
_spawn_on_all_clients(ctx, refspec, all_tasks, config.get('env'),
File "/home/teuthworker/src/github.com_ceph_ceph-c_33e05392b5ca8b60cc8542ac609af93108f3821b/qa/tasks/workunit.py", line 293, in _spawn_on_all_clients
with parallel() as p:
File "/home/teuthworker/src/git.ceph.com_teuthology_c9986ad400a1e25fcd55bdfe499ed247e51ac465/teuthology/parallel.py", line 84, in __exit__
for result in self:
File "/home/teuthworker/src/git.ceph.com_teuthology_c9986ad400a1e25fcd55bdfe499ed247e51ac465/teuthology/parallel.py", line 98, in __next__
resurrect_traceback(result)
File "/home/teuthworker/src/git.ceph.com_teuthology_c9986ad400a1e25fcd55bdfe499ed247e51ac465/teuthology/parallel.py", line 30, in resurrect_traceback
raise exc.exc_info[1]
File "/home/teuthworker/src/git.ceph.com_teuthology_c9986ad400a1e25fcd55bdfe499ed247e51ac465/teuthology/parallel.py", line 23, in capture_traceback
return func(*args, **kwargs)
File "/home/teuthworker/src/github.com_ceph_ceph-c_33e05392b5ca8b60cc8542ac609af93108f3821b/qa/tasks/workunit.py", line 434, in _run_tests
remote.run(
File "/home/teuthworker/src/git.ceph.com_teuthology_c9986ad400a1e25fcd55bdfe499ed247e51ac465/teuthology/orchestra/remote.py", line 575, in run
r = self._runner(client=self.ssh, name=self.shortname, **kwargs)
File "/home/teuthworker/src/git.ceph.com_teuthology_c9986ad400a1e25fcd55bdfe499ed247e51ac465/teuthology/orchestra/run.py", line 461, in run
r.wait()
File "/home/teuthworker/src/git.ceph.com_teuthology_c9986ad400a1e25fcd55bdfe499ed247e51ac465/teuthology/orchestra/run.py", line 161, in wait
self._raise_for_status()
File "/home/teuthworker/src/git.ceph.com_teuthology_c9986ad400a1e25fcd55bdfe499ed247e51ac465/teuthology/orchestra/run.py", line 181, in _raise_for_status
raise CommandFailedError(
teuthology.exceptions.CommandFailedError: Command failed (workunit test rados/test_health_warnings.sh) on smithi038 with status 124: 'mkdir -p -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && cd -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && CEPH_CLI_TEST_DUP_COMMAND=1 CEPH_REF=33e05392b5ca8b60cc8542ac609af93108f3821b TESTDIR="/home/ubuntu/cephtest" CEPH_ARGS="--cluster ceph" CEPH_ID="0" PATH=$PATH:/usr/sbin CEPH_BASE=/home/ubuntu/cephtest/clone.client.0 CEPH_ROOT=/home/ubuntu/cephtest/clone.client.0 CEPH_MNT=/home/ubuntu/cephtest/mnt.0 adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 3h /home/ubuntu/cephtest/clone.client.0/qa/workunits/rados/test_health_warnings.sh'
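Status 124 is the exit code GNU `timeout` returns when the 3h limit expires. The repeating `+ ceph health` / `+ grep down` / `+ sleep 1` trace above suggests the workunit sits in a polling loop waiting for the "osds down" condition to clear, which never happens because the OSD stays down (see the duplicated issue #71931). A minimal sketch of that loop pattern, with a hypothetical `wait_for_no_down` wrapper (the real logic is inline in qa/workunits/rados/test_health_warnings.sh):

```shell
# Sketch only: poll a health command until "down" no longer appears in
# its output. Under `timeout 3h`, a permanently-down OSD makes this loop
# spin until the whole workunit is killed with exit status 124.
wait_for_no_down() {
    local health_cmd=$1     # command printing health status (normally: ceph health)
    while $health_cmd | grep down; do
        sleep 1             # matches the `+ sleep 1` lines in the log
    done
}
```

With a healthy cluster the very first `grep` finds no match and the loop exits immediately; in this run `HEALTH_WARN ... 1 osds down` kept matching forever.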

2025-08-19T23:17:48.665 INFO:tasks.ceph.ceph_manager.ceph:no progress seen, keeping timeout for now
2025-08-19T23:17:52.680 INFO:tasks.ceph.ceph_manager.ceph:PG 1.5 is not active+clean
2025-08-19T23:17:52.680 INFO:tasks.ceph.ceph_manager.ceph:{'pgid': '1.5', 'version': "0'0", 'reported_seq': 167, 'reported_epoch': 50, 'state': 'active+undersized', 'last_fresh': '2025-08-19T23:15:22.580302+0000', 'last_change': '2025-08-19T20:03:08.632302+0000', 'last_active': '2025-08-19T23:15:22.580302+0000', 'last_peered': '2025-08-19T23:15:22.580302+0000', 'last_clean': '2025-08-19T20:03:07.269553+0000', 'last_became_active': '2025-08-19T20:03:08.632302+0000', 'last_became_peered': '2025-08-19T20:03:08.632302+0000', 'last_unstale': '2025-08-19T23:15:22.580302+0000', 'last_undegraded': '2025-08-19T23:15:22.580302+0000', 'last_fullsized': '2025-08-19T20:03:08.631561+0000', 'mapping_epoch': 42, 'log_start': "0'0", 'ondisk_log_start': "0'0", 'created': 18, 'last_epoch_clean': 40, 'parent': '0.0', 'parent_split_bits': 0, 'last_scrub': "0'0", 'last_scrub_stamp': '2025-08-19T20:01:47.487793+0000', 'last_deep_scrub': "0'0", 'last_deep_scrub_stamp': '2025-08-19T20:01:47.487793+0000', 'last_clean_scrub_stamp': '2025-08-19T20:01:47.487793+0000', 'objects_scrubbed': 0, 'log_size': 0, 'log_dups_size': 0, 'ondisk_log_size': 0, 'stats_invalid': False, 'dirty_stats_invalid': False, 'omap_stats_invalid': False, 'hitset_stats_invalid': False, 'hitset_bytes_stats_invalid': False, 'pin_stats_invalid': False, 'manifest_stats_invalid': False, 'snaptrimq_len': 0, 'last_scrub_duration': 0, 'scrub_schedule': 'periodic scrub scheduled @ 2025-08-20T21:00:15.382709+0000', 'scrub_duration': 0, 'objects_trimmed': 0, 'snaptrim_duration': 0.000557799, 'stat_sum': {'num_bytes': 0, 'num_objects': 0, 'num_object_clones': 0, 'num_object_copies': 0, 'num_objects_missing_on_primary': 0, 'num_objects_missing': 0, 'num_objects_degraded': 0, 'num_objects_misplaced': 0, 'num_objects_unfound': 0, 'num_objects_dirty': 0, 'num_whiteouts': 0, 'num_read': 0, 'num_read_kb': 0, 'num_write': 0, 'num_write_kb': 0, 'num_scrub_errors': 0, 'num_shallow_scrub_errors': 0, 'num_deep_scrub_errors': 0, 
'num_objects_recovered': 0, 'num_bytes_recovered': 0, 'num_keys_recovered': 0, 'num_objects_omap': 0, 'num_objects_hit_set_archive': 0, 'num_bytes_hit_set_archive': 0, 'num_flush': 0, 'num_flush_kb': 0, 'num_evict': 0, 'num_evict_kb': 0, 'num_promote': 0, 'num_flush_mode_high': 0, 'num_flush_mode_low': 0, 'num_evict_mode_some': 0, 'num_evict_mode_full': 0, 'num_objects_pinned': 0, 'num_legacy_snapsets': 0, 'num_large_omap_objects': 0, 'num_objects_manifest': 0, 'num_omap_bytes': 0, 'num_omap_keys': 0, 'num_objects_repaired': 0}, 'up': [9], 'acting': [9], 'avail_no_missing': ['9'], 'object_location_counts': [], 'blocked_by': [], 'up_primary': 9, 'acting_primary': 9, 'purged_snaps': []}
2025-08-19T23:17:53.101 INFO:tasks.ceph.ceph_manager.ceph:no progress seen, keeping timeout for now
2025-08-19T23:17:57.149 INFO:tasks.ceph.ceph_manager.ceph:PG 1.5 is not active+clean
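During teardown, ceph_manager dumps the full pg_stat of any PG whose state is not exactly `active+clean`; here PG 1.5 is stuck `active+undersized` with `up: [9]` and `acting: [9]`, i.e. only one replica because the down OSD never rejoined. The actual check lives in Python in qa/tasks/ceph_manager.py; the shell sketch below, with a hypothetical `all_pgs_active_clean` helper, just illustrates the predicate being waited on:

```shell
# Sketch only: succeed iff every PG state in the input is active+clean,
# reporting the first offender (as ceph_manager logs "PG 1.5 is not
# active+clean" above).
all_pgs_active_clean() {
    for state in $1; do     # $1: whitespace-separated PG states
        if [ "$state" != "active+clean" ]; then
            echo "PG not active+clean: $state"
            return 1
        fi
    done
    return 0
}
```

In this run the predicate never becomes true, so the manager keeps logging "no progress seen, keeping timeout for now" until the job fails.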

/a/yuriw-2025-08-19_14:49:40-rados-wip-yuri-testing-2025-08-18-1127-distro-default-smithi/8451410

https://pulpito.ceph.com/yuriw-2025-08-19_14:49:40-rados-wip-yuri-testing-2025-08-18-1127-distro-default-smithi/8451410/

Found on main

rados/singleton-nomsgr suite


Related issues 1 (1 open, 0 closed)

Is duplicate of RADOS - Bug #71931: rados/test_health_warnings.sh: OSD got stuck down and never came back up (Pending Backport) - Naveen Naidu

Actions #1

Updated by Nitzan Mordechai 7 months ago

  • Is duplicate of Bug #71931: rados/test_health_warnings.sh: OSD got stuck down and never came back up added
Actions #2

Updated by Nitzan Mordechai 7 months ago

  • Status changed from New to Duplicate