Bug #70765

closed

Single PG is not active+clean (backfill hang)

Added by Matan Breizman 12 months ago. Updated 12 months ago.

Status: Duplicate
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Tags (freeform):
Merge Commit:
Fixed In:
Released In:
Upkeep Timestamp:

Description

https://pulpito.ceph.com/sjust-2025-04-01_07:07:33-crimson-rados-wip-sjust-crimson-testing-2025-03-31-1743478092-distro-default-gibba/8219244

Found when testing: https://github.com/ceph/ceph/pull/62288

pgid 6.5 is stuck in backfilling state:

HEALTH_WARN","summary":{"message":"256 slow ops, oldest one blocked for 2460 sec, osd.2 has slow ops","count":1} 

2025-04-01T08:41:32.712 INFO:teuthology.orchestra.run.gibba029.stdout:{"fsid":"531c7267-5cab-4a52-b3c4-42ccffe77b46","health":{"status":"HEALTH_WARN","checks":{"PG_DEGRADED":{"severity":"HEALTH_WARN","summary":{"message":"Degraded data redundancy: 15/933 objects degraded (1.608%), 1 pg degraded, 1 pg undersized","count":2},"muted":false},"SLOW_OPS":{"severity":"HEALTH_WARN","summary":{"message":"256 slow ops, oldest one blocked for 2460 sec, osd.2 has slow ops","count":1},"muted":false}},"mutes":[]},"election_epoch":3,"quorum":[0],"quorum_names":["a"],"quorum_age":2870,"monmap":{"epoch":1,"min_mon_release_name":"tentacle","num_mons":1},"osdmap":{"epoch":140,"num_osds":4,"num_up_osds":4,"osd_up_since":1743494475,"num_in_osds":4,"osd_in_since":1743494024,"num_remapped_pgs":1},"pgmap":{"pgs_by_state":[{"state_name":"active+clean","count":24},{"state_name":"active+undersized+degraded+remapped+backfilling","count":1}],"num_pgs":25,"num_pools":3,"num_objects":311,"data_bytes":20685363,"bytes_used":157880320,"bytes_avail":386389176320,"bytes_total":386547056640,"degraded_objects":15,"degraded_total":933,"degraded_ratio":0.01607717041800643},"fsmap":{"epoch":1,"btime":"2025-04-01T07:53:42:173127+0000","by_rank":[],"up:standby":0},"mgrmap":{"available":true,"num_standbys":0,"modules":["iostat","nfs"],"services":{}},"servicemap":{"epoch":5,"modified":"2025-04-01T08:39:48.449259+0000","services":{}},"progress_events":{}}
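
For anyone triaging similar runs, the stuck PG can be picked out of the status JSON above by filtering `pgmap.pgs_by_state`. This is only a minimal sketch: the helper name `non_clean_pg_states` is made up for illustration and is not part of teuthology or the Ceph QA tasks, and the sample input is a trimmed copy of the `pgmap` section from the output above.

```python
import json


def non_clean_pg_states(status_json: str) -> dict:
    """Return {state_name: count} for every PG state other than active+clean.

    Hypothetical helper for illustration; expects the JSON layout seen in
    the status output above (pgmap -> pgs_by_state).
    """
    status = json.loads(status_json)
    states = status["pgmap"]["pgs_by_state"]
    return {
        entry["state_name"]: entry["count"]
        for entry in states
        if entry["state_name"] != "active+clean"
    }


# Trimmed copy of the pgmap section from the status output above.
sample = json.dumps({
    "pgmap": {
        "pgs_by_state": [
            {"state_name": "active+clean", "count": 24},
            {"state_name": "active+undersized+degraded+remapped+backfilling",
             "count": 1},
        ],
        "num_pgs": 25,
    }
})

stuck = non_clean_pg_states(sample)
print(stuck)  # {'active+undersized+degraded+remapped+backfilling': 1}
```

With the data from this run, the only entry left is the single backfilling PG, which matches the 24 + 1 = 25 PGs reported.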

2025-04-01T08:41:33.267 INFO:tasks.ceph.ceph_manager.ceph:PG 6.5 is not active+clean
2025-04-01T08:41:33.267 INFO:tasks.ceph.ceph_manager.ceph:{'pgid': '6.5', 'version': "119'117", 'reported_seq': 185, 'reported_epoch': 140, 'state': 'active+undersized+degraded+remapped+backfilling', 'last_fresh': '2025-04-01T08:01:27.248878+0000', 'last_change': '2025-04-01T08:00:34.593337+0000', 'last_active': '2025-04-01T08:01:27.248878+0000', 'last_peered': '2025-04-01T08:01:27.248878+0000', 'last_clean': '2025-04-01T08:00:25.231745+0000', 'last_became_active': '2025-04-01T08:00:32.305057+0000', 'last_became_peered': '2025-04-01T08:00:32.305057+0000', 'last_unstale': '2025-04-01T08:01:27.248878+0000', 'last_undegraded': '2025-04-01T08:00:32.834211+0000', 'last_fullsized': '2025-04-01T08:00:32.236978+0000', 'mapping_epoch': 121, 'log_start': "119'73", 'ondisk_log_start': "119'73", 'created': 116, 'last_epoch_clean': 0, 'parent': '0.0', 'parent_split_bits': 0, 'last_scrub': "0'0", 'last_scrub_stamp': '2025-04-01T08:00:25.231745+0000', 'last_deep_scrub': "0'0", 'last_deep_scrub_stamp': '2025-04-01T08:00:25.231745+0000', 'last_clean_scrub_stamp': '2025-04-01T08:00:25.231745+0000', 'objects_scrubbed': 0, 'log_size': 44, 'log_dups_size': 73, 'ondisk_log_size': 44, 'stats_invalid': False, 'dirty_stats_invalid': False, 'omap_stats_invalid': False, 'hitset_stats_invalid': False, 'hitset_bytes_stats_invalid': False, 'pin_stats_invalid': False, 'manifest_stats_invalid': False, 'snaptrimq_len': 0, 'last_scrub_duration': 0, 'scrub_schedule': '--', 'scrub_duration': 0, 'objects_trimmed': 0, 'snaptrim_duration': 0, 'stat_sum': {'num_bytes': 958464, 'num_objects': 15, 'num_object_clones': 0, 'num_object_copies': 45, 'num_objects_missing_on_primary': 0, 'num_objects_missing': 0, 'num_objects_degraded': 15, 'num_objects_misplaced': 0, 'num_objects_unfound': 0, 'num_objects_dirty': 0, 'num_whiteouts': 0, 'num_read': 0, 'num_read_kb': 0, 'num_write': 117, 'num_write_kb': 936, 'num_scrub_errors': 0, 'num_shallow_scrub_errors': 0, 'num_deep_scrub_errors': 0, 'num_objects_recovered': 3, 'num_bytes_recovered': 131072, 'num_keys_recovered': 0, 'num_objects_omap': 0, 'num_objects_hit_set_archive': 0, 'num_bytes_hit_set_archive': 0, 'num_flush': 0, 'num_flush_kb': 0, 'num_evict': 0, 'num_evict_kb': 0, 'num_promote': 0, 'num_flush_mode_high': 0, 'num_flush_mode_low': 0, 'num_evict_mode_some': 0, 'num_evict_mode_full': 0, 'num_objects_pinned': 0, 'num_legacy_snapsets': 0, 'num_large_omap_objects': 0, 'num_objects_manifest': 0, 'num_omap_bytes': 0, 'num_omap_keys': 0, 'num_objects_repaired': 0}, 'up': [2, 3, 0], 'acting': [2, 3], 'avail_no_missing': ['2', '3'], 'object_location_counts': [{'shards': '2,3', 'objects': 15}], 'blocked_by': [], 'up_primary': 2, 'acting_primary': 2, 'purged_snaps': []}
2025-04-01T08:41:33.268 INFO:tasks.thrashosds.thrasher:Traceback (most recent call last):
  File "/home/teuthworker/src/github.com_ceph_ceph-c_285a99e52975a4e4219727322052d6b5714e03b7/qa/tasks/ceph_manager.py", line 192, in wrapper
    return func(self)
  File "/home/teuthworker/src/github.com_ceph_ceph-c_285a99e52975a4e4219727322052d6b5714e03b7/qa/tasks/ceph_manager.py", line 1488, in _do_thrash
    self.ceph_manager.wait_for_recovery(
  File "/home/teuthworker/src/github.com_ceph_ceph-c_285a99e52975a4e4219727322052d6b5714e03b7/qa/tasks/ceph_manager.py", line 3015, in wait_for_recovery
    assert now - start < timeout, \
AssertionError: wait_for_recovery: failed before timeout expired
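
Two details in the PG 6.5 dump above are worth reading off explicitly: which OSD is in `up` but not in `acting` (most likely the backfill target), and how long the PG state had been unchanged when the thrasher gave up. The sketch below works only on values copied from the log. The helper names are hypothetical, and it assumes the teuthology log timestamp is UTC like the `+0000` stamps in the dump.

```python
from datetime import datetime, timezone

# Values copied from the PG 6.5 dump and the teuthology log line above.
pg = {
    "pgid": "6.5",
    "state": "active+undersized+degraded+remapped+backfilling",
    "last_change": "2025-04-01T08:00:34.593337+0000",
    "up": [2, 3, 0],
    "acting": [2, 3],
}
log_time = "2025-04-01T08:41:33.267"  # assumption: teuthology logs in UTC


def backfill_targets(pg_stat):
    """OSDs present in 'up' but absent from 'acting' (hypothetical helper)."""
    acting = set(pg_stat["acting"])
    return [osd for osd in pg_stat["up"] if osd not in acting]


def seconds_since_last_change(pg_stat, observed_at):
    """Seconds between the PG's last_change stamp and the observation time."""
    last_change = datetime.strptime(
        pg_stat["last_change"], "%Y-%m-%dT%H:%M:%S.%f%z"
    )
    observed = datetime.strptime(
        observed_at, "%Y-%m-%dT%H:%M:%S.%f"
    ).replace(tzinfo=timezone.utc)
    return (observed - last_change).total_seconds()


print(backfill_targets(pg))                          # [0]
print(int(seconds_since_last_change(pg, log_time)))  # 2458
```

Under those assumptions, osd.0 is the OSD being backfilled, and the PG state had not changed for about 41 minutes, which is in the same range as the "oldest one blocked for 2460 sec" slow-ops warning on osd.2.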

Related issues 1 (0 open, 1 closed)

Is duplicate of crimson - Bug #70502: radosbench-high-concurrency: [Backfill] Single PG is stuck in waiting - Closed - MOHIT AGRAWAL

#1

Updated by Matan Breizman 12 months ago

  • Is duplicate of Bug #70502: radosbench-high-concurrency: [Backfill] Single PG is stuck in waiting added
#2

Updated by Matan Breizman 12 months ago

  • Status changed from New to Duplicate