Bug #63089

closed

qa: tasks/mirror times out

Added by Venky Shankar over 2 years ago. Updated 5 months ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: Administration/Usability
Target version:
% Done: 0%
Source: Q/A
Backport: quincy,reef,squid
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): cephfs-mirror
Labels (FS):
Pull request ID:
Tags (freeform): backport_processed
Fixed In: v19.3.0-3451-g62eb72731a
Released In: v20.2.0~2488
Upkeep Timestamp: 2025-11-01T01:27:12+00:00

Description

/a/vshankar-2023-09-28_07:23:59-fs-wip-vshankar-testing-20230926.081818-testing-default-smithi/7405363

2023-09-28T11:15:33.524 DEBUG:teuthology.orchestra.run.smithi105:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph fs mirror enable cephfs
2023-09-28T11:15:33.549 INFO:tasks.ceph.mgr.x.smithi105.stderr:2023-09-28T11:15:33.549+0000 7f1d69c56040 -1 mgr[py] Module zabbix has missing NOTIFY_TYPES member
2023-09-28T11:15:33.604 INFO:tasks.ceph.mgr.x.smithi105.stderr:2023-09-28T11:15:33.605+0000 7f1d69c56040 -1 mgr[py] Module balancer has missing NOTIFY_TYPES member
2023-09-28T11:15:33.657 INFO:tasks.ceph.mgr.x.smithi105.stderr:2023-09-28T11:15:33.657+0000 7f1d69c56040 -1 mgr[py] Module influx has missing NOTIFY_TYPES member
2023-09-28T11:15:33.721 INFO:tasks.ceph.mgr.x.smithi105.stderr:2023-09-28T11:15:33.721+0000 7f1d69c56040 -1 mgr[py] Module alerts has missing NOTIFY_TYPES member
2023-09-28T11:15:33.794 INFO:tasks.ceph.mgr.x.smithi105.stderr:2023-09-28T11:15:33.794+0000 7f1d69c56040 -1 mgr[py] Module iostat has missing NOTIFY_TYPES member
2023-09-28T11:15:33.935 INFO:tasks.ceph.mgr.x.smithi105.stderr:2023-09-28T11:15:33.935+0000 7f1d69c56040 -1 mgr[py] Module rgw has missing NOTIFY_TYPES member
2023-09-28T11:15:34.002 INFO:tasks.ceph.mgr.x.smithi105.stderr:2023-09-28T11:15:34.002+0000 7f1d69c56040 -1 mgr[py] Module rbd_support has missing NOTIFY_TYPES member
2023-09-28T11:15:34.056 INFO:tasks.ceph.mgr.x.smithi105.stderr:2023-09-28T11:15:34.056+0000 7f1d69c56040 -1 mgr[py] Module progress has missing NOTIFY_TYPES member
2023-09-28T11:15:34.118 INFO:tasks.ceph.mgr.x.smithi105.stderr:2023-09-28T11:15:34.118+0000 7f1d69c56040 -1 mgr[py] Module pg_autoscaler has missing NOTIFY_TYPES member
2023-09-28T11:15:34.172 INFO:tasks.ceph.mgr.x.smithi105.stderr:2023-09-28T11:15:34.172+0000 7f1d69c56040 -1 mgr[py] Module devicehealth has missing NOTIFY_TYPES member
2023-09-28T11:15:34.534 INFO:teuthology.orchestra.run:Running command with timeout 30
2023-09-28T11:15:34.534 DEBUG:teuthology.orchestra.run.smithi105:mirror status for fs: cephfs> ceph --admin-daemon /var/run/ceph/cephfs-mirror.asok fs mirror status cephfs@56
2023-09-28T11:15:34.572 INFO:tasks.ceph.mgr.x.smithi105.stderr:2023-09-28T11:15:34.572+0000 7f1d69c56040 -1 mgr[py] Module rook has missing NOTIFY_TYPES member
2023-09-28T11:15:34.726 INFO:teuthology.orchestra.run.smithi105.stderr:no valid command found; 1 closest matches:
2023-09-28T11:15:34.726 INFO:teuthology.orchestra.run.smithi105.stderr:fs mirror status cephfs@54
2023-09-28T11:15:34.726 INFO:teuthology.orchestra.run.smithi105.stderr:admin_socket: invalid command
2023-09-28T11:15:34.729 DEBUG:teuthology.orchestra.run:got remote process result: 22
2023-09-28T11:15:34.730 WARNING:tasks.cephfs.test_mirroring:mirror daemon command with label "mirror status for fs: cephfs" failed: Command failed (mirror status for fs: cephfs) on smithi105 with status 22: 'ceph --admin-daemon /var/run/ceph/cephfs-mirror.asok fs mirror status cephfs@56'
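
In the log above, the test asks the admin socket for `fs mirror status cephfs@56`, while the only command the daemon lists as a closest match is `fs mirror status cephfs@54`. A minimal sketch of how that mismatch could be pulled out of the captured stderr when triaging similar runs follows. The helper names (`registered_fscids`, `describe_mismatch`) are hypothetical and are not part of teuthology or the Ceph QA suite; the parsing assumes the stderr layout shown in this log.

```python
import re

# Hypothetical triage helpers; they only parse text shaped like the log above.
_CMD_RE = re.compile(r"fs mirror status (?P<name>[^@\s]+)@(?P<fscid>\d+)")


def registered_fscids(stderr_text):
    """Return (fs_name, fscid) pairs listed after the 'closest matches:' line."""
    pairs = []
    in_matches = False
    for line in stderr_text.splitlines():
        if "closest matches:" in line:
            in_matches = True
            continue
        if in_matches:
            m = _CMD_RE.search(line)
            if m:
                pairs.append((m.group("name"), int(m.group("fscid"))))
    return pairs


def describe_mismatch(requested_cmd, stderr_text):
    """Return a dict describing the mismatch, or None if the request matches."""
    m = _CMD_RE.search(requested_cmd)
    if m is None:
        return None
    wanted = (m.group("name"), int(m.group("fscid")))
    known = registered_fscids(stderr_text)
    if wanted in known:
        return None
    return {"requested": wanted, "registered": known}


if __name__ == "__main__":
    stderr = (
        "no valid command found; 1 closest matches:\n"
        "fs mirror status cephfs@54\n"
        "admin_socket: invalid command\n"
    )
    print(describe_mismatch("fs mirror status cephfs@56", stderr))
```

For the stderr in this report, the helper reports the requested pair `("cephfs", 56)` against a registered pair `("cephfs", 54)`, which suggests the daemon was still serving the status command for an older filesystem ID.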

Related issues 4 (0 open, 4 closed)

Copied to CephFS - Backport #66969: squid: qa: tasks/mirror times out (Resolved, Jos Collin)
Copied to CephFS - Backport #66970: reef: qa: tasks/mirror times out (Resolved, Jos Collin)
Copied to CephFS - Backport #66971: quincy: qa: tasks/mirror times out (Resolved, Jos Collin)
Copied to CephFS - Backport #67145: quincy: qa: tasks/mirror times out (Duplicate, Jos Collin)
Actions #1

Updated by Venky Shankar over 2 years ago

  • Priority changed from Normal to Urgent
Actions #2

Updated by Venky Shankar over 2 years ago

Another instance, this time from the reef branch: vshankar-2023-09-27_10:23:33-fs-wip-vshankar-testing-reef-20230927.021134-testing-default-smithi/7402858

From logs:

2023-09-27T13:52:11.070+0000 d0c7640 20 cephfs::mirror::Mirror schedule_mirror_update_task: scheduling fs mirror update (0x7083620) after 2 seconds
2023-09-27T13:52:11.071+0000 c8c6640 20 cephfs::mirror::FSMirror ~FSMirror
2023-09-27T13:52:11.071+0000 c8c6640 10 cephfs::mirror::Mirror enable_mirroring: starting FSMirror: filesystem={fscid=52, fs_name=cephfs}
2023-09-27T13:52:11.071+0000 c8c6640 10 cephfs::mirror::ServiceDaemon: 0x8fdf7e0 add_or_update_fs_attribute: fscid=52
2023-09-27T13:52:11.071+0000 c8c6640 10 cephfs::mirror::ServiceDaemon: 0x8fdf7e0 schedule_update_status
2023-09-27T13:52:11.071+0000 c8c6640 20 cephfs::mirror::FSMirror init
2023-09-27T13:52:11.071+0000 c8c6640 20 cephfs::mirror::Utils connect: connecting to cluster=ceph, client=client.mirror, mon_host=
2023-09-27T13:52:11.465+0000 c8c6640 10 cephfs::mirror::Utils connect: using mon addr=172.21.15.17
2023-09-27T13:52:12.071+0000 110cf640 20 cephfs::mirror::ServiceDaemon: 0x8fdf7e0 update_status: 1 filesystem(s)
2023-09-27T13:52:13.070+0000 d0c7640 20 cephfs::mirror::Mirror update_fs_mirrors
2023-09-27T13:52:22.110+0000 c8c6640 10 cephfs::mirror::Utils connect: connected to cluster=ceph using client=client.mirror
2023-09-27T13:52:22.169+0000 c8c6640 20 cephfs::mirror::Utils mount: filesystem={fscid=52, fs_name=cephfs}
2023-09-27T13:52:22.609+0000 c8c6640 10 cephfs::mirror::Utils mount: mounted filesystem={fscid=52, fs_name=cephfs}
2023-09-27T13:52:22.609+0000 c8c6640 10 cephfs::mirror::FSMirror init: rados addrs=172.21.15.17:0/3359552797
2023-09-27T13:52:22.609+0000 c8c6640 20 cephfs::mirror::FSMirror init_instance_watcher
2023-09-27T13:52:22.609+0000 c8c6640 20 cephfs::mirror::InstanceWatcher init
2023-09-27T13:52:22.609+0000 c8c6640 20 cephfs::mirror::InstanceWatcher create_instance

The daemon never returned from creating an instance object. Another observation is that the failures occur with valgrind.
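
One way to spot a call that never returns in a log like the one above is to record the last message emitted by each thread. The sketch below assumes the line layout seen in this excerpt (timestamp, thread ID, debug level, message) and that lines arrive in chronological order; `parse_line` and `last_message_per_thread` are hypothetical names, not existing Ceph tooling.

```python
from datetime import datetime


def parse_line(line):
    """Split a log line into (timestamp, thread, level, message), or None."""
    parts = line.strip().split(" ", 3)
    if len(parts) != 4:
        return None
    stamp, thread, level, message = parts
    try:
        when = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%f%z")
        level = int(level)
    except ValueError:
        # Not a log line in the assumed layout; skip it.
        return None
    return when, thread, level, message


def last_message_per_thread(lines):
    """Map each thread ID to the (timestamp, message) of its final log line."""
    last = {}
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        when, thread, _level, message = parsed
        last[thread] = (when, message)
    return last


if __name__ == "__main__":
    sample = [
        "2023-09-27T13:52:13.070+0000 d0c7640 20 cephfs::mirror::Mirror update_fs_mirrors",
        "2023-09-27T13:52:22.609+0000 c8c6640 20 cephfs::mirror::InstanceWatcher init",
        "2023-09-27T13:52:22.609+0000 c8c6640 20 cephfs::mirror::InstanceWatcher create_instance",
    ]
    for thread, (when, message) in last_message_per_thread(sample).items():
        print(thread, when.isoformat(), message)
```

Applied to the excerpt in this note, thread `c8c6640` ends on `cephfs::mirror::InstanceWatcher create_instance`, which matches the observation that the daemon never returned from creating the instance object.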

Actions #3

Updated by Patrick Donnelly almost 2 years ago

  • Target version changed from v19.0.0 to v20.0.0
Actions #4

Updated by Jos Collin over 1 year ago

  • Status changed from New to In Progress
Actions #5

Updated by Jos Collin over 1 year ago

Initially, I was looking for a valgrind failure. But as I checked deeper, this failure happens particularly for the tests 'test_mirroring_init_failure_with_recovery' and 'test_mirroring_init_failure' because the mirror daemon failed to restart after the tests marked the mirror daemon blocklisted/failed. The fix is already provided by https://github.com/ceph/ceph/pull/56193. I'm confirming this and will update the tracker soon.

Actions #6

Updated by Venky Shankar over 1 year ago

Jos Collin wrote in #note-5:

Initially, I was looking for a valgrind failure. But as I checked deeper, this failure happens particularly for the tests 'test_mirroring_init_failure_with_recovery' and 'test_mirroring_init_failure' because the mirror daemon failed to restart after the tests marked the mirror daemon blocklisted/failed. The fix is already provided by https://github.com/ceph/ceph/pull/56193. I'm confirming this and will update the tracker soon.

Thanks for checking. That PR is likely to get merged soon (hopefully today). Will notify when merged.

Actions #7

Updated by Jos Collin over 1 year ago

  • Status changed from In Progress to Fix Under Review
  • Pull request ID set to 56193
Actions #8

Updated by Venky Shankar over 1 year ago

  • Status changed from Fix Under Review to Pending Backport
  • Source set to Q/A
  • Backport changed from reef,quincy to quincy,reef,squid
Actions #9

Updated by Jos Collin over 1 year ago

Actions #10

Updated by Jos Collin over 1 year ago

Actions #11

Updated by Jos Collin over 1 year ago

Actions #12

Updated by Upkeep Bot over 1 year ago

Actions #13

Updated by Upkeep Bot over 1 year ago

  • Tags (freeform) set to backport_processed
Actions #14

Updated by Jos Collin over 1 year ago

  • Status changed from Pending Backport to Resolved
Actions #15

Updated by Upkeep Bot 9 months ago

  • Merge Commit set to 62eb72731aca5d403ed6239946c6ea66f3be36e7
  • Fixed In set to v19.3.0-3451-g62eb72731ac
  • Upkeep Timestamp set to 2025-07-02T03:46:21+00:00
Actions #16

Updated by Upkeep Bot 8 months ago

  • Fixed In changed from v19.3.0-3451-g62eb72731ac to v19.3.0-3451-g62eb72731a
  • Upkeep Timestamp changed from 2025-07-02T03:46:21+00:00 to 2025-07-14T16:45:48+00:00
Actions #17

Updated by Upkeep Bot 5 months ago

  • Released In set to v20.2.0~2488
  • Upkeep Timestamp changed from 2025-07-14T16:45:48+00:00 to 2025-11-01T01:27:12+00:00