
rbd-mirror: prevent image deletion if remote image is not primary #63270

Merged
idryomov merged 1 commit into ceph:main from VinayBhaskar-V:wip-resync-changes
Jul 29, 2025

Conversation

@VinayBhaskar-V
Contributor
@VinayBhaskar-V VinayBhaskar-V commented May 13, 2025

A resync on a mirrored image may incorrectly result in the local image being deleted even when the remote image is no longer primary.
This issue can occur under the following conditions:

  • if resync is requested on the secondary before the remote image has
    been fully demoted
  • if the demotion of the primary image is not mirrored
    due to the rbd-mirror daemon being offline.

This can be fixed by ensuring that image deletion during a resync is only allowed when the remote image is confirmed to be primary.

This PR fixes the issue only for snapshot-based mirroring.
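The fix described above amounts to a guard on the deletion path; a minimal sketch, with illustrative names rather than the actual Ceph rbd-mirror interfaces:

```cpp
#include <cassert>

// Hypothetical model of the resync decision: the local image may only be
// deleted (to be re-bootstrapped from the remote) when the remote image is
// confirmed to be primary. Otherwise -- e.g. the remote was demoted, or its
// demotion was never mirrored because the rbd-mirror daemon was offline --
// deleting the local copy could destroy the only up-to-date image.
enum class ResyncAction { kDeleteLocalAndResync, kRefuseAndReport };

ResyncAction on_resync_requested(bool remote_is_primary) {
  return remote_is_primary ? ResyncAction::kDeleteLocalAndResync
                           : ResyncAction::kRefuseAndReport;
}
```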

Fixes: https://tracker.ceph.com/issues/70948

Contribution Guidelines

  • To sign and title your commits, please refer to Submitting Patches to Ceph.

  • If you are submitting a fix for a stable branch (e.g. "quincy"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.

  • When filling out the checklist below, you may click the boxes directly in the GitHub web UI. When entering or editing the entire PR message in the GitHub web UI editor, you may also select a checklist item by adding an x between the brackets: [x]. Spaces and capitalization matter when checking off items this way.

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)

@VinayBhaskar-V VinayBhaskar-V requested a review from a team as a code owner May 13, 2025 20:35
@VinayBhaskar-V VinayBhaskar-V force-pushed the wip-resync-changes branch 2 times, most recently from 4cc46e3 to 358ebc1 Compare May 30, 2025 09:25
@VinayBhaskar-V
Contributor Author

jenkins test make check

Contributor

@idryomov idryomov left a comment


It would be nice to add corresponding tests where is_remote_primary() would return false. For example, ResyncRequestedRemoteNotPrimary to src/test/rbd_mirror/image_replayer/journal/test_mock_PrepareReplayRequest.cc, InitResyncRequestedRemoteNotPrimary to src/test/rbd_mirror/image_replayer/journal/test_mock_Replayer.cc, etc. Place them after *ResyncRequested tests.
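A much-simplified sketch of what such a test would assert, using plain asserts instead of the gmock fixtures in those files (names are hypothetical; the "remote image is not primary" description matches the one checked by the shell test):

```cpp
#include <cassert>
#include <string>

// Simplified model of the replay-preparation decision under test: when a
// resync is requested but the remote image is not primary, the local image
// must not be deleted and the replayer should report why. The real tests
// mock PrepareReplayRequest/Replayer rather than calling a helper like this.
struct PrepareResult {
  bool delete_local_image;   // may the local image be removed for resync?
  std::string description;   // status description surfaced to the user
};

PrepareResult prepare_replay(bool resync_requested, bool remote_is_primary) {
  if (resync_requested) {
    if (!remote_is_primary) {
      return {false, "remote image is not primary"};
    }
    return {true, ""};  // safe to delete and re-bootstrap from the remote
  }
  return {false, ""};   // normal replay, nothing to delete
}
```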

@github-actions

github-actions bot commented Jun 4, 2025

Config Diff Tool Output

- removed: rgw_bucket_eexist_override
- removed: breakpad
! changed: rgw_sts_key: old: Key used for encrypting/ decrypting role session tokens. This key must consist of 16 hexadecimal characters, which can be generated by the command 'openssl rand -hex 16'. All radosgw instances in a zone should use the same key. In multisite configurations, all zones in a realm should use the same key.
! changed: rgw_sts_key: new: Key used for encrypting/ decrypting session token.
! changed: osd_scrub_min_interval: old: The desired interval between scrubs of a specific PG. Note that this option must be set at ``global`` scope, or for both ``mgr`` and``osd``.
! changed: osd_scrub_min_interval: new: The desired interval between scrubs of a specific PG.
! changed: osd_deep_scrub_interval_cv: old: The coefficient of variation for the deep scrub interval, specified as a ratio. On average, the next deep scrub for a PG is scheduled osd_deep_scrub_interval after the last deep scrub . The actual time is randomized to a normal distribution with a standard deviation of osd_deep_scrub_interval * osd_deep_scrub_interval_cv (clamped to within 2 standard deviations). The default value guarantees that 95% of deep scrubs will be scheduled in the range [0.8 * osd_deep_scrub_interval, 1.2 * osd_deep_scrub_interval].
! changed: osd_deep_scrub_interval_cv: new: The coefficient of variation for the deep scrub interval, specified as a ratio. On average, the next deep scrub for a PG is scheduled osd_deep_scrub_interval after the last deep scrub . The actual time is randomized to a normal distribution with a standard deviation of osd_deep_scrub_interval * osd_deep_scrub_interval_cv (clamped to within 2 standard deviations). The default value guarantees that 95% of the deep scrubs will be scheduled in the range [0.8 * osd_deep_scrub_interval, 1.2 * osd_deep_scrub_interval].
! changed: osd_deep_scrub_interval_cv: old: deep scrub intervals are varied by a random amount to prevent stampedes. This parameter determines the amount of variation. Technically ``osd_deep_scrub_interval_cv`` is the coefficient of variation for the deep scrub interval.
! changed: osd_deep_scrub_interval_cv: new: deep scrub intervals are varied by a random amount to prevent stampedes. This parameter determines the amount of variation. Technically - osd_deep_scrub_interval_cv is the coefficient of variation for the deep scrub interval.
! changed: osd_deep_scrub_interval: old: Deep scrub each PG (i.e., verify data checksums) at least this often. Note that this option must be set at ``global`` scope, or for both ``mgr`` and``osd``.
! changed: osd_deep_scrub_interval: new: Deep scrub each PG (i.e., verify data checksums) at least this often
! changed: osd_scrub_max_interval: old: Scrub each PG no less often than this interval. Note that this option must be set at ``global`` scope, or for both ``mgr`` and``osd``.
! changed: osd_scrub_max_interval: new: Scrub each PG no less often than this interval

The above configuration changes are found in the PR. Please update the relevant release documentation if necessary.

wait_for_status_in_pool_dir ${CLUSTER1} ${POOL} ${test_resync_image} 'up+unknown' 'remote image is not primary'
promote_image ${CLUSTER1} ${POOL} ${test_resync_image}
compare_images ${CLUSTER1} ${CLUSTER2} ${POOL} ${POOL} ${test_resync_image}
remove_image_retry ${CLUSTER1} ${POOL} ${test_resync_image}

This can be fixed by ensuring that image deletion during a resync is only allowed when the remote image is confirmed to be primary.

If the primary image is deleted during a resync, what guarantees do we have that the secondary gets cleaned up properly and doesn't remain orphaned?

Does this logic also work against deleting a parent image of a clone during resync (especially with deep cloning support)?

If a primary image is deleted without any resync operation involved, does it automatically trigger deletion of the corresponding secondary image as well? Make sure this doesn't affect the existing workflow.

Contributor

@idryomov idryomov left a comment


Otherwise LGTM!

@idryomov
Contributor

idryomov commented Jun 9, 2025

We appear to have an issue with OMAP cleanup on rbd_mirror_leader object. An image map entry remains behind on cluster1:

2025-06-09T17:14:00.058 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stderr:2025-06-09 17:14:00 TEST: check if removed images' OMAP are removed 
...
2025-06-09T17:14:53.846 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stderr:+ for s in 0 1 2 2 4 4 8 8 8 16 16 32
2025-06-09T17:14:53.846 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stderr:+ sleep 16
2025-06-09T17:15:09.848 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stderr:+ set +e
2025-06-09T17:15:09.849 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stderr:++ count_omap_keys_with_filter cluster1 mirror rbd_mirror_leader image_map
2025-06-09T17:15:09.849 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stderr:++ local cluster=cluster1
2025-06-09T17:15:09.849 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stderr:++ local pool=mirror
2025-06-09T17:15:09.849 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stderr:++ local obj_name=rbd_mirror_leader
2025-06-09T17:15:09.850 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stderr:++ local filter=image_map
2025-06-09T17:15:09.850 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stderr:++ list_omap_keys cluster1 mirror rbd_mirror_leader
2025-06-09T17:15:09.850 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stderr:++ local cluster=cluster1
2025-06-09T17:15:09.850 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stderr:++ local pool=mirror
2025-06-09T17:15:09.850 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stderr:++ local obj_name=rbd_mirror_leader
2025-06-09T17:15:09.850 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stderr:++ grep -c image_map
2025-06-09T17:15:09.850 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stderr:++ rados --cluster cluster1 -p mirror listomapkeys rbd_mirror_leader
2025-06-09T17:15:09.899 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stderr:+ test 1 = 0
2025-06-09T17:15:09.900 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stderr:+ error_code=1
2025-06-09T17:15:09.900 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stderr:+ set -e
2025-06-09T17:15:09.900 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stderr:+ '[' 1 -eq 0 ']'
2025-06-09T17:15:09.900 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stderr:+ for s in 0 1 2 2 4 4 8 8 8 16 16 32
2025-06-09T17:15:09.900 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stderr:+ sleep 32
2025-06-09T17:15:41.902 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stderr:+ set +e
2025-06-09T17:15:41.903 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stderr:++ count_omap_keys_with_filter cluster1 mirror rbd_mirror_leader image_map
2025-06-09T17:15:41.904 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stderr:++ local cluster=cluster1
2025-06-09T17:15:41.904 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stderr:++ local pool=mirror
2025-06-09T17:15:41.904 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stderr:++ local obj_name=rbd_mirror_leader
2025-06-09T17:15:41.904 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stderr:++ local filter=image_map
2025-06-09T17:15:41.904 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stderr:++ list_omap_keys cluster1 mirror rbd_mirror_leader
2025-06-09T17:15:41.904 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stderr:++ local cluster=cluster1
2025-06-09T17:15:41.904 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stderr:++ local pool=mirror
2025-06-09T17:15:41.904 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stderr:++ local obj_name=rbd_mirror_leader                                                                                   
2025-06-09T17:15:41.904 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stderr:++ rados --cluster cluster1 -p mirror listomapkeys rbd_mirror_leader
2025-06-09T17:15:41.905 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stderr:++ grep -c image_map
2025-06-09T17:15:41.949 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stderr:+ test 1 = 0
2025-06-09T17:15:41.949 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stderr:+ error_code=1
2025-06-09T17:15:41.949 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stderr:+ set -e
2025-06-09T17:15:41.949 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stderr:+ '[' 1 -eq 0 ']'
2025-06-09T17:15:41.949 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stderr:+ return 1
2025-06-09T17:15:41.949 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stderr:+ cleanup 1
...
2025-06-09T17:15:43.571 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stderr:+ echo 'cluster1 mirror  rbd_mirror_leader omap vals'                                     
2025-06-09T17:15:43.572 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stdout:cluster1 mirror  rbd_mirror_leader omap vals                                                        
2025-06-09T17:15:43.572 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stderr:+ rados --cluster cluster1 -p mirror --namespace '' listomapvals rbd_mirror_leader                                
2025-06-09T17:15:43.608 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stdout:image_map_56d4cc3b-da44-410d-a276-95b3806a190c                                                                        
2025-06-09T17:15:43.608 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stdout:value (36 bytes) :                                                                    
2025-06-09T17:15:43.609 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stdout:00000000  01 01 1e 00 00 00 04 00  00 00 36 30 37 39 2d 16  |..........6079-.|
2025-06-09T17:15:43.609 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stdout:00000010  47 68 c3 29 e3 2e 0a 00  00 00 01 01 04 00 00 00  |Gh.)............|
2025-06-09T17:15:43.609 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stdout:00000020  00 00 00 00                                       |....|
2025-06-09T17:15:43.609 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stdout:00000024   

The image in question is the one created by the new test:

2025-06-09T17:11:22.719 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stderr:2025-06-09 17:11:22 TEST: request image resync when remote is not primary
...
2025-06-09T17:11:23.606 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stderr:+ rbd --cluster=cluster2 info mirror/test_resync_image
2025-06-09T17:11:23.654 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stdout:rbd image 'test_resync_image':
2025-06-09T17:11:23.655 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stdout:    size 128 MiB in 32 objects
2025-06-09T17:11:23.655 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stdout:    order 22 (4 MiB objects)
2025-06-09T17:11:23.655 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stdout:    snapshot_count: 1
2025-06-09T17:11:23.655 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stdout:    id: 16eb5a544837
2025-06-09T17:11:23.655 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stdout:    block_name_prefix: rbd_data.16eb5a544837
2025-06-09T17:11:23.656 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stdout:    format: 2
2025-06-09T17:11:23.656 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stdout:    features: layering, exclusive-lock
2025-06-09T17:11:23.656 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stdout:    op_features:
2025-06-09T17:11:23.656 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stdout:    flags:
2025-06-09T17:11:23.656 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stdout:    create_timestamp: Mon Jun  9 17:11:22 2025
2025-06-09T17:11:23.657 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stdout:    access_timestamp: Mon Jun  9 17:11:22 2025
2025-06-09T17:11:23.657 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stdout:    modify_timestamp: Mon Jun  9 17:11:22 2025
2025-06-09T17:11:23.657 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stdout:    mirroring state: enabled
2025-06-09T17:11:23.657 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stdout:    mirroring mode: snapshot
2025-06-09T17:11:23.657 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stdout:    mirroring global id: 56d4cc3b-da44-410d-a276-95b3806a190c
2025-06-09T17:11:23.657 INFO:tasks.workunit.cluster1.client.mirror.smithi195.stdout:    mirroring primary: true

This happens both for snapshot-based and journal-based mirroring. Example jobs:

https://pulpito.ceph.com/dis-2025-06-09_10:16:26-rbd-wip-dis-testing-distro-default-smithi/8316587 (snapshot-based)
https://pulpito.ceph.com/dis-2025-06-09_10:16:26-rbd-wip-dis-testing-distro-default-smithi/8316666 (journal-based)

@VinayBhaskar-V This should be easy to reproduce, could you please investigate?

@VinayBhaskar-V
Contributor Author

We appear to have an issue with OMAP cleanup on rbd_mirror_leader object. An image map entry remains behind on cluster1:
...
@VinayBhaskar-V This should be easy to reproduce, could you please investigate?

I checked it, and I think this is happening because the current test case does not wait for the promotion to complete before removing the image on cluster1. As a result, the image is deleted only on cluster1, and the image on cluster2 remains in an up+error state.
I will update the test case and push the changes once rbd_mirror.sh runs successfully locally.

@VinayBhaskar-V VinayBhaskar-V force-pushed the wip-resync-changes branch 2 times, most recently from a1beec1 to c55abef Compare June 10, 2025 10:49
@idryomov
Contributor

The rerun with the amended test looks better, but I still got two related failures (and some seemingly unrelated, will have to look at them separately):

https://pulpito.ceph.com/dis-2025-06-11_18:54:43-rbd-wip-dis-testing-distro-default-smithi/8322945
https://pulpito.ceph.com/dis-2025-06-11_18:54:43-rbd-wip-dis-testing-distro-default-smithi/8322958

Here are excerpts from the first job:

2025-06-11T20:22:52.741 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:+ wait_for_status_in_pool_dir cluster2 mirror test_resync_image up+unknown 'remote image is not primary'
...
2025-06-11T20:24:20.514 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:+ for s in 1 2 4 8 8 8 8 8 8 8 8 16 16
2025-06-11T20:24:20.514 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:+ sleep 16
2025-06-11T20:24:36.515 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:+ test_status_in_pool_dir cluster2 mirror test_resync_image up+unknown 'remote image is not primary' ''
2025-06-11T20:24:36.515 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:+ local cluster=cluster2
2025-06-11T20:24:36.516 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:+ local pool=mirror
2025-06-11T20:24:36.516 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:+ local image=test_resync_image
2025-06-11T20:24:36.516 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:+ local state_pattern=up+unknown
2025-06-11T20:24:36.516 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:+ local 'description_pattern=remote image is not primary'
2025-06-11T20:24:36.516 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:+ local service_pattern=
2025-06-11T20:24:36.516 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:+ local status
2025-06-11T20:24:36.516 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:++ CEPH_ARGS=
2025-06-11T20:24:36.516 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:++ rbd --cluster cluster2 mirror image status mirror/test_resync_image
2025-06-11T20:24:36.580 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:+ status='test_resync_image:
2025-06-11T20:24:36.580 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:  global_id:   71aae58d-b2c4-412d-b54b-fbfdafcc1094
2025-06-11T20:24:36.580 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:  state:       up+error
2025-06-11T20:24:36.581 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:  description: error bootstrapping replay
2025-06-11T20:24:36.581 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:  service:     mirror.3 on smithi196
2025-06-11T20:24:36.581 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:  last_update: 2025-06-11 20:24:13
2025-06-11T20:24:36.581 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:  peer_sites:
2025-06-11T20:24:36.581 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:    name: cluster1
2025-06-11T20:24:36.581 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:    state: up+unknown
2025-06-11T20:24:36.581 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:    description: remote image is not primary
2025-06-11T20:24:36.581 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:    last_update: 2025-06-11 20:24:34'
2025-06-11T20:24:36.581 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:+ grep '^  state: .*up+unknown'
2025-06-11T20:24:36.583 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:+ return 1
2025-06-11T20:24:36.583 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:+ return 1
2025-06-11T20:24:36.583 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:+ cleanup 1

From the respective rbd-mirror daemon on cluster2:

2025-06-11T20:24:43.176+0000 7fe12d0e3640 10 rbd::mirror::ImageReplayer: 0x55a82177fb80 [3/71aae58d-b2c4-412d-b54b-fbfdafcc1094] set_state_description: r=0, desc=bootstrapping, PREPARE_REMOTE_IMAGE
2025-06-11T20:24:43.176+0000 7fe12d0e3640 15 rbd::mirror::ImageReplayer: 0x55a82177fb80 [3/71aae58d-b2c4-412d-b54b-fbfdafcc1094] update_mirror_image_status: force=0, state=--
2025-06-11T20:24:43.176+0000 7fe12d0e3640 10 rbd::mirror::image_replayer::PrepareRemoteImageRequest: 0x55a821efc700 get_remote_image_id:
2025-06-11T20:24:43.176+0000 7fe12d0e3640 15 rbd::mirror::ImageReplayer: 0x55a82177fb80 [3/71aae58d-b2c4-412d-b54b-fbfdafcc1094] set_mirror_image_status_update: force=0, state=--
2025-06-11T20:24:43.176+0000 7fe12d0e3640 15 rbd::mirror::ImageReplayer: 0x55a82177fb80 [3/71aae58d-b2c4-412d-b54b-fbfdafcc1094] reregister_admin_socket_hook: old_image_spec=mirror/71aae58d-b2c4-412d-b54b-fbfdafcc1094, new_image_spec=mirror/test_resync_image
2025-06-11T20:24:43.176+0000 7fe12d0e3640 15 rbd::mirror::ImageReplayer: 0x55a82177fb80 [3/71aae58d-b2c4-412d-b54b-fbfdafcc1094] unregister_admin_socket_hook:
2025-06-11T20:24:43.176+0000 7fe12d0e3640 15 rbd::mirror::ImageReplayer: 0x55a82177fb80 [3/71aae58d-b2c4-412d-b54b-fbfdafcc1094] register_admin_socket_hook: registered asok hook: mirror/test_resync_image
2025-06-11T20:24:43.176+0000 7fe12d0e3640 15 rbd::mirror::ImageReplayer: 0x55a82177fb80 [3/71aae58d-b2c4-412d-b54b-fbfdafcc1094] set_mirror_image_status_update: status={state=up+error, description=starting replay, last_update=0.000000]}
2025-06-11T20:24:43.176+0000 7fe12d0e3640 15 rbd::mirror::MirrorStatusUpdater 0x55a82171e000 set_mirror_image_status: global_image_id=71aae58d-b2c4-412d-b54b-fbfdafcc1094, mirror_image_site_status={state=up+error, description=starting replay, last_update=0.000000]}
2025-06-11T20:24:43.176+0000 7fe12d0e3640 15 rbd::mirror::MirrorStatusUpdater 0x55a82171e5a0 set_mirror_image_status: global_image_id=71aae58d-b2c4-412d-b54b-fbfdafcc1094, mirror_image_site_status={state=up+error, description=starting replay, last_update=0.000000]}
2025-06-11T20:24:43.176+0000 7fe1208ca640 10 rbd::mirror::image_replayer::GetMirrorImageIdRequest: 0x55a821f02120 handle_get_image_id: global image 71aae58d-b2c4-412d-b54b-fbfdafcc1094 not registered
2025-06-11T20:24:43.176+0000 7fe1208ca640 10 rbd::mirror::image_replayer::PrepareRemoteImageRequest: 0x55a821efc700 handle_get_remote_image_id: r=-2, remote_image_id=
2025-06-11T20:24:43.176+0000 7fe1208ca640 10 rbd::mirror::image_replayer::PrepareRemoteImageRequest: 0x55a821efc700 finish: r=-2
2025-06-11T20:24:43.176+0000 7fe1208ca640 10 rbd::mirror::image_replayer::BootstrapRequest: 0x55a821dc5380 handle_prepare_remote_image: r=-2
2025-06-11T20:24:43.176+0000 7fe1208ca640 10 rbd::mirror::image_replayer::BootstrapRequest: 0x55a821dc5380 handle_prepare_remote_image: remote image does not exist: local_image_id=1ab4b6c81e08, remote_image_id=, is_linked=0
2025-06-11T20:24:43.176+0000 7fe1208ca640 10 rbd::mirror::ImageReplayer: 0x55a82177fb80 [3/71aae58d-b2c4-412d-b54b-fbfdafcc1094] handle_bootstrap: r=-2
2025-06-11T20:24:43.176+0000 7fe1208ca640 10 rbd::mirror::ImageReplayer: 0x55a82177fb80 [3/71aae58d-b2c4-412d-b54b-fbfdafcc1094] on_start_fail: r=-2, desc=error bootstrapping replay

From the respective rbd-mirror daemon on cluster1:

2025-06-11T20:22:52.719+0000 7f537621e640 20 librbd::Journal: 0x556477b9d080 check_resync_requested
2025-06-11T20:22:52.719+0000 7f537621e640 10 rbd::mirror::image_replayer::journal::Replayer: 0x556477ed8380 handle_resync_image:
2025-06-11T20:22:52.719+0000 7f537621e640 10 rbd::mirror::image_replayer::journal::Replayer: 0x556477ed8380 handle_replay_complete: r=0, error=resync requested
2025-06-11T20:22:52.719+0000 7f537621e640 10 rbd::mirror::image_replayer::journal::Replayer: 0x556477ed8380 notify_status_updated:
2025-06-11T20:22:52.719+0000 7f537c22a640 10 rbd::mirror::ImageReplayer: 0x556477759b80 [3/71aae58d-b2c4-412d-b54b-fbfdafcc1094] handle_replayer_notification:
2025-06-11T20:22:52.719+0000 7f537c22a640 10 rbd::mirror::ImageReplayer: 0x556477759b80 [3/71aae58d-b2c4-412d-b54b-fbfdafcc1094] handle_replayer_notification: resync requested
2025-06-11T20:22:52.719+0000 7f537c22a640 10 rbd::mirror::ImageReplayer: 0x556477759b80 [3/71aae58d-b2c4-412d-b54b-fbfdafcc1094] on_stop_replay:
2025-06-11T20:22:52.719+0000 7f537c22a640 10 rbd::mirror::ImageReplayer: 0x556477759b80 [3/71aae58d-b2c4-412d-b54b-fbfdafcc1094] cancel_update_mirror_image_replay_status:
2025-06-11T20:22:52.719+0000 7f537c22a640 10 rbd::mirror::ImageReplayer: 0x556477759b80 [3/71aae58d-b2c4-412d-b54b-fbfdafcc1094] set_state_description: r=0, desc=resync requested
2025-06-11T20:22:52.719+0000 7f537c22a640 15 rbd::mirror::ImageReplayer: 0x556477759b80 [3/71aae58d-b2c4-412d-b54b-fbfdafcc1094] update_mirror_image_status: force=1, state=--
2025-06-11T20:22:52.719+0000 7f537c22a640 10 rbd::mirror::ImageReplayer: 0x556477759b80 [3/71aae58d-b2c4-412d-b54b-fbfdafcc1094] shut_down: r=0
...
2025-06-11T20:22:52.731+0000 7f535c9eb640  5 rbd::mirror::ImageReplayer: 0x556477759b80 [3/71aae58d-b2c4-412d-b54b-fbfdafcc1094] handle_shut_down: moving image to trash
2025-06-11T20:22:52.731+0000 7f535c9eb640 10 rbd::mirror::ImageDeleter:  trash_move: global_image_id=71aae58d-b2c4-412d-b54b-fbfdafcc1094, resync=1

So the image was deleted by rbd-mirror daemon on cluster1 at 2025-06-11T20:22:52.731+0000. But according to the test logic, all rbd-mirror daemons on cluster1 should have been stopped at that time:

2025-06-11T20:22:44.739 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:2025-06-11 20:22:44 TEST: request image resync when remote is not primary
...
2025-06-11T20:22:50.069 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:+ wait_for_status_in_pool_dir cluster1 mirror test_resync_image up+replaying primary_position
...
2025-06-11T20:22:51.251 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:+ wait_for_status_in_pool_dir cluster2 mirror test_resync_image up+stopped
...
2025-06-11T20:22:52.393 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:+ stop_mirrors cluster1
2025-06-11T20:22:52.393 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:+ local cluster=cluster1
2025-06-11T20:22:52.393 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:+ local sig=
2025-06-11T20:22:52.394 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:++ seq 0 3
2025-06-11T20:22:52.395 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:+ for instance in `seq 0 ${LAST_MIRROR_INSTANCE}`
2025-06-11T20:22:52.395 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:+ stop_mirror cluster1:0 ''
2025-06-11T20:22:52.395 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:+ local cluster=cluster1:0
2025-06-11T20:22:52.395 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:+ local sig=
2025-06-11T20:22:52.396 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:+ test -n 1
2025-06-11T20:22:52.396 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:+ return
2025-06-11T20:22:52.396 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:+ for instance in `seq 0 ${LAST_MIRROR_INSTANCE}`
2025-06-11T20:22:52.396 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:+ stop_mirror cluster1:1 ''
2025-06-11T20:22:52.396 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:+ local cluster=cluster1:1
2025-06-11T20:22:52.396 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:+ local sig=
2025-06-11T20:22:52.396 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:+ test -n 1
2025-06-11T20:22:52.396 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:+ return
2025-06-11T20:22:52.396 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:+ for instance in `seq 0 ${LAST_MIRROR_INSTANCE}`
2025-06-11T20:22:52.396 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:+ stop_mirror cluster1:2 ''
2025-06-11T20:22:52.396 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:+ local cluster=cluster1:2
2025-06-11T20:22:52.397 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:+ local sig=
2025-06-11T20:22:52.397 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:+ test -n 1
2025-06-11T20:22:52.397 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:+ return
2025-06-11T20:22:52.397 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:+ for instance in `seq 0 ${LAST_MIRROR_INSTANCE}`
2025-06-11T20:22:52.397 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:+ stop_mirror cluster1:3 ''
2025-06-11T20:22:52.397 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:+ local cluster=cluster1:3
2025-06-11T20:22:52.397 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:+ local sig=
2025-06-11T20:22:52.397 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:+ test -n 1
2025-06-11T20:22:52.397 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:+ return
...
2025-06-11T20:22:52.397 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:+ write_image cluster2 mirror test_resync_image 100
...
2025-06-11T20:22:52.515 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:+ demote_image cluster2 mirror test_resync_image
...
2025-06-11T20:22:52.589 INFO:tasks.workunit.cluster1.client.mirror.smithi196.stderr:+ request_resync_image cluster1 mirror test_resync_image test_resync_image_id

@VinayBhaskar-V I see two issues here:

  1. stop_mirrors() basically ended up being a no-op (nothing got stopped) -- it's most likely a problem with the test or the helper
  2. we don't want the image to be deleted if rbd-mirror daemon continues running -- in fact the assumption was that the scenario where rbd-mirror daemon continues running is already handled and it's only the scenario with a stopped rbd-mirror daemon that needs fixing

(1) should be fairly easy to track down -- it's likely a problem with how the test is set up or the helper. (2) may require a change in the approach... I'd probably start with putting together a similar test case where rbd-mirror daemon doesn't get stopped to gauge the extent of the problem.

@idryomov
Contributor

idryomov commented Jun 11, 2025

(1) should be fairly easy to track down -- it's likely a problem with how the test is set up or the helper

Yup, the test just needs to be conditioned on RBD_MIRROR_USE_RBD_MIRROR being 0 and there happens to be an existing branch for that immediately above it ;)

@VinayBhaskar-V
Contributor Author

VinayBhaskar-V commented Jun 12, 2025

(2) we don't want the image to be deleted if rbd-mirror daemon continues running -- in fact the assumption was that the scenario where rbd-mirror daemon continues running is already handled and it's only the scenario with a stopped rbd-mirror daemon that needs fixing

While the rbd-mirror daemon continues running:
If the resync on the secondary is triggered before the demotion on the primary completes, the image on the secondary gets deleted and the primary image is left in up+error with "error bootstrapping replay".

root:build# bin/rbd --cluster site-a mirror image status data/img2
img2:
  global_id:   bcb2b734-ffb0-4584-8d7b-573c0f56c822
  state:       up+error
  description: error bootstrapping replay
  service:     admin on li-43492fcc-29ea-11b2-a85c-cfc77a88c201.ibm.com
  last_update: 2025-06-12 11:24:23
  peer_sites:
    name: site-b
    state: up+unknown
    description: remote image is not primary
    last_update: 2025-06-12 11:24:23

root:build# 
root:build# bin/rbd --cluster site-b mirror image status data/img2
rbd: error opening image img2: (2) No such file or directory

If the resync is triggered after the demotion on the primary completes, everything works as expected.

@VinayBhaskar-V
Contributor Author

@idryomov I have made changes in the snapshot replayer that resolve the issue of the image on the secondary being deleted when a resync is requested while the rbd-mirror daemon is running. I have also run rbd_mirror.sh in snapshot mode; all tests passed.
PTAL at the new commit.

@idryomov
Contributor

PTAL at the new commit

I see a problem there. Consider a scenario where the remote image is primary and the local image is non-primary but also somehow corrupted such that rbd-mirror daemon can't make sense of it -- one example would be invalid local mirror snapshot state error. The user notices the repeated error, scratches their head for a while and then decides to resync. The remote image is primary which means that resyncing should be allowed but it would never happen because the replayer would never get to the point where the resync marker is checked now. The replayer would shut itself down on that error, get restarted some time later, shut itself down again and so on.

I'd try dealing with the resync marker after the remote image is refreshed (i.e. after refresh_remote_image()). To determine whether the remote image is primary, check whether the most recent mirror snapshot is primary or primary demoted -- after the refresh remote_image_ctx->snap_info should be up-to-date. To avoid as many potential errors around the local image as possible, we could even reorder the refreshes so that the remote image is refreshed before the local image. What do you think?

@VinayBhaskar-V
Contributor Author

The replayer would shut itself down on that error, get restarted some time later, shut itself down again and so on.

Got it. Thanks for the clarification.

I'd try dealing with the resync marker after the remote image is refreshed (i.e. after refresh_remote_image()). To determine whether the remote image is primary, check whether the most recent mirror snapshot is primary or primary demoted -- after the refresh remote_image_ctx->snap_info should be up-to-date.

I think we can do it like this, but I have a small doubt: is there a guarantee that the most recent mirror snapshot always contains the remote_mirror_peer_uuid? Not in the current two-cluster case, but if there are multiple peers, would that still hold?
Because we currently skip the remote snapshots that don't have the m_remote_mirror_peer_uuid in scan_remote_mirror_snapshots:

    } else if (mirror_ns->mirror_peer_uuids.count(m_remote_mirror_peer_uuid) ==
                 0) {
      dout(15) << "skipping remote snapshot due to missing mirror peer"
               << dendl;
      continue;
    }

So I am thinking of adding something like this in handle_refresh_remote_image after reordering the refreshes:

auto remote_image_ctx = m_state_builder->remote_image_ctx;
for (auto snap_info_it = remote_image_ctx->snap_info.rbegin();
     snap_info_it != remote_image_ctx->snap_info.rend(); ++snap_info_it) {
  const auto& snap_ns = snap_info_it->second.snap_namespace;
  auto mirror_ns = std::get_if<cls::rbd::MirrorSnapshotNamespace>(&snap_ns);
  if (mirror_ns == nullptr) {
    continue;
  }
  if (mirror_ns->mirror_peer_uuids.count(m_remote_mirror_peer_uuid) != 0) {
    if (m_state_builder->local_image_meta->resync_requested &&
        mirror_ns->state == cls::rbd::MIRROR_SNAPSHOT_STATE_PRIMARY) {
      m_resync_requested = true;
      dout(10) << "local image resync requested" << dendl;
      handle_replay_complete(0, "resync requested");
      return;
    }
    break;
  }
}

Please correct me if I am wrong.

@idryomov
Contributor

i have a small doubt is there a guarantee that the most recent mirror snapshot always contains the remote_mirror_peer_uuid?

I don't think it should be considered. The disposition (promotion state) of the image is global and is determined by the most recent mirror snapshot -- the image can't be primary for one peer and non-primary for another peer in case of multiple peers. When the user runs rbd mirror image demote the image would get demoted for all peers.

Note that GetInfoRequest::calc_promotion_state() doesn't take mirror peers into account and yet its result is used for e.g. StateBuilder::is_remote_primary() which is what the previous version of the PR was built upon.

@VinayBhaskar-V VinayBhaskar-V force-pushed the wip-resync-changes branch 2 times, most recently from 78bf3d3 to d7d384f Compare June 24, 2025 15:18
Contributor

@ajarr ajarr left a comment


I have a question about a test, and have requested a change in the state diagram.

Otherwise, looks very good.

Contributor

@ajarr ajarr left a comment


Code looks good.

@ajarr
Contributor

ajarr commented Jul 16, 2025

The gate job failures look unrelated

@VinayBhaskar-V
Contributor Author

jenkins test api

@VinayBhaskar-V
Contributor Author

jenkins test make check

@idryomov
Contributor

A resync on a mirrored image may incorrectly result in the local
image being deleted even when the remote image is no longer primary.
This issue can occur under the following conditions:
* if a resync is requested on the secondary before the remote image has
  been fully demoted
* if the demotion of the primary image is not mirrored
  due to the rbd-mirror daemon being offline

This can be fixed by ensuring that image deletion during a resync is
only allowed when the remote image is confirmed to be primary.

This commit fixes the issue only for snapshot-based mirroring mode.

Fixes: https://tracker.ceph.com/issues/70948
Signed-off-by: VinayBhaskar-V <vvarada@redhat.com>
@idryomov
Contributor

jenkins test make check

@idryomov
Contributor

jenkins test make check arm64

