osd/PGLog: persist num_objects_missing for replicas when peering is done #30466

Merged — xiexingguo merged 1 commit into ceph:master on Sep 27, 2019
Conversation
guoracle reported that:

> In the asynchronous recovery feature, the asynchronous recovery target OSD is selected by last_update.version. After peering completes, the asynchronous recovery target OSDs update their last_update.version and then go down again. When those OSDs come back online and re-peer, there is no pglog difference between the asynchronous recovery targets and the authoritative OSD, so no asynchronous recovery happens.

ceph#24004 aimed to solve the problem by persisting the number of missing objects to disk when peering was done, so that we could take both the new approximate missing objects (estimated from last_update) and the historical num_objects_missing into account when determining async_recovery_targets in any follow-up peering cycle.

However, the above holds only if we keep an up-to-date num_objects_missing field for each pg instance under all circumstances, which is unfortunately not true for replicas that have completed peering but never started recovery afterwards (7de3562 makes sure we update num_objects_missing for the primary when peering is done, and keeps num_objects_missing up to date as each missing object is recovered).

Note that guoracle also suggests fixing the same problem by using last_complete.version to calculate the pglog difference, and by updating the last_complete of the asynchronous recovery target OSD in the primary's copy of peer_info after recovery completes. That approach would not work well, because we might reset last_complete to 0'0 whenever we trim the pglog past the minimal need version of the missing set.

Fix by persisting num_objects_missing for replicas correctly when peering is done.

Fixes: https://tracker.ceph.com/issues/41924
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
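The selection logic the description alludes to can be sketched roughly as follows. This is an illustrative model, not Ceph's actual C++ code: the `peers` mapping and both function names are invented for the example, and the threshold constant only mirrors the real `osd_async_recovery_min_cost` option (default 100). The point it demonstrates is why a stale (effectively zero) persisted num_objects_missing on a restarted replica would keep it out of the async recovery targets.

```python
# Hypothetical, simplified model of async-recovery target selection.
# Not Ceph's real implementation; names other than
# osd_async_recovery_min_cost are invented for illustration.

OSD_ASYNC_RECOVERY_MIN_COST = 100  # mirrors Ceph's default threshold


def approximate_missing(auth_last_update, peer_last_update):
    """Approximate the missing-object count from the pglog
    version difference against the authoritative OSD."""
    return max(auth_last_update - peer_last_update, 0)


def choose_async_targets(auth_last_update, peers):
    """peers: {osd_id: (last_update_version, persisted_num_objects_missing)}.

    A peer becomes an async recovery target when the sum of its
    approximate missing (pglog difference) and its persisted
    num_objects_missing exceeds the cost threshold. If the persisted
    value is silently 0 on a replica that restarted before recovery
    ran (the bug this PR fixes), the peer is wrongly skipped.
    """
    targets = set()
    for osd, (last_update, num_missing) in peers.items():
        cost = approximate_missing(auth_last_update, last_update) + num_missing
        if cost > OSD_ASYNC_RECOVERY_MIN_COST:
            targets.add(osd)
    return targets
```

For example, a replica whose last_update matches the authoritative OSD but which still has 150 persisted missing objects would only be chosen if that persisted count survived its restart, which is exactly the replica-side persistence this PR adds.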
xiexingguo force-pushed from ba81b98 to 3b024c5
xiexingguo (Member, Author) commented:

@liewegas ping?

neha-ojha approved these changes on Sep 24, 2019