luminous: os/bluestore: Added rescue procedure for bluefs log replay by aclamk · Pull Request #35776 · ceph/ceph

aclamk · 2020-06-25T13:22:12Z

This is a recovery procedure for cases where the bluefs log has grown so large that it can no longer be read.
This fixes https://tracker.ceph.com/issues/46195.

Recommended procedure:

First, check whether the recovery procedure works in your case:
ceph-bluestore-tool -l /proc/self/fd/1 --log-level 5 --path dev/osd1 fsck --debug_bluefs=5/5 --bluefs_replay_recovery=true --bluefs_replay_recovery_disable_compact=true
Fsck should complete with minimal errors.
Then apply the recovery:
ceph-bluestore-tool -l /proc/self/fd/1 --log-level 5 --path dev/osd1 fsck --debug_bluefs=5/5 --bluefs_replay_recovery=true
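
The two invocations above differ only in the --bluefs_replay_recovery_disable_compact=true option. A minimal sketch of a helper that assembles both command lines from one template is shown below. The function name bluefs_recovery_cmd, the OSD_PATH variable, and the "check"/"apply" mode names are hypothetical and only for illustration; the flags are copied from the procedure above. The sketch only prints the commands and does not run ceph-bluestore-tool.

```shell
#!/bin/sh
# Illustrative helper, not part of the PR.
# OSD_PATH is a hypothetical variable; dev/osd1 is the example path from the description.
OSD_PATH="${OSD_PATH:-dev/osd1}"

# Print the fsck command line for the given mode:
#   check -> recovery with compaction disabled (the trial run)
#   apply -> recovery without that option (the second step)
bluefs_recovery_cmd() {
    mode="$1"
    cmd="ceph-bluestore-tool -l /proc/self/fd/1 --log-level 5 --path $OSD_PATH fsck --debug_bluefs=5/5 --bluefs_replay_recovery=true"
    if [ "$mode" = "check" ]; then
        cmd="$cmd --bluefs_replay_recovery_disable_compact=true"
    fi
    printf '%s\n' "$cmd"
}

# Print both steps for review before running anything by hand.
bluefs_recovery_cmd check
bluefs_recovery_cmd apply
```

Printing rather than executing keeps the decision with the operator: the second command should be run only after the first fsck has completed with minimal errors.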

ifed01 · 2020-06-25T13:47:52Z

@aclamk - we do want the same in master, don't we?

aclamk · 2020-06-25T14:45:05Z

@ifed01 EDIT: I think it should go to all versions, but luminous is urgent.
Master may be exempt, since we merged #35473.

ifed01 · 2020-06-25T15:12:26Z

@aclamk - this might look like grumbling, but IMO that's a bad procedure that brings a mess into the commit history... Maybe try pushing the patch through master first?

aclamk · 2020-06-25T19:34:41Z

@ifed01 The problem is that master does not need it. I would be unable to recreate on master the state that is being fixed here.

neha-ojha · 2020-06-26T00:03:27Z

> @ifed01 The problem is that master does not need it. I would be unable to recreate on master the state that is being fixed here.

You could add a note to your commit message explaining why this commit is not cherry-picked from master.

yuriw · 2020-07-08T20:21:03Z

wip-yuri4-testing-2020-07-08-2015-luminous

yuriw · 2020-07-13T17:06:34Z

@neha-ojha this one was part of #35776
pls merge
https://pulpito.ceph.com/nojha-2020-07-09_23:13:44-rados-wip-yuri4-testing-2020-07-08-2015-luminous-distro-basic-smithi/

This procedure tries to find unreachable extents on disk and pretends they were already part of the bluefs log. If this yields a proper CRC, the extents are accepted.
Fixes: https://tracker.ceph.com/issues/46195
Signed-off-by: Adam Kupczyk <akupczyk@redhat.com>

This ability only makes sense as a step that allows fsck to be performed before committing the recovered bluefs log.
Signed-off-by: Adam Kupczyk <akupczyk@redhat.com>

Adds a paragraph to the ceph-bluestore-tool documentation describing how to use the *special* options --bluefs_replay_recovery and --bluefs_replay_recovery_disable_compact to recover a large bluefs log.
Signed-off-by: Adam Kupczyk <akupczyk@redhat.com>

aclamk force-pushed the wip-bluefs-log-replay-rescue-lum branch from f722ef8 to c6674ed Compare June 25, 2020 13:24

aclamk requested review from ifed01, jdurgin and neha-ojha June 25, 2020 13:42

neha-ojha added the bluestore label Jun 26, 2020

ifed01 changed the title ~~os/bluestore: Added rescue procedure for bluefs log replay~~ luminous:os/bluestore: Added rescue procedure for bluefs log replay Jun 26, 2020

tchaikov added this to the luminous milestone Jun 27, 2020

aclamk force-pushed the wip-bluefs-log-replay-rescue-lum branch from 67cae05 to 564c246 Compare June 28, 2020 09:18

neha-ojha added needs-qa wip-neha-testing labels Jun 30, 2020

neha-ojha removed the wip-neha-testing label Jul 8, 2020

yuriw added wip-yuri5-testing wip-yuri4-testing and removed wip-yuri5-testing labels Jul 8, 2020

yuriw added luminous-batch-1 TESTED and removed wip-yuri4-testing labels Jul 13, 2020

aclamk added 3 commits July 15, 2020 21:58

os/bluestore: Add ability to disable compaction

dfb3d6d

This ability only makes sense as a step that allows fsck to be performed before committing the recovered bluefs log.
Signed-off-by: Adam Kupczyk <akupczyk@redhat.com>

aclamk force-pushed the wip-bluefs-log-replay-rescue-lum branch from a7b7f3e to cc4c069 Compare July 15, 2020 20:07

neha-ojha approved these changes Jul 15, 2020

neha-ojha removed the needs-qa label Jul 15, 2020

neha-ojha merged commit 5a69e6b into ceph:luminous Jul 15, 2020

smithfarm changed the title ~~luminous:os/bluestore: Added rescue procedure for bluefs log replay~~ luminous: os/bluestore: Added rescue procedure for bluefs log replay Aug 13, 2020

aclamk mentioned this pull request Aug 3, 2021

os/bluestore: fix bluefs log run out of space #41888

Closed

3 tasks

neha-ojha merged 3 commits into ceph:luminous from aclamk:wip-bluefs-log-replay-rescue-lum