Skip to content

quincy: OSD::Modify OSD Fast-Shutdown to work safely i.e. quiesce all activit…#45342

Merged
yuriw merged 3 commits intoceph:quincyfrom
benhanokh:wip-54523-quincy
Mar 16, 2022
Merged

quincy: OSD::Modify OSD Fast-Shutdown to work safely i.e. quiesce all activit…#45342
yuriw merged 3 commits intoceph:quincyfrom
benhanokh:wip-54523-quincy

Conversation

@benhanokh
Copy link
Contributor

backport tracker: https://tracker.ceph.com/issues/54523


backport of #44913
parent tracker: https://tracker.ceph.com/issues/53266

this backport was staged using ceph-backport.sh version 16.0.0.6848
find the latest version at https://github.com/ceph/ceph/blob/master/src/script/ceph-backport.sh

benhanokh and others added 3 commits March 10, 2022 20:59
quiesce all activities and destage allocations to disk before killing the OSD

    1) keep the old (unsafe) fast-shutdown when we are not using NCB (non null-manager())
    2) skip service.prepare_to_stop() which can take as much as 10 seconds
    3) skip debug options in fast-shutdown
    4) set_state(STATE_STOPPING) which will stop accepting new tasks to this OSD
    5) clear op_shardedwq queues, this is safe since we didn't started processing them
    6) stop timer
    7) drain osd_op_tp (no new items will be added)
    8) now we can safely call umount which will close_db/bluefs and will destage allocation to disk
    9) skip _shutdown_cache() when we are in the middle of a fast-shutdown
    10) increase debug level on fast-shutdown
    11) add option for bluestore_qfsck_on_mount to force scan on mount for all tests
    12) disable fsck-on-umount when running fast-shutdown
    13) add an option to increase debug level at fast-shutdown umount()
    14) set a time limit to fast-shutdown

    15) Bug-Fix BlueStore::pool_statfs don't access db after it was removed
    16) Fix error message for qfsck (error was caused by PR ceph#44563)

    17) make shutdown-timeout configurable

Fixes: https://tracker.ceph.com/issues/53266
Signed-off-by: Gabriel Benhanokh <gbenhano@redhat.com>
(cherry picked from commit 9b2a64a)
Fixes problem with sync compaction (_rewrite_log_and_layout_sync).
There was a problem with not updating log_seq after compacting log.

It cause to stop _replay log right after first transaction.

... 20 bluefs _replay 0x0:  op_dir_create sharding
... 20 bluefs _replay 0x0:  op_dir_link  sharding/def to 21
... 20 bluefs _replay 0x0:  op_jump_seq 1025
... 10 bluefs _read h 0x555557c46400 0x1000~1000 from file(ino 1 size 0x1000 mtime 0.000000 allocated 410000 alloc_commit 410000 extents [1:0x1540000~410000])
... 20 bluefs _read left 0xff000 len 0x1000
... 20 bluefs _read got 4096
... 10 bluefs _replay 0x1000: stop: seq 1025 != expected 1026

This is a product of bluefs fine grain locks refactor.

Signed-off-by: Adam Kupczyk <akupczyk@redhat.com>
(cherry picked from commit 2f8e370)

Conflicts:
	src/test/objectstore/test_bluefs.cc
(cherry picked from commit 4fd98ce)
Close window for possibility to capture allocator state and bluefs state
that are not in sync.

Signed-off-by: Gabriel Benhanokh <gbenhano@redhat.com>
(cherry picked from commit 8d05255)
@idryomov
Copy link
Contributor

jenkins test windows

@yuriw
Copy link
Contributor

yuriw commented Mar 16, 2022

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants