os/BlueStore: NCB fix for leaked space when bdev_async_discard is ena…#56744
Merged
ronen-fr
reviewed
Apr 7, 2024
ronen-fr
reviewed
Apr 7, 2024
ronen-fr
reviewed
Apr 7, 2024
ifed01
reviewed
Apr 8, 2024
ronen-fr
reviewed
Apr 8, 2024
baergj
reviewed
Apr 8, 2024
ronen-fr
reviewed
Apr 8, 2024
…bled
The fix calls bdev->discard_drain() before calling store_allocator() to make sure all freed space is reflected in the allocator before destaging it.
The fix sets a timeout for the drain call (500 msec); if the timeout expires, the allocator is not stored (forcing a recovery on the next startup).
Fixes: https://tracker.ceph.com/issues/65298
Signed-off-by: Gabriel BenHanokh <gbenhano@redhat.com>
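The timed drain described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `ToyDiscardQueue`, `drain_for()` and `should_store_allocator()` are hypothetical names, and only the 500 msec timeout and the "do not store the allocator on timeout" rule come from the commit message.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <mutex>

// Hypothetical stand-in for a block device with an async discard queue.
class ToyDiscardQueue {
public:
  // Record n newly queued discards.
  void queue(std::size_t n) {
    std::lock_guard<std::mutex> l(lock_);
    pending_ += n;
  }

  // Called by a worker once it has finished n discards.
  void complete(std::size_t n) {
    {
      std::lock_guard<std::mutex> l(lock_);
      assert(n <= pending_);
      pending_ -= n;
    }
    cond_.notify_all();
  }

  // Wait until nothing is pending, or give up after `timeout`.
  // Returns true if the queue drained in time.
  bool drain_for(std::chrono::milliseconds timeout) {
    std::unique_lock<std::mutex> l(lock_);
    return cond_.wait_for(l, timeout, [this] { return pending_ == 0; });
  }

private:
  std::mutex lock_;
  std::condition_variable cond_;
  std::size_t pending_ = 0;
};

// Shutdown decision: persist the allocator only if the drain succeeded.
// Returning false models "skip storing the allocator, recover on next startup".
bool should_store_allocator(
    ToyDiscardQueue& q,
    std::chrono::milliseconds timeout = std::chrono::milliseconds(500)) {
  return q.drain_for(timeout);
}
```

A caller would store the allocator only when `should_store_allocator()` returns true. The later commits in this PR move away from relying on a long drain during fast shutdown.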
…toring the allocator. On fast shutdown we simply copy the discard queue entries to the allocator.
Signed-off-by: Gabriel BenHanokh <gbenhano@redhat.com>
On fast shutdown, take over the main discard queue by copying it to the allocator, and wait only for the threads to commit their small private discard queues.
Signed-off-by: Gabriel BenHanokh <gbenhano@redhat.com>
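The split between a main queue and small private per-thread batches can be illustrated with a toy model. Everything here is an assumption made for illustration (`ToySharedDiscardQueue`, `take_batch()`, `take_over()`, and the single-threaded use without locking); only the idea of bounded private batches and of taking over the remaining main queue comes from the PR text, and the limit of 10 is the value quoted in the PR description.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

struct Extent {
  std::uint64_t offset;
  std::uint64_t length;
};

// Toy main discard queue. A real implementation would guard it with a lock.
class ToySharedDiscardQueue {
public:
  // Upper bound on how many entries a worker may move to its private batch.
  static constexpr std::size_t kMaxPrivateBatch = 10;

  void push(Extent e) { shared_.push_back(e); }

  // Worker side: move at most kMaxPrivateBatch entries into a private batch.
  // Only these entries must be waited for at shutdown.
  std::vector<Extent> take_batch() {
    std::vector<Extent> batch;
    while (!shared_.empty() && batch.size() < kMaxPrivateBatch) {
      batch.push_back(shared_.front());
      shared_.pop_front();
    }
    assert(batch.size() <= kMaxPrivateBatch);
    return batch;
  }

  // Shutdown side: take everything still in the main queue in one step,
  // so it can be handed to the allocator without issuing the discards.
  std::deque<Extent> take_over() {
    std::deque<Extent> out;
    out.swap(shared_);
    return out;
  }

  std::size_t size() const { return shared_.size(); }

private:
  std::deque<Extent> shared_;
};
```

With 25 queued extents, one worker batch holds 10 of them and `take_over()` returns the other 15, so the shutdown path waits on at most one small batch per worker.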
ea535c9 to
1e10a9b
Compare
baergj
approved these changes
Apr 9, 2024
ifed01
reviewed
Apr 9, 2024
virtual bool try_discard(interval_set<uint64_t> &to_release, bool async=true) { return false; }
virtual void discard_drain() { return; }

virtual void swap_discard_queued(interval_set<uint64_t>& other) { other.clear(); }
Contributor
Maybe name this function cancel_discards()?
Signed-off-by: Gabriel BenHanokh <gbenhano@redhat.com>
ifed01
approved these changes
Apr 10, 2024
Contributor
Author
jenkins test api
Contributor
Author
jenkins test make check arm64
Contributor
@benhanokh - don't bother. It won't work - and it's not a blocker
pereman2
approved these changes
Jun 10, 2024
…bled
On graceful shutdown we call bdev->discard_drain() before calling store_allocator() to make sure all freed space is reflected in the allocator before destaging it.
On fast shutdown we remove the queued discards and copy all of them into the allocator before calling store_allocator().
This is logically identical to the behavior before NCB, when freed space was reflected in the column-B entry in RocksDB, which was used to construct the allocator after shutdown even if the free operation didn't complete.
The PR also adds a drain wait for all discard entries held in the worker threads' private space to be committed.
We had to limit the private space to 10 entries to guarantee that the drain finishes in a timely manner.
Fixes: https://tracker.ceph.com/issues/65298
Signed-off-by: Gabriel BenHanokh <gbenhano@redhat.com>
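The two shutdown paths in the description can be compared with a small sketch. It is a simplified model under stated assumptions: `ToyDevice`, `ToyAllocator` and `prepare_allocator_for_store()` are hypothetical, a plain vector stands in for `interval_set<uint64_t>`, and the method bodies are guesses at the intent rather than the BlueStore implementation. Only the names `discard_drain()` and `swap_discard_queued()` appear in the PR.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct Extent {
  std::uint64_t offset;
  std::uint64_t length;
};

// Toy allocator: just remembers which extents are free.
struct ToyAllocator {
  std::vector<Extent> free_extents;

  void release(const std::vector<Extent>& v) {
    free_extents.insert(free_extents.end(), v.begin(), v.end());
  }

  std::uint64_t free_bytes() const {
    std::uint64_t total = 0;
    for (const Extent& e : free_extents) total += e.length;
    return total;
  }
};

// Toy device holding discards that were queued but not yet issued.
struct ToyDevice {
  std::vector<Extent> queued;
  std::uint64_t discards_issued = 0;

  // Graceful path (assumed behavior): issue every queued discard,
  // then report the space to the allocator.
  void discard_drain(ToyAllocator& alloc) {
    discards_issued += queued.size();
    alloc.release(queued);
    queued.clear();
  }

  // Fast path (assumed behavior): hand the queued entries to the caller
  // without issuing any discard.
  void swap_discard_queued(std::vector<Extent>& other) { other.swap(queued); }
};

// Make sure the allocator reflects all freed space before it is stored.
void prepare_allocator_for_store(ToyDevice& dev, ToyAllocator& alloc, bool fast) {
  if (fast) {
    std::vector<Extent> taken;
    dev.swap_discard_queued(taken);
    alloc.release(taken);  // space is marked free although never discarded
  } else {
    dev.discard_drain(alloc);
  }
  assert(dev.queued.empty());
}
```

Both paths leave the allocator with the same free space; the fast path simply skips the device discards, which matches the description's point that the space was already logically free before NCB as well.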