Skip to content

os/bluestore: Prune deleted onodes#38499

Merged
ifed01 merged 4 commits intoceph:masterfrom
aclamk:wip-bs-onode-prune-deleted
Dec 22, 2020
Merged

os/bluestore: Prune deleted onodes#38499
ifed01 merged 4 commits intoceph:masterfrom
aclamk:wip-bs-onode-prune-deleted

Conversation

@aclamk
Copy link
Contributor

@aclamk aclamk commented Dec 9, 2020

This PR is about cleanup of remains of onodes that are lingering in onode cache after deletion.
When deleting collections large amounts of nodes that are logically deleted exists == false eat up space, inflicting cache pressure on existing ones.
Now when all reference counters for onode except one in onode_map are relinquished, and onode has been deleted, it is removed from cache and from onode_map.

This PR supersedes #38423.

Got rid of OnodeCacheShard pin() and unpin() functions.
Moved their validator logic right into Onode put and get functions.

Signed-off-by: Adam Kupczyk <akupczyk@redhat.com>
@tchaikov
Copy link
Contributor

dropping from my batch as it conflicts with #37496

@aclamk aclamk force-pushed the wip-bs-onode-prune-deleted branch from b539044 to 5d86b67 Compare December 10, 2020 09:22
Copy link
Contributor

@ifed01 ifed01 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!
IMO can be improved (in terms of fewer line numbers) a bit as per my comments.


void _rm_pinned(BlueStore::Onode* o) override
{
ceph_assert(!o->pinned);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the function name this assertion looks a bit confusing. May be remove it totally similar to _unpin()?
And rename to _unpin_and_rm()?
Not a big deal though...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ifed01 Confusion is bad. I renamed it and deleted assert.

Added logic for erasing onode from onode_map it is last reference and exists==false.

Signed-off-by: Adam Kupczyk <akupczyk@redhat.com>
…ation

When onode.exists == false getting reference and then releasing it might delete it from container.
It must not happen during iteration.

Signed-off-by: Adam Kupczyk <akupczyk@redhat.com>
Just making sure onode does not get deleted.

Signed-off-by: Adam Kupczyk <akupczyk@redhat.com>
@aclamk aclamk force-pushed the wip-bs-onode-prune-deleted branch from 70d1e52 to 01ff70d Compare December 14, 2020 15:26
@tchaikov
Copy link
Contributor

tchaikov commented Dec 18, 2020

the LRC cluster in sepia is very slow at this moment, so i cannot read the coredump of /a/kchai-2020-12-18_01:47:10-rados-wip-kefu-testing-2020-12-17-1542-distro-basic-smithi/5716726/remote/*/coredump/1608257256.17994.core.gz at this moment. so i need to wait until the coredump is readable before moving forwards.

Copy link
Member

@jdurgin jdurgin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reviewed in the course of looking through #38651

@ifed01 ifed01 merged commit d5a2611 into ceph:master Dec 22, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants