os/bluestore: Fix unnecessary onode cache rm/add during split_cache#32536
markhpc wants to merge 2 commits into ceph:master from
Conversation
Signed-off-by: Mark Nelson <mnelson@redhat.com>
ifed01 left a comment:
Technically the fix is safe (with the nit I mentioned in another comment) but
I think that the following sentence isn't a proper analysis: " and also may resolve a segfault that is hit when removing the onode from the shard causes the reference count to go to zero and the onode is destroyed (and thus can't safely be re-added). "
Collection::split_cache keeps an onode reference (OnodeRef o) while moving it between cache shards, and hence the reference count can't go to zero... So I suppose the bug is still unresolved.
Quoted lines from the diff:

```cpp
dest->onode_map.onode_map[o->oid] = o;
dest->onode_map.cache = dest->onode_map.cache;
```

Comment on this line:

```cpp
if (dest->cache != cache) {
```
IMO this check is a bit confusing; the proper one should rather be:
if (onode_map.cache != dest->onode_map.cache)
...
@ifed01 Josh and I spent some time in gdb looking at the state of the ref counting. I don't remember exactly everything we determined, but I believe we convinced ourselves that this was indeed due to issues with the ref counting (specifically due to the new implementation of the cache trimming). Given that Kefu was able to hit it again, though, it appears to happen in general rather than being specific to situations where both of the resulting collections are in the same cache shard.
@ifed01 fwiw, I agree on the check, though I think it's currently working by accident (the buffer cache check should hash out the same way). I don't think this is what's causing the assert kefu saw. For some reason the cache shard is not null when _add is called, but I haven't been able to figure out why yet.
This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.

@markhpc do we still want this?

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.

This pull request has been automatically closed because there has been no activity for 90 days. Please feel free to reopen this pull request (or open a new one) if the proposed change is still appropriate. Thank you for your contribution!
This PR adds a check in split_cache so that when we are splitting a collection such that both of the resulting collections are within the same cache shard, we don't actually remove and re-add the onode to that shard. This should be a slight efficiency gain, and it may also resolve a segfault that is hit when removing the onode from the shard causes the reference count to go to zero and the onode is destroyed (and thus can't safely be re-added). Strangely, this has so far only appeared to happen when both of the resulting collections are on the same cache shard. If the issue continues, we can optionally further ensure that the ref count doesn't drop, or simply avoid adding the onode to the new cache shard at this time.
See:
https://tracker.ceph.com/issues/43131
https://tracker.ceph.com/issues/43147
Checklist
Show available Jenkins commands
jenkins retest this please
jenkins test crimson perf
jenkins test signed
jenkins test make check
jenkins test make check arm64
jenkins test submodules
jenkins test dashboard
jenkins test dashboard backend
jenkins test docs
jenkins render docs
jenkins test ceph-volume all
jenkins test ceph-volume tox