os/bluestore: Fix unnecessary onode cache rm/add during split_cache#32536

Closed
markhpc wants to merge 2 commits into ceph:master from markhpc:wip-bs-split_cache

Conversation

@markhpc
Member

@markhpc markhpc commented Jan 7, 2020

This PR adds a check in split_cache so that when a collection is split and both resulting collections land in the same cache shard, we don't remove and re-add the onode to that shard. This is a slight efficiency gain, and it may also resolve a segfault hit when removing the onode from the shard drops the reference count to zero and the onode is destroyed (and thus can't safely be re-added). Strangely, so far this has only appeared to happen when both of the resulting collections are on the same cache shard. If the issue continues, we can additionally ensure that the ref count doesn't drop, or simply avoid adding the onode to the new cache shard at this point.

See:
https://tracker.ceph.com/issues/43131
https://tracker.ceph.com/issues/43147

Checklist

  • References tracker ticket
  • Updates documentation if necessary
  • Includes tests for new functionality or reproducer for bug


Mark Nelson added 2 commits January 7, 2020 17:46
Signed-off-by: Mark Nelson <mnelson@redhat.com>
Signed-off-by: Mark Nelson <mnelson@redhat.com>
Contributor

@ifed01 ifed01 left a comment


Technically the fix is safe (modulo the nit I mentioned in another comment), but I think the following sentence isn't a proper analysis: "and also may resolve a segfault that is hit when removing the onode from the shard causes the reference count to go to zero and the onode is destroyed (and thus can't safely be re-added)."
Collection::split_cache keeps an onode reference (OnodeRef o) while moving between cache shards, hence the reference count can't go to zero. So I suppose the bug is still unresolved.

dest->onode_map.onode_map[o->oid] = o;
dest->onode_map.cache = dest->onode_map.cache;

if (dest->cache != cache) {
Contributor


IMO this check is a bit confusing, proper one should rather be:
if (onode_map.cache != dest->onode_map.cache)
...

Member Author

@markhpc markhpc Jan 9, 2020


@ifed01 Josh and I spent some time in gdb looking at the state of the ref counting. I don't remember exactly everything we determined, but I believe we convinced ourselves that this was indeed due to issues with the ref counting (specifically due to the new implementation of the cache trimming). Given that Kefu was able to hit it again, though, it appears to happen in general rather than being specific to situations where both of the resulting collections are in the same cache shard.

Member Author


@ifed01 FWIW, I agree on the check, though I think it's currently working by accident (the buffer cache check should hash out the same way). I don't think this is what's causing the assert Kefu saw: for some reason the cache shard is not null when _add is called, but I haven't been able to figure out why yet.

@tchaikov
Contributor

tchaikov commented Jan 9, 2020

 ceph version 15.0.0-9142-g4359dde (4359dde84d38890249873efbc290902c51796ba2) octopus (dev)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x156) [0x5599ed00d478]
 2: (()+0x507692) [0x5599ed00d692]
 3: (std::_Sp_counted_ptr_inplace<PriorityCache::Manager, std::allocator<PriorityCache::Manager>, (__gnu_cxx::_Lock_policy)2>::_M_dispose()+0) [0x5599ed5fd190]
 4: (BlueStore::Collection::split_cache(BlueStore::Collection*)+0x530) [0x5599ed5be060]
 5: (BlueStore::_split_collection(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Collection>&, unsigned int, int)+0x336) [0x5599ed5bf016]
 6: (BlueStore::_txc_add_transaction(BlueStore::TransContext*, ceph::os::Transaction*)+0x9d8) [0x5599ed5dc838]
 7: (BlueStore::queue_transactions(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, std::vector<ceph::os::Transaction, std::allocator<ceph::os::Transaction> >&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x3d8) [0x5599ed5df508]
 8: (ObjectStore::queue_transaction(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ceph::os::Transaction&&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x84) [0x5599ed15a5f4]
 9: (OSD::dispatch_context(PeeringCtx&, PG*, std::shared_ptr<OSDMap const>, ThreadPool::TPHandle*)+0x102) [0x5599ed0f2c52]
 10: (OSD::dequeue_peering_evt(OSDShard*, PG*, std::shared_ptr<PGPeeringEvent>, ThreadPool::TPHandle&)+0x304) [0x5599ed11f8d4]
 11: (ceph::osd::scheduler::PGPeeringItem::run(OSD*, OSDShard*, boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x56) [0x5599ed34fff6]
 12: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x1303) [0x5599ed1133d3]
 13: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x5c4) [0x5599ed73d9f4]
 14: (ShardedThreadPool::WorkThreadSharded::entry()+0x14) [0x5599ed73fcb4]
 15: (()+0x82de) [0x7f0cb82c62de]

@markhpc
Member Author

markhpc commented Jan 15, 2020

@stale

stale bot commented Mar 15, 2020

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
If you are a maintainer or core committer, please follow-up on this pull request to identify what steps should be taken by the author to move this proposed change forward.
If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.

@stale stale bot added the stale label Mar 15, 2020
@neha-ojha neha-ojha removed the stale label Apr 30, 2020
@neha-ojha
Member

@markhpc do we still want this?

@stale

stale bot commented Jul 2, 2020

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
If you are a maintainer or core committer, please follow-up on this pull request to identify what steps should be taken by the author to move this proposed change forward.
If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.

@stale stale bot added the stale label Jul 2, 2020
@stale

stale bot commented Oct 4, 2020

This pull request has been automatically closed because there has been no activity for 90 days. Please feel free to reopen this pull request (or open a new one) if the proposed change is still appropriate. Thank you for your contribution!

@stale stale bot closed this Oct 4, 2020