mon: update PaxosService::cached_first_committed in PaxosService::maybe_trim()#19397
mon: update PaxosService::cached_first_committed in PaxosService::maybe_trim()#19397liewegas merged 1 commit intoceph:masterfrom
Conversation
liewegas
left a comment
There was a problem hiding this comment.
I think a better fix is to change hte check_osdmap_sub call to return (and do nothing) if not active, and then call check_osdmap_subs() in on_active(). the subscribe is already recording that the client wants the map; i don't think add a new ad hoc async way of sending the map is a good way to get there.
This might be a good time to check that the other check_sub methods also bail out if not active (and check their subs when they become active).
|
@jecluis want to take a look at this? i wonder if instead of calling the check_subs() from update_from_paxos() it should always happen in on_active() instead? |
|
@liewegas Thanks for your reply:-) |
|
For those other paths the service is active. It looks like it's just the handle_subscribe case where we aren't checking if the service is active. |
5ae93e9 to
6f3f156
Compare
|
I just modified this PR as suggested, and it looks fine in my local test. Please review this PR again, thanks:-) |
1135c32 to
0a4cc8b
Compare
|
Hi, everyone. It seems that the failure of docs build is not caused by this PR: Reading package lists... |
1adf414 to
7310adc
Compare
|
Um, Hi, everyone. Please review this PR, thanks:-) |
|
|
||
| void trim_cache() { | ||
| while (lru.size() > max_size) { | ||
| while (size > max_size) { |
There was a problem hiding this comment.
You're using "size" field as a substitute of "lru.size()" only here, so it'd be safer and easier to make it local for this function, something like
size_t size = lru.size();
while (size > max_size) {
contents.erase(lru.back().first);
lru.pop_back();
size--;
}
That way you'll save 8 bytes of memory and won't slow down _add() and _clear().
There was a problem hiding this comment.
that's not true. size is updated in _add() and clear(). but I'd suggest check __GLIBCXX__ and _GLIBCXX_USE_CXX11_ABI for this workaround. so we keep tracking the size only if libstdc++ with old ABI is used. like
#if defined(__GLIBCXX__) && !_GLIBCXX_USE_CXX11_ABI
#define SLOW_LIST_SIZE
#endif
....
#ifdef SLOW_LIST_SIZE
size_t _size;
void inc_size() {
++_size;
}
void dec_size() {
--_size;
}
size_t size() const {
return _size;
}
#else
void inc_size() {}
void dec_size() {}
size_t size() const {
return contents.size();
}
#endif
...
void _add(K key, V&& value) {
...
inc_size();
}what do you think?
/// edited per @xxhdx1985126 's comment
There was a problem hiding this comment.
@tchaikov yes, the value itself is updated in _add() and _clear(), but the actual use of it is made in trim_cache() (also, I think it's missing an update after a call to lru.splice(). My point is - I think that with a such usage pattern, having it count manually will leave extra code that'll need to be removed once we're sure that list<>.size() is no longer broken.
There was a problem hiding this comment.
@tchaikov Thanks for your advise.
By the way, do mean "#if defined(GLIBCXX) && !_GLIBCXX_USE_CXX11_ABI" instead of "#if defined(GLIBCXX) && _GLIBCXX_USE_CXX11_ABI"? I thought it's when _GLIBCXX_USE_CXX11_ABI is 0 that we need a manual count. Right?
There was a problem hiding this comment.
@branch-predictor oh, thanks, it makes sense!
@xxhdx1985126 ahh, right! updated the code snippet.
There was a problem hiding this comment.
@branch-predictor Thanks for your advise:-)
I think, since trim_cache() is called for every omap header creation, it could still lead to performance regression when doing massive file/directory creation even if we call list::size() outside the "while" block . Is this right?
There was a problem hiding this comment.
@xxhdx1985126 Only if you're doing really massive file/directory creation and lru's size is large enough (say, 4096 entries or more). Still, if you do just single size() call per creation, it will be overshadowed by everything else around. Even that can be reduced by calling trim_cache() randomly (with, say, 33% probability).
|
@liewegas We are testing @branch-predictor's way:-) |
c123dd3 to
8a18ecf
Compare
|
@branch-predictor @liewegas I've move the SimpleLRU size modification to #19813 , please review it when you have time:-) Thanks:-) |
|
i think, instead of guarding also, this avoids the weird |
|
@tchaikov Hi, thanks for the advise:-) |
|
yes, i expected it's a controversial change. but after pondering over this, i think it's okay. |
|
I think updating first_committed in Paxos::trim() makes perfect sense. There shouldn't be any need for a std::atomic, though, since trim() itself runs under the mon lock...? |
agreed. i missed that. |
|
@liewegas @tchaikov Hi, sorry, I don't quite follow you. first_committed seems not relevant to our problem which is caused by non-synchronized accesses to cached_first_committed. Do you mean that we should update first_committed in Paxos::trim() besides updating cached_first_committed in PaxosService::maybe_trim()? |
|
@xxhdx1985126 sorry, should be |
|
Basically add cached_first_committed = trim_to;, right? |
|
@liewegas exactly. something like diff --git a/src/mon/PaxosService.cc b/src/mon/PaxosService.cc
index de732c3223..e2b9376bdc 100644
--- a/src/mon/PaxosService.cc
+++ b/src/mon/PaxosService.cc
@@ -399,6 +399,7 @@ void PaxosService::maybe_trim()
MonitorDBStore::TransactionRef t = paxos->get_pending_transaction();
trim(t, get_first_committed(), trim_to);
put_first_committed(t, trim_to);
+ cached_first_committed = trim_to;
// let the service add any extra stuff
encode_trim_extra(t, trim_to); |
…be_trim() This should be able to avoid interleaving execution of Paxos::commit_finish() and check_sub(), which could lead to unexpected failure of ceph-mon. Fixes: http://tracker.ceph.com/issues/11332 Signed-off-by: Xuehan Xu <xuxuehan@360.cn> Signed-off-by: yupeng chen <chenyupeng-it@360.cn>
|
@tchaikov @liewegas
Would this happen? And would it lead to anything bad? |
|
this could happen. but it does not hurt. assuming
|
|
@tchaikov |
|
i think i've replied that this "does not hurt". if you have any concerns, please just spill out. i am not sure the Socratic method is more efficient here. |
|
@tchaikov |
This should be able to avoid interleaving execution of Paxos::commit_finish() and
check_sub(), which could lead to unexpected failure of ceph-mon.
Fixes: http://tracker.ceph.com/issues/11332
Signed-off-by: Xuehan Xu xuxuehan@360.cn
Signed-off-by: yupeng chen chenyupeng-it@360.cn