stats: Shared scopes phase 1, changes scope APIs to use ScopeSharedPtr, leaving ScopePtr alias behind#19791
Conversation
…fined type. Signed-off-by: Joshua Marantz <jmarantz@google.com>
…Ptr to make a smaller 1st review. Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
|
/retest |
|
Retrying Azure Pipelines: |
pradeepcrao
left a comment
There was a problem hiding this comment.
Hi Josh, this change looks good on its own. I will spend some time tomorrow looking at how this fits in with the rest of the changes.
| null_counter_(new NullCounterImpl(symbol_table)), | ||
| null_gauge_(new NullGaugeImpl(symbol_table)) {} | ||
| null_gauge_(new NullGaugeImpl(symbol_table)), | ||
| default_scope_(std::make_shared<ScopePrefixer>("", *this)) {} |
There was a problem hiding this comment.
Can we run into a situation wherein we have a shared_ptr to ScopePrefixer for the admin panel, but not the underlying scope?
There was a problem hiding this comment.
The lifetime concerns are not changed by this PR; the Store needs to outlive all the Scopes declared from it.
In general if you instantiate any Scope-derived object on the stack, and you will get some time of failure (e.g. a weak_ptr exception).
That is why I changed default_scope_ to be allocated with make_shared, enabling #19693 to hold onto a reference to it, and mirroring the implementation in thread_local_store.cc.
Signed-off-by: Joshua Marantz <jmarantz@google.com>
…g over the default scope an extra time. Signed-off-by: Joshua Marantz <jmarantz@google.com>
|
The main doubt I have is that can the scopes be held by the admin page indefinitely, or is there a timeout mechanism? If the scopes live through several xDS updates, there is the possibility of Envoy OOMing. |
|
This is a very insightful point! Let's dive in. I don't think in practice this is a problem assuming requests have server-enforced timeouts. At the end of any request the shared_ptrs will be released. But it would be good to see if that's actually true. Note we trust admin clients (a malicious one could just send /quitquitquit) but a broken admin client could just read the first chunk from a /stats request and then leave the connection open and start a new one. Note also that won't leak scopes quickly because it will just bump ref-counts on scopes that are held anyway in the configuration. The concern would be if xDS updates the config and drops the scope, held alive only by an in-progress admin request. In the end I think it's request timeouts that prevent this from being a problem. |
|
Also that's not an issue for this PR which is not changing the Admin behavior at all. |
Thanks, that was my only concern. LGTM |
|
Thanks Pradeep @ggreenway can you take a look? |
|
@ggreenway will you be able to take a look sometime? Or shall I re-assign? |
ggreenway
left a comment
There was a problem hiding this comment.
I think this looks fine. I'd like to get @mattklein123 's opinion on this also.
envoy/stats/scope.h
Outdated
| virtual ~Scope() = default; | ||
|
|
||
| /** @return a shared_ptr for this */ | ||
| ScopeSharedPtr makeShared() { return shared_from_this(); } |
There was a problem hiding this comment.
I think this should be named getShared() or similar; makeShared() is too similar to std::make_shared() but this one doesn't construct a new object.
Signed-off-by: Joshua Marantz <jmarantz@google.com>
…e_shared which creates a new object. Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
mattklein123
left a comment
There was a problem hiding this comment.
LGTM with small nit. I have some vague concerns about circular references and memory leaks but we will know soon enough. :)
/wait
envoy/stats/scope.h
Outdated
| // files with trivial changes. Furthermore, code that depends on the Envoy stats | ||
| // infrastructure that's not in this repo will stop compiling when we remove | ||
| // ScopePtr. We should fully remove this alias in a future PR and change all the | ||
| // references, once known parthers that might break from this change of a chance |
There was a problem hiding this comment.
| // references, once known parthers that might break from this change of a chance | |
| // references, once known consumers that might break from this change have a chance |
?
There was a problem hiding this comment.
looks good; done by hand to avoid what I fear would be a DCO violation :) Committing shortly.
| void forEachScope(SizeFn f_size, StatFn<const Scope> f_stat) const override { | ||
| if (f_size != nullptr) { | ||
| f_size(1); | ||
| f_size(scopes_.size() + 1); |
There was a problem hiding this comment.
This is unrelated to this change but because of this change we can now do the API correctly? Is that right?
There was a problem hiding this comment.
Sorry, which API are you referring to?
|
At the moment scopes don't reference other scopes so I'm not seeing where the circular reference would come from. But @pradeepcrao also pointed out that we might hold onto |
Signed-off-by: Joshua Marantz <jmarantz@google.com>
|
/retest |
|
Retrying Azure Pipelines: |
|
/retest |
|
Retrying Azure Pipelines: |
Signed-off-by: Joshua Marantz <jmarantz@google.com>
…eSharedPtr (#20888) Commit Message: Follow-up to #19791, #19790 and #20871, finishing off all remaining ScopePtr references. One more PR to go which will be to remove the ScopePtr definition, but first need to make sure this doesn't break any references outside the repo. Testing: //test/... Docs Changes: n/a Release Notes: n/a Platform Specific Features: n/a Signed-off-by: Joshua Marantz <jmarantz@google.com>
…eSharedPtr (envoyproxy#20888) Commit Message: Follow-up to envoyproxy#19791, envoyproxy#19790 and envoyproxy#20871, finishing off all remaining ScopePtr references. One more PR to go which will be to remove the ScopePtr definition, but first need to make sure this doesn't break any references outside the repo. Testing: //test/... Docs Changes: n/a Release Notes: n/a Platform Specific Features: n/a Signed-off-by: Joshua Marantz <jmarantz@google.com>
…eSharedPtr (#20888) Commit Message: Follow-up to envoyproxy/envoy#19791, #19790 and #20871, finishing off all remaining ScopePtr references. One more PR to go which will be to remove the ScopePtr definition, but first need to make sure this doesn't break any references outside the repo. Testing: //test/... Docs Changes: n/a Release Notes: n/a Platform Specific Features: n/a Signed-off-by: Joshua Marantz <jmarantz@google.com>
Commit Message: To support an algorithmic solution to a long burst of CPU on admin /stats with a large # clusters, we need to hold scopes in shared pointers. This PR makes that change and updates the usage within stats to reference scopes as
ScopeSharedPtrrather thanScopePtr, and addsenable_shared_from_thisas a Scope super-class so existing APIs that passScope&don't need to change.This leaves behind a
ScopePtralias toScopeSharedPtrtemporarily to (a) make this PR be reviewable and (b) avoid breaking other repos that may referenceScopePtr. It will be easier to submit this PR first, then a external repos can rename their references, and we can do a follow-up PR that removesScopePtrto avoid longer term confusion. That PR would like like #19790 which has 90 changed files in addition to the semantic change. Once this is merged, that PR will just be a strict rename and will be a lot easier to review.This PR blocks #19693 which implements the admin change requiring holding onto a shared scope ptr.
Additional Description:
Risk Level: low
Testing: //test/...
Docs Changes: n/a
Release Notes: n/a
Platform Specific Features: n/a