ratelimit: support detach request after sent limit request#40958
ratelimit: support detach request after sent limit request#40958yanavlasov merged 7 commits intoenvoyproxy:mainfrom
Conversation
70306a5 to
f1aece4
Compare
f1aece4 to
e3c02c7
Compare
|
Finally, this should be ready for review. |
Signed-off-by: WangBaiping <wbphub@gmail.com>
b43f70d to
4cb2fe6
Compare
|
|
||
| /** | ||
| * Detach the pending request. This will not cancel the request but will clean up | ||
| * all context associated with downstream request to avoid dangling references. | ||
| * NOTE: the callbacks that registered to take the response will be kept to handle | ||
| * the response when it arrives. The caller is responsible for ensuring that the | ||
| * callbacks have enough lifetime to handle the response. | ||
| */ | ||
| virtual void detach() PURE; |
There was a problem hiding this comment.
TBH this is very confusing. I don't really understand what we are clearing and why, and we are also not clearing response callbacks. What is the case that this is protecting against? Per your other PR can the lifetimes just be fixed externally in a more clear way?
There was a problem hiding this comment.
This is most to address the send-and-forget case. We send an async side request and never cancel it even the downstream main request is ended.
This will clear the downstream stream info, side stream water mark callbacks, parent span and so on to ensure after the in-flight request cannot access these because they maybe dangling pointers/references.
But the response callbacks will be kept by design then the caller could do some cleanup or operations after the response. And when send-and-forget mode is used, the caller self should ensure the response callbacks' lifetime, and it is pretty easy to provide a long living response callbacks.
Per your other PR can the lifetimes just be fixed externally in a more clear way?
It's pretty hard to do that for these. The lifetime of stream info (and the parent span and so on) are bound with the downstream main request. To longer their lifetime is super huge refactor because the stream info and parent span (and so on) will also store raw pointer or reference to other objects.
This is also why we cannot let the developer to resolve the lifetime problem by themselves. It's hard even for us.
Do a cleanup (to these potential dangling references) with a detach() would much simpler. I can try to update the comment make it more clear.
…esolve-lifetime-problem-of-limit
Signed-off-by: WangBaiping <wbphub@gmail.com>
|
/retest |
|
Friendly ping again cc @mattklein123 |
| return AsyncStreamImpl::streamInfo(); | ||
| } | ||
|
|
||
| void AsyncRequestImpl::detach() { |
There was a problem hiding this comment.
This is semantically the same as the AsyncStreamImpl::waitForRemoteCloseAndDelete() It should be possible to just call the waitForRemoteCloseAndDelete since it handles stuck requests with a timeout.
There was a problem hiding this comment.
@yanavlasov I have checked the waitForRemoteCloseAndDelete() and they are different actually for now. The waitForRemoteCloseAndDelete() skipped some of cleanup logic (but the timeout is a good design). And part of this PR also resolve some lifetime problem like metadata criteria in route.
But I agree they have similar goal and should could be merged with another PR.
WDYT?
|
OK this makes sense to me with the improved comment, but will defer to @yanavlasov who is more recent in the code. |
…esolve-lifetime-problem-of-limit
Signed-off-by: WangBaiping <wbphub@gmail.com>
|
/retest |
1 similar comment
|
/retest |
…esolve-lifetime-problem-of-limit
| absl::nullopt, getHitAddend()); | ||
| callbacks_->streamInfo(), getHitAddend()); | ||
| // If the limit() call fails directly then the detach() will be no-op. | ||
| shared_client->detach(); |
There was a problem hiding this comment.
Won't the callbacks_ pointer in the shared_client become dangling at this point?
There was a problem hiding this comment.
Nope. The callback pointer has as long lifetime as the whole request. It will clean up itself when the request is complete (no matter success or failure).
Signed-off-by: code <wbphub@gmail.com>
| // when calling the limit() function. To make sure we can call the detach() function | ||
| // safely, we convert the client_ to a shared_ptr. | ||
|
|
||
| std::shared_ptr<Filters::Common::RateLimit::Client> shared_client = std::move(client_); |
There was a problem hiding this comment.
Why do you need this dance with the shared_ptr though. It looks like the OnStreamDoneCallBack will be the only owner. Is it to call the detach() after the limit call?
There was a problem hiding this comment.
Yeah. It's possible the client may be deleted directly once the request failed directly and we have no way to know it out of the client. Keep a shared pointer to help us could always keep correct behavior. If the request failed directly, the detach() do nothing, and if the request still here, the detach() will do our expected cleanup.
|
Thank you very much @wbpcode! |
Commit Message: ratelimit: support detach request after sent limit request
Additional Description:
To close #40892. In the previous implementation, at stream complete phase, the substitution formatter couldn't works as expected for limit request. This is because no related context is provided to the grpc client to avoid potential lifetime problem.
This new implementation add a detach() method at
Grpc::AsyncRequestwhich will clean up the context (parent span, stream info, and so on). And then, at the rate limit filter, at stream complete phase, the stream info will be provided to the grpc client and then be cleaned up by the detach() to avoid potential dangling reference.Risk Level: low.
Testing: unit.
Docs Changes: n/a.
Release Notes: added.