ratelimit: convert to Grpc::AsyncClient #1267
Conversation
@mattklein123 This is WIP; I still need to add some more unit tests for coverage and the actual integration test that motivated this, but I thought I would throw it out for comments early.
mattklein123
left a comment
A few comments but at a high level LGTM.
```proto
option go_package = "ratelimit";
option cc_generic_services = true;
```
It might be out of scope for this change, but if possible can we pull this proto from envoy-api and delete this file also? There is a GH open on that.
Yeah, I think we want to cut to the v2 API when we switch to envoy-api, and there we have a different package namespace etc., so that's best left as future work.
```cpp
ASSERT(callbacks_);
channel_->cancel();
ASSERT(callbacks_ != nullptr);
ASSERT(stream_ != nullptr);
```
nit: it will crash on the next line in an obvious way anyway.
```cpp
callbacks_->complete(status);
callbacks_ = nullptr;
}
LimitStatus limit_status = LimitStatus::OK;
```
Question: It seems kind of odd to do this logic in the context of onRemoteClose(). Why not onReceiveMessage() ? I guess the larger question is for unary APIs (which are very common) can we make coding against this easier vs. having to remember to close the stream, etc.?
Yep, this is a good idea, I was planning to provide a unary RPC interface that didn't require users to keep track of the minutiae of stream state before finalizing the PR.
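As a rough illustration of the unary wrapper being discussed, here is a minimal sketch. All of these types and names (`Stream`, `UnaryRequester`) are hypothetical stand-ins, not Envoy's actual `Grpc::AsyncClient` API; the point is only that the helper sends one message, half-closes, and completes the caller's callback on the first response, so the caller never tracks stream state.

```cpp
#include <functional>
#include <string>

// Hypothetical, simplified stand-in for a streaming gRPC client stream.
struct Stream {
  std::function<void(const std::string&)> on_message;
  void sendMessage(const std::string& /*msg*/, bool /*end_stream*/) {}
};

// A unary helper built on top of the stream: send one message with
// end_stream=true, then complete the caller's callback on the first response.
class UnaryRequester {
public:
  void send(Stream& stream, const std::string& request,
            std::function<void(const std::string&)> on_complete) {
    stream.on_message = [on_complete](const std::string& resp) { on_complete(resp); };
    stream.sendMessage(request, /*end_stream=*/true);
  }
};
```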
@mattklein123 This now works and is ready for review. It's pretty big; if you are OK reviewing it as is, that's great, otherwise I can try to tease it apart into smaller PRs. @msample I'm hoping this resolves your issue when using
I'm working on fixing #1254, then I can review.
mattklein123
left a comment
Very nice. A few comments.
include/envoy/grpc/async_client.h
Outdated
```cpp
                          AsyncClientCallbacks<ResponseType>& callbacks,
                          const Optional<std::chrono::milliseconds>& timeout) PURE;
virtual AsyncStream<RequestType>* start(const Protobuf::MethodDescriptor& service_method,
                                        AsyncStreamCallbacks<ResponseType>& callbacks,
```
Since we are now splitting streaming and unary, can we kill the timeout in the streaming case? IMO it doesn't make any sense. (I know the underlying AsyncClient stuff still has this problem, but we might as well fix it here?)
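To make the suggested split concrete, here is a hypothetical interface sketch (illustrative names only, not Envoy's real API, and using `std::optional` in place of Envoy's `Optional`): the unary path keeps a per-request timeout, while the streaming path drops the parameter entirely since the stream lives until explicitly closed.

```cpp
#include <chrono>
#include <optional>

// Illustrative sketch of the unary/streaming interface split.
template <class Req, class Resp> struct AsyncClientSketch {
  // Unary: a deadline makes sense for a single request/response pair.
  virtual void send(const Req& request,
                    std::optional<std::chrono::milliseconds> timeout) = 0;
  // Streaming: no timeout parameter; the stream stays open until closed.
  virtual void start() = 0;
  virtual ~AsyncClientSketch() = default;
};
```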
```cpp
grpc_stream->set_stream(http_stream);
grpc_stream->moveIntoList(std::move(grpc_stream), active_streams_);
return dynamic_cast<AsyncRequestImpl<RequestType, ResponseType>*>(
```
perf nit: This has cost. Typically I would just hold a local reference/pointer to the unique_ptr, then return it after you add to the list.
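The pattern the reviewer suggests can be sketched as follows (with a toy `Item` type standing in for the stream object): capture a raw pointer from the `unique_ptr` before moving it into the list, then return that pointer, avoiding the `dynamic_cast` on the list entry entirely.

```cpp
#include <list>
#include <memory>

struct Item {
  int id;
  explicit Item(int i) : id(i) {}
};

// Hold a raw pointer before ownership moves into the list, then return it;
// no dynamic_cast of the stored entry is needed afterwards.
Item* addToList(std::list<std::unique_ptr<Item>>& items, int id) {
  auto item = std::make_unique<Item>(id);
  Item* raw = item.get();           // capture before ownership moves
  items.push_back(std::move(item)); // the list now owns the object
  return raw;
}
```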
```cpp
LinkedObject<AsyncClientStreamImpl<RequestType, ResponseType>>::removeFromList(
    parent_.active_streams_);
if (LinkedObject<AsyncStreamImpl<RequestType, ResponseType>>::inserted()) {
  parent_.dispatcher_.deferredDelete(
```
You probably already looked at this, but just double check that the stream/request does not reference client in its destructor, or there might be ordering issues depending on how client is destructed. (I haven't looked at code but just pointing this out from past experience).
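The safe shape of that ordering constraint can be illustrated with a toy sketch (these types are hypothetical, not Envoy's): if the client's destructor tears down its stream list, each stream's destructor must only touch the stream's own members, never reach back into the parent client, whose members may already be destroyed.

```cpp
#include <list>
#include <memory>

// A stream whose destructor only touches its own state; it records its
// destruction through a flag pointer it owns rather than via a parent pointer.
struct StreamSketch {
  bool* destroyed_flag;
  explicit StreamSketch(bool* flag) : destroyed_flag(flag) {}
  ~StreamSketch() {
    // Safe: only local state is used here; no parent_ dereference.
    *destroyed_flag = true;
  }
};

// The client owns the streams; when it is destroyed, the list destructor runs
// the stream destructors, which must not call back into ClientSketch.
struct ClientSketch {
  std::list<std::unique_ptr<StreamSketch>> active_streams_;
};
```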
```cpp
// the immediate failure case.
if (inserted()) {
  removeFromList(parent_.active_streams_);
  dispatcher().deferredDelete(removeFromList(parent_.active_streams_));
```
Same comment here about referencing client in destructor.
```cpp
typedef Grpc::AsyncRequestCallbacks<pb::lyft::ratelimit::RateLimitResponse> RateLimitAsyncCallbacks;

class GrpcClientImpl : public Client, public RateLimitAsyncCallbacks {
```
In looking at this code again, I realize that it's kind of dumb that we make a new "client" for every filter/request. Can you just drop a TODO in here while you are in here to optimize this at some point? (We should have thread local client per filter, and track outstanding requests).
```cpp
AsyncStream<RequestType>* start(const Protobuf::MethodDescriptor& service_method,
                                AsyncStreamCallbacks<ResponseType>& callbacks) override {
  const Optional<std::chrono::milliseconds> no_timeout;
```
Why was this change needed to fix ASAN? This still feels kind of busted to me. Aren't we storing a reference to the timeout inside the AsyncStreamImpl? If so should this be at client scope?
I checked and it does get copied inside the underlying Http::AsyncClient::Stream, in the owned RouteEntryImpl https://github.com/lyft/envoy/blob/0efa18c36d6f789562690b5053c2c4b00987979e/source/common/http/async_client_impl.h#L123.
More generally, this got me wondering whether we should have a better convention for handling reference ownership semantics in Envoy: it's hard to tell from a method's type signature alone whether it will store a reference for later use or take a copy. I think there's a fair bit of mixed use of this in Envoy.
The Grpc::AsyncStreamImpl was not owning the timeout in period between object construction and passing to the Http::AsyncClient::Stream in initialize(), which is what triggered the ASAN fail.
I see, it gets used in the context of initialize(). Sure open to suggestions on naming. Same problem applies to pointers or references. Not sure if there is a good naming scheme.
A first thought is `FooBar&` for the case when a reference is not retained past the unwinding of the call stack, and `FooBarRetainedRef` for when it is. But that's super verbose. @dnoe for thoughts as well.
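As a toy illustration of the convention being floated, the sketch below uses a `RetainedRef` alias whose name signals that the reference outlives the call. None of these names exist in Envoy; they are purely hypothetical, and the alias adds no enforcement, only a documentation hint at the call site.

```cpp
#include <string>

// Hypothetical alias: the name signals "this reference is stored past the
// call", in contrast to a plain T& parameter.
template <class T> using RetainedRef = T&;

class Printer {
public:
  // The parameter type's name warns callers that the reference is kept.
  explicit Printer(RetainedRef<const std::string> name) : name_(name) {}
  const std::string& name() const { return name_; }

private:
  const std::string& name_; // retained; the caller must keep `name` alive
};
```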