horizontal-scaling: message distributor service client #647
dubious90 merged 8 commits into envoyproxy:main
Conversation
- Adds the wire skeleton of the API for a service which distributes messages to other load gen services & sinks.
- Adds a gRPC client implementation + unit tests.

Signed-off-by: Otto van der Schaaf <ovanders@redhat.com>
…distributor-client Signed-off-by: Otto van der Schaaf <ovanders@redhat.com>
Signed-off-by: Otto van der Schaaf <ovanders@redhat.com>
@eric846 please review and assign back to me once done.
…distributor-client Signed-off-by: Otto van der Schaaf <ovanders@redhat.com>
EXPECT_CALL(*mock_reader_writer, Read(_)).WillOnce(Return(true)).WillOnce(Return(false));
// Capture the Nighthawk request DistributedRequest sends on the channel.
EXPECT_CALL(*mock_reader_writer, Write(_, _))
    .WillOnce(::testing::DoAll(::testing::SaveArg<0>(&request), Return(true)));
nit: don't need ::testing::
  return mock_reader_writer;
});

::nighthawk::DistributedRequest distributed_request;
nit: In the style guide I believe they want us to omit the leading :: except in using statements. There are also some occurrences of leading :: in other files.
absl::StatusOr<DistributedResponse> distributed_response_or =
    client.DistributedRequest(mock_nighthawk_service_stub, distributed_request);
EXPECT_TRUE(distributed_response_or.ok());
ASSERT_TRUE(request.has_execution_request());
With protos it would be safe to skip these ASSERTs and go right to the last EXPECT.
(depending on how valuable it is to know which level was missing)
Eliminated these lines in e1d6f59, leaning towards less code
Signed-off-by: Otto van der Schaaf <ovanders@redhat.com>
api/distributor/distributor.proto
Outdated
// turn delegate to one or more other services for actual handling.
message DistributedRequest {
  oneof distributed_request_type {
    client.ExecutionRequest execution_request = 1;
I'm not sure why we'd need to distribute sink requests among multiple sinks. Can you help clarify that a little?
My confusion is twofold here:
- Don't sink requests just return values stored by previous execution requests? If DistributedResponse isn't even returning anything, why would you ever go through this medium?
- Isn't the purpose of the sink to be a single point of truth where you can determine the whole set of results from multiple execution responses across multiple nighthawks? If so, isn't supporting a distributed architecture for them possibly defeating their very purpose?
- DistributedResponse is empty in this PR; its contents will be added in follow-up(s).
- When a DistributedRequest wraps a SinkRequest, the number of services must equal precisely one. I can see the confusion, but it doesn't seem to fit into the declarative validation model to annotate this as such.
Some thoughts:
Initially the idea was to have the distributor be a "generic" service: one that would accept arbitrary messages and forward them as-is to 1..n services within the cluster, and subsequently stream back the replies (wrapping these reply messages to annotate them with which internal service the replies on the stream were associated with). Consensus was that it would be better to not attempt the generic approach and instead be specific, hence the explicit wrapper messages we use for distributed requests & replies.

With that, the possibility to address > 1 sinks through the distributor stands out, and I think that is what is causing confusion. I'd be up for iterating on suggestions to enhance the modelling here.

The services field is shared across the message types that can be distributed, but the constraints on this field vary across message types, and right now this "knowledge" is implicit in clients consuming the distributor service.
For clarity and ensuring we're all on the same page, here's a diagram of what this is converging to:
Thank you for the explanation @oschaaf. I am assuming that when SinkRequest is sent, the client is trying to consume data stored in the sink service, rather than write them. With that I would like us to consider two suggestions that would remove this confusion.
- Does the client need to access the Sink service via the Distributor service, or are they chained just for convenience? Assuming the Distributor doesn't provide any critical functionality to the client - there is nothing wrong with dialing the Sink from the client directly. Would that work?
- If we do identify critical functionality that the Distributor needs to provide to the client when retrieving the results, I would suggest we make the approach more specific by defining a second RPC method solely for the purposes of getting the results from the Sink.
It is a convenience thing. Option 1 would be low impact to move to; I will clean up the API in this PR.
Thanks for updating here Otto!
api/distributor/distributor.proto
Outdated
import "validate/validate.proto";

// Perform an execution request through an intermediate service that will in
// turn delegate to one or more other services for actual handling.
Can we clarify in this comment that we are duplicating one request across multiple services? (Distributing to my understanding could also mean that we are choosing one service from your list to send it to)
Updated this in 4185e3c, let me know if it looks better
virtual ~NighthawkDistributorClient() = default;

/**
 * @brief Propagate messages to one or more other services for handling.
nit: I don't think we need brief here, because this is already only one line.
 *
 * @param nighthawk_distributor_stub Used to open a channel to the distributor service.
 * @param distributed_request Provide the message that the distributor service should propagate.
 * @return absl::StatusOr<::nighthawk::DistributedResponse> Either a status indicating failure, or
nit/optional: can we add a line break between the @params and @return? Ignore if this is unusual in nighthawk
Leaving this as-is, because I checked, and we don't tend to do that in this repository.
// response messages.
service NighthawkDistributor {
  // Propagate the message wrapped in DistributedRequest to one or more other services for handling.
  rpc DistributedRequestStream(stream DistributedRequest) returns (stream DistributedResponse) {
are these just streams because ExecutionRequest/ExecutionResponse are streams?
For the response stream: when initiating a load test and requesting it to start on multiple load gen services, you'll get back a stream of replies from each instance.
For the request stream: a client can re-use the stream to subsequently initiate a load test and query results from the sink.
Signed-off-by: Otto van der Schaaf <ovanders@redhat.com>
Signed-off-by: Otto van der Schaaf <ovanders@redhat.com>


