horizontal-scaling: message distributor service client #647

Merged
dubious90 merged 8 commits into envoyproxy:main from oschaaf:horizontal-scaling-distributor-client
Mar 19, 2021

Conversation

@oschaaf
Member

@oschaaf oschaaf commented Mar 15, 2021

  • Adds the wire skeleton of the api for a service which distributes
    messages to other load gen services & sinks.
  • Adds a gRPC client implementation + unit tests

Signed-off-by: Otto van der Schaaf ovanders@redhat.com

Otto van der Schaaf added 3 commits March 15, 2021 13:25
- Adds the wire skeleton of the api for a service which distributes
  messages to other load gen services & sinks.
- Adds a gRPC client implementation + unit tests

Signed-off-by: Otto van der Schaaf <ovanders@redhat.com>
…distributor-client

Signed-off-by: Otto van der Schaaf <ovanders@redhat.com>
Signed-off-by: Otto van der Schaaf <ovanders@redhat.com>
@oschaaf oschaaf marked this pull request as ready for review March 16, 2021 21:52
@oschaaf oschaaf added the waiting-for-review A PR waiting for a review. label Mar 16, 2021
@oschaaf oschaaf changed the title horizontal-scaling: message distributor horizontal-scaling: message distributor service client Mar 16, 2021
@dubious90
Contributor

@eric846 please review and assign back to me once done.

@dubious90 dubious90 requested a review from eric846 March 17, 2021 01:40
Otto van der Schaaf added 2 commits March 17, 2021 20:38
…distributor-client

Signed-off-by: Otto van der Schaaf <ovanders@redhat.com>
Signed-off-by: Otto van der Schaaf <ovanders@redhat.com>
eric846 previously approved these changes Mar 17, 2021
Contributor

@eric846 eric846 left a comment

LGTM

EXPECT_CALL(*mock_reader_writer, Read(_)).WillOnce(Return(true)).WillOnce(Return(false));
// Capture the Nighthawk request DistributedRequest sends on the channel.
EXPECT_CALL(*mock_reader_writer, Write(_, _))
    .WillOnce(::testing::DoAll(::testing::SaveArg<0>(&request), Return(true)));
Contributor

nit: don't need ::testing::

Member Author

Removed in e1d6f59

return mock_reader_writer;
});

::nighthawk::DistributedRequest distributed_request;
Contributor

nit: In the style guide I believe they want us to omit the leading :: except in using statements. There are also some occurrences of leading :: in other files.

Member Author

Didn't know that, and you're right, this crept into other files. Cleaned it up here in e1d6f59, filed #652 to track cleaning up the other places.

absl::StatusOr<DistributedResponse> distributed_response_or =
    client.DistributedRequest(mock_nighthawk_service_stub, distributed_request);
EXPECT_TRUE(distributed_response_or.ok());
ASSERT_TRUE(request.has_execution_request());
Contributor

With protos it would be safe to skip these ASSERTs and go right to the last EXPECT.

Contributor

(depending on how valuable it is to know which level was missing)

Member Author

Eliminated these lines in e1d6f59, leaning towards less code

@eric846 eric846 added waiting-for-changes A PR waiting for comments to be resolved and changes to be applied. and removed waiting-for-review A PR waiting for a review. labels Mar 17, 2021
Signed-off-by: Otto van der Schaaf <ovanders@redhat.com>
@oschaaf oschaaf added waiting-for-review A PR waiting for a review. and removed waiting-for-changes A PR waiting for comments to be resolved and changes to be applied. labels Mar 17, 2021
// turn delegate to one or more other services for actual handling.
message DistributedRequest {
  oneof distributed_request_type {
    client.ExecutionRequest execution_request = 1;
Contributor

I'm not sure why we'd need to distribute sink requests among multiple sinks. Can you help clarify that a little?

My confusion is twofold here:

  1. Don't sink requests just return values stored by previous execution requests? If DistributedResponse isn't even returning anything, why would you ever go through this medium?
  2. Isn't the purpose of the sink to be a single point of truth where you can determine the whole set of results from multiple execution responses across multiple nighthawks? If so, isn't supporting a distributed architecture for them possibly defeating their very purpose?

Member Author

  1. DistributedResponse is empty in this PR; its contents will be added in follow-up(s).
  2. When a DistributedRequest wraps a SinkRequest, the number of services must equal precisely one. I can see the confusion, but it doesn't seem to fit into the declarative validation model to annotate this as such.

Some thoughts:

Initially the idea was to have the distributor be a "generic" service: one that would accept arbitrary messages, forward them as-is to 1..n services within the cluster, and subsequently stream back the replies (wrapping each reply message to annotate which internal service it came from on the stream). Consensus was that it would be better to not attempt the generic approach and instead be specific, hence the explicit wrapper messages we use for distributed requests & replies.

With that, the possibility of addressing > 1 sinks through the distributor stands out, and I think that is what is causing confusion. I'd be up for iterating on suggestions to enhance the modelling here.

The services field is shared across the message types that can be distributed, but the constraints on this field vary across message types, and right now this "knowledge" is implicit in clients consuming the distributor service.
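To make that shape concrete, here is a hedged, illustrative sketch; field names and numbers beyond the `execution_request` field visible in this PR's diff are assumptions, not the actual .proto:

```proto
// Illustrative only -- not the exact .proto in this PR.
message DistributedRequest {
  // Assumed field: addresses of the services to forward to. Its size
  // constraint is implicit today: 1..n targets when wrapping an
  // ExecutionRequest, exactly one when wrapping a SinkRequest.
  repeated string services = 1;
  oneof distributed_request_type {
    client.ExecutionRequest execution_request = 2;
  }
}
```

That implicit size constraint is exactly the kind of knowledge that, as noted above, currently lives in clients rather than in declarative validation annotations.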

For clarity and ensuring we're all on the same page, here's a diagram of what this is converging to:

[diagram image]

Collaborator

Thank you for the explanation @oschaaf. I am assuming that when SinkRequest is sent, the client is trying to consume data stored in the sink service, rather than write them. With that I would like us to consider two suggestions that would remove this confusion.

  1. Does the client need to access the Sink service via the Distributor service, or are they chained just for convenience? Assuming the Distributor doesn't provide any critical functionality to the client - there is nothing wrong with dialing the Sink from the client directly. Would that work?
  2. If we do identify critical functionality that the Distributor needs to provide to the client when retrieving the results, I would suggest we make the approach more specific by defining a second RPC method solely for the purposes of getting the results from the Sink.

Member Author

It is a convenience thing. Option 1 would be low impact to move to; I will slim the api in this PR down accordingly.

Member Author

Done in 2206035

Member Author

Updated diagram:

[diagram image]

Contributor

Thanks for updating here Otto!

Member Author

Extended alternate/diagram that includes the adaptive load controller:

[diagram image]

In this flow the adaptive load controller runs behind nighthawk_service, which avoids the need to teach the adaptive load controller about sink and distributor services. (nighthawk_service will do that on its behalf).

import "validate/validate.proto";

// Perform an execution request through an intermediate service that will in
// turn delegate to one or more other services for actual handling.
Contributor

Can we clarify in this comment that we are duplicating one request across multiple services? (Distributing to my understanding could also mean that we are choosing one service from your list to send it to)

Member Author

Updated this in 4185e3c, let me know if it looks better

virtual ~NighthawkDistributorClient() = default;

/**
* @brief Propagate messages to one or more other services for handling.
Contributor

nit: I don't think we need brief here, because this is already only one line.

Member Author

Fixed in 4185e3c

*
* @param nighthawk_distributor_stub Used to open a channel to the distributor service.
* @param distributed_request Provide the message that the distributor service should propagate.
* @return absl::StatusOr<::nighthawk::DistributedResponse> Either a status indicating failure, or
Contributor

nit/optional: can we add a line break between the @params and @return? Ignore if this is unusual in nighthawk

Member Author

Leaving this as-is, because I checked, and we don't tend to do that in this repository.

// response messages.
service NighthawkDistributor {
  // Propagate the message wrapped in DistributedRequest to one or more other services for handling.
  rpc DistributedRequestStream(stream DistributedRequest) returns (stream DistributedResponse) {
Contributor

are these just streams because ExecutionRequest/ExecutionResponse are streams?

Member Author

For the response stream: when initiating a load test and requesting it to start on multiple load gen services, you'll get back a stream of replies from each instance.
For the request stream: a client can re-use the stream to subsequently initiate a load test and query results from the sink.
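As a hedged pseudocode sketch of that reuse (method and message names follow the quoted API; the control flow is illustrative, and the sink phase reflects the convenience routing discussed earlier in the thread):

```
// One long-lived bidirectional stream, reused across phases.
stream = distributor_stub.DistributedRequestStream()

// Phase 1: fan one load test out to n load generation services.
stream.Write(DistributedRequest{ execution_request })
while stream.Read(&response):   // replies arrive per targeted service
    collect(response)

// Phase 2: reuse the same stream to query results from the sink.
stream.Write(DistributedRequest{ sink_request })
while stream.Read(&response):
    collect(response)

stream.WritesDone(); stream.Finish()
```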

@dubious90 dubious90 added waiting-for-changes A PR waiting for comments to be resolved and changes to be applied. and removed waiting-for-review A PR waiting for a review. labels Mar 18, 2021
Signed-off-by: Otto van der Schaaf <ovanders@redhat.com>
@oschaaf oschaaf added waiting-for-review A PR waiting for a review. and removed waiting-for-changes A PR waiting for comments to be resolved and changes to be applied. labels Mar 18, 2021
Signed-off-by: Otto van der Schaaf <ovanders@redhat.com>
@oschaaf oschaaf added waiting-for-changes A PR waiting for comments to be resolved and changes to be applied. and removed waiting-for-review A PR waiting for a review. waiting-for-changes A PR waiting for comments to be resolved and changes to be applied. labels Mar 18, 2021
@oschaaf oschaaf added the waiting-for-review A PR waiting for a review. label Mar 18, 2021
@dubious90 dubious90 merged commit 211b3b5 into envoyproxy:main Mar 19, 2021

Labels

waiting-for-review A PR waiting for a review.

4 participants