Skip to content
This repository was archived by the owner on Sep 26, 2023. It is now read-only.

Conversation

@mutianf
Copy link
Contributor

@mutianf mutianf commented Jun 23, 2021

Add a batchRequestThrottled(long throttledTimeMs) method in ApiTracer. The total throttled time is calculated in Batcher.
When a batch is sent, Batcher will create a BatchingCallContext with total throttled time, which will get recorded in TracedBatchingContextCallable.

@google-cla google-cla bot added the cla: yes This human has signed the Contributor License Agreement. label Jun 23, 2021
@mutianf mutianf force-pushed the throttle branch 2 times, most recently from e9a6657 to 8c3048a Compare June 28, 2021 14:26
Copy link
Contributor

@igorbernstein2 igorbernstein2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm after comments

@mutianf mutianf marked this pull request as ready for review July 12, 2021 14:54
@mutianf mutianf requested a review from a team July 12, 2021 14:54
@mutianf mutianf requested a review from a team as a code owner July 12, 2021 14:54
Copy link
Contributor

@elharo elharo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feature is big enough to deserve an issue describing the use case and the proposed solution.

*
* <p>For internal use only.
*/
@InternalApi("For internal use by google-cloud-java clients only")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

who else would even use GAX? I would prefer to reserve internal api annotations for things internal to Gax.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a dataflow adapter would use this information to publish a counter to let the dataflow service know that the job is being throttled

final ApiFuture<ResponseT> batchResponse =
unaryCallable.futureCall(accumulatedBatch.builder.build());
final ApiFuture<ResponseT> batchResponse;
if (unaryCallable instanceof TracedBatchedContextCallable) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this instanceof? Instanceofs are areally not something we would want to use unless absolutely necessary, since it is usually an indication of mistakes in OOP design of the application.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a workaround.

The general idea here is that we want to export state out a batcher so that we can send it to opencensus and eventually to dataflow counters. We dont want to couple the implementation of the tracing to the batcher. So we want to pass the state as a call context. The only 2 approaches I can think of are:

  1. try to insert it in the existing ApiCallContext and have the next callable probe for it
  2. have a special callable and have Batcher probe it

The second approach ended up creating a lot less breakage and least amount of overhead

Copy link
Contributor

@vam-google vam-google Jul 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please check the other comment about ApiCallContext vs BatcherCcallContext, but overall, the more I understand about this change the more I lean towards merging BatcherCallContext thing into ApiCallContext (it still may remain as an aggregated context ojbect with this exact BatcherCallContext name if you want)

The second approach ended up creating a lot less breakage and least amount of

Can you please clarify what is the extra overhead of the second approach over the first one?

Main issue with the first approach (currently implemented in this PR) is that it breaks chain of responsibility pattern, and if someone adds another wrapper on top of TracedBatchedContextCallable (which should be completely safe and typical thing to do in the chain of responsibility paradigm) it will break this code, because it makes an assumption that TracedBatchedCallContextCallable is always the root callable supplied to this BatcherImpl class (which it may be now, but somebody in the future may add some sort of AwesomeTracedBatchedCallContextCallable wrapping the TracedBatchedCallContextCallable).

}

void add(ElementT element, SettableApiFuture<ElementResultT> result) {
void add(ElementT element, SettableApiFuture<ElementResultT> result, long throttledTime) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an InternalApi, but it says that it is for use by google-cloud-java clients, so it is a breaking chagne for them. Can we make it in anon-breaking way (adding a method overload instead of directly changing the method signature).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this method is on a private class, so it doesnt leak to clients

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, yes, sorry, I did not realize it belonged to private Batch class, and thought it was part of BatcherImpl itself.

* <p>This is public only for technical reasons.
*/
@InternalApi
public abstract class BatchedContextCallable<RequestT, ResponseT>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wondering, can we make some of those InternalApi classes package-private?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this class crosses the boundary between tracing and batching so its shared by 2 packages

@Override
public UnaryCallable<RequestT, ResponseT> withDefaultCallContext(
final ApiCallContext defaultCallContext) {
throw new UnsupportedOperationException("withDefaultCallContext() not implemented");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since callables are wrapped in other callables, forming a long "chain of responsibility" pattern it is basically very hard to predic tin which context this method will be called. I.e. if it throws UnsupportedOperatoinException, it is likely that it will actualy get thrown in some valid execution workflow. Can we actually implement this method?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, no. This is here to make sure that this class is not misused. We can't implement it generically because the Batcher wont be able to pass it the context

* Creates an {@link ApiTracer} and annotates batching context data. Performs a call
* asynchronously.
*/
public ApiFuture<ResponseT> futureCall(RequestT request, BatchedCallContext batchedCallContext) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can batchedCallContext just become a part of ApiCallContext? So we will not have to have two methods here, one of which (this one) does not even fit in the overall callables hierarchy (the idea is that callables have their implementation of futureCall(request, apiCallContext), but don't have their own special contexts.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moving this to ApiCallContext is an option. There are 2 approaches:

  1. allowing ApiCallContext to generically store random data (similar to CallOptions). This will have a bit of overhead and makes this a lot more dynamically typed and I think harder to debug
  2. adding a specific getter/setter for batching context - this will make a generic construct like ApiContext have to know about specific things like batching that not relevant most of the time.

Localizing the interaction between a batcher and the next callable seems to be the least of the 3 evils

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having another futureCall method for a callable to actually do the sole thing why this callable exists does not really make it fit in the chain of responsibility pattern. Instead of seamlessly doing its custom thing, the callable must be called explicitly using the other method (the one with BatchedCallContext), which is not present in other callables in the same chain. This is basically the exact reason, why there is the instanceof statement (the one I commented about) in the BatcherImpl class.

In other words, the current implementation does make the job done, but it does not use the existing architecture in its idiomatic way. Since using callable must be done via calling futureCall(RequestT request, ApiCallContext context), if the custom implementaiton requires some extra information, it must be supplied as part of the generic ApiCallContext. This is basically why ApiCallContext is called in such generic way "context" meaning it has whatever may be needed in the call chain.

Or think about it the other way: if somebody decides to add another wrapper on top of TracedBatchedContextCallable and pass that in BatherImpl it will simply break the logic because instanceof will not work anymore. Note, an ability to add arbitrary number of wrappers (each with its features, which it adds to the chain of other features provided by other callables) is a must feature in this whole chain of wrappers, otherwise it just falls apart.

In other words, if you know how to make BatchedCallContext a part of ApiCallContext, please go ahead and do it.

/** Calls the wrapped {@link UnaryCallable} within the context of a new trace. */
@Override
public ApiFuture futureCall(RequestT request, ApiCallContext context) {
ApiCallContext mergedContext = baseCallContext.merge(context);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like there is a lot of duplication between these two futureCall methods. Can you please rewrite them such that the code is reused, instead of being duplicated?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactored the methods.

Copy link
Contributor

@vam-google vam-google left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything looks good except the instanceof and non-idiomatic potentially breaking usage of the chain-of-responsibility pattern looks good. Please address those comments first.

* Creates an {@link ApiTracer} and annotates batching context data. Performs a call
* asynchronously.
*/
public ApiFuture<ResponseT> futureCall(RequestT request, BatchedCallContext batchedCallContext) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having another futureCall method for a callable to actually do the sole thing why this callable exists does not really make it fit in the chain of responsibility pattern. Instead of seamlessly doing its custom thing, the callable must be called explicitly using the other method (the one with BatchedCallContext), which is not present in other callables in the same chain. This is basically the exact reason, why there is the instanceof statement (the one I commented about) in the BatcherImpl class.

In other words, the current implementation does make the job done, but it does not use the existing architecture in its idiomatic way. Since using callable must be done via calling futureCall(RequestT request, ApiCallContext context), if the custom implementaiton requires some extra information, it must be supplied as part of the generic ApiCallContext. This is basically why ApiCallContext is called in such generic way "context" meaning it has whatever may be needed in the call chain.

Or think about it the other way: if somebody decides to add another wrapper on top of TracedBatchedContextCallable and pass that in BatherImpl it will simply break the logic because instanceof will not work anymore. Note, an ability to add arbitrary number of wrappers (each with its features, which it adds to the chain of other features provided by other callables) is a must feature in this whole chain of wrappers, otherwise it just falls apart.

In other words, if you know how to make BatchedCallContext a part of ApiCallContext, please go ahead and do it.

final ApiFuture<ResponseT> batchResponse =
unaryCallable.futureCall(accumulatedBatch.builder.build());
final ApiFuture<ResponseT> batchResponse;
if (unaryCallable instanceof TracedBatchedContextCallable) {
Copy link
Contributor

@vam-google vam-google Jul 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please check the other comment about ApiCallContext vs BatcherCcallContext, but overall, the more I understand about this change the more I lean towards merging BatcherCallContext thing into ApiCallContext (it still may remain as an aggregated context ojbect with this exact BatcherCallContext name if you want)

The second approach ended up creating a lot less breakage and least amount of

Can you please clarify what is the extra overhead of the second approach over the first one?

Main issue with the first approach (currently implemented in this PR) is that it breaks chain of responsibility pattern, and if someone adds another wrapper on top of TracedBatchedContextCallable (which should be completely safe and typical thing to do in the chain of responsibility paradigm) it will break this code, because it makes an assumption that TracedBatchedCallContextCallable is always the root callable supplied to this BatcherImpl class (which it may be now, but somebody in the future may add some sort of AwesomeTracedBatchedCallContextCallable wrapping the TracedBatchedCallContextCallable).

}

void add(ElementT element, SettableApiFuture<ElementResultT> result) {
void add(ElementT element, SettableApiFuture<ElementResultT> result, long throttledTime) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, yes, sorry, I did not realize it belonged to private Batch class, and thought it was part of BatcherImpl itself.

@mutianf mutianf closed this Aug 16, 2021
@mutianf mutianf deleted the throttle branch August 16, 2021 14:16
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

cla: yes This human has signed the Contributor License Agreement.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants