-
Notifications
You must be signed in to change notification settings - Fork 104
feat: add batch throttled time to metrics #1413
Conversation
gax/src/main/java/com/google/api/gax/batching/BatchingCallContext.java
Outdated
Show resolved
Hide resolved
gax/src/main/java/com/google/api/gax/tracing/TracedBatchingContextCallable.java
Outdated
Show resolved
Hide resolved
e9a6657 to
8c3048a
Compare
igorbernstein2
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm after comments
gax/src/main/java/com/google/api/gax/tracing/BatchedContextCallable.java
Outdated
Show resolved
Hide resolved
gax/src/main/java/com/google/api/gax/tracing/BatchedContextCallable.java
Show resolved
Hide resolved
gax/src/main/java/com/google/api/gax/tracing/TracedBatchedContextCallable.java
Outdated
Show resolved
Hide resolved
elharo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feature is big enough to deserve an issue describing the use case and the proposed solution.
gax/src/main/java/com/google/api/gax/batching/BatchedCallContext.java
Outdated
Show resolved
Hide resolved
| * | ||
| * <p>For internal use only. | ||
| */ | ||
| @InternalApi("For internal use by google-cloud-java clients only") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
who else would even use GAX? I would prefer to reserve internal api annotations for things internal to Gax.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a dataflow adapter would use this information to publish a counter to let the dataflow service know that the job is being throttled
| final ApiFuture<ResponseT> batchResponse = | ||
| unaryCallable.futureCall(accumulatedBatch.builder.build()); | ||
| final ApiFuture<ResponseT> batchResponse; | ||
| if (unaryCallable instanceof TracedBatchedContextCallable) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need this instanceof? Instanceofs are areally not something we would want to use unless absolutely necessary, since it is usually an indication of mistakes in OOP design of the application.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a workaround.
The general idea here is that we want to export state out a batcher so that we can send it to opencensus and eventually to dataflow counters. We dont want to couple the implementation of the tracing to the batcher. So we want to pass the state as a call context. The only 2 approaches I can think of are:
- try to insert it in the existing ApiCallContext and have the next callable probe for it
- have a special callable and have Batcher probe it
The second approach ended up creating a lot less breakage and least amount of overhead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please check the other comment about ApiCallContext vs BatcherCcallContext, but overall, the more I understand about this change the more I lean towards merging BatcherCallContext thing into ApiCallContext (it still may remain as an aggregated context ojbect with this exact BatcherCallContext name if you want)
The second approach ended up creating a lot less breakage and least amount of
Can you please clarify what is the extra overhead of the second approach over the first one?
Main issue with the first approach (currently implemented in this PR) is that it breaks chain of responsibility pattern, and if someone adds another wrapper on top of TracedBatchedContextCallable (which should be completely safe and typical thing to do in the chain of responsibility paradigm) it will break this code, because it makes an assumption that TracedBatchedCallContextCallable is always the root callable supplied to this BatcherImpl class (which it may be now, but somebody in the future may add some sort of AwesomeTracedBatchedCallContextCallable wrapping the TracedBatchedCallContextCallable).
| } | ||
|
|
||
| void add(ElementT element, SettableApiFuture<ElementResultT> result) { | ||
| void add(ElementT element, SettableApiFuture<ElementResultT> result, long throttledTime) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an InternalApi, but it says that it is for use by google-cloud-java clients, so it is a breaking chagne for them. Can we make it in anon-breaking way (adding a method overload instead of directly changing the method signature).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this method is on a private class, so it doesnt leak to clients
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, yes, sorry, I did not realize it belonged to private Batch class, and thought it was part of BatcherImpl itself.
| * <p>This is public only for technical reasons. | ||
| */ | ||
| @InternalApi | ||
| public abstract class BatchedContextCallable<RequestT, ResponseT> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just wondering, can we make some of those InternalApi classes package-private?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this class crosses the boundary between tracing and batching so its shared by 2 packages
| @Override | ||
| public UnaryCallable<RequestT, ResponseT> withDefaultCallContext( | ||
| final ApiCallContext defaultCallContext) { | ||
| throw new UnsupportedOperationException("withDefaultCallContext() not implemented"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since callables are wrapped in other callables, forming a long "chain of responsibility" pattern it is basically very hard to predic tin which context this method will be called. I.e. if it throws UnsupportedOperatoinException, it is likely that it will actualy get thrown in some valid execution workflow. Can we actually implement this method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, no. This is here to make sure that this class is not misused. We can't implement it generically because the Batcher wont be able to pass it the context
| * Creates an {@link ApiTracer} and annotates batching context data. Performs a call | ||
| * asynchronously. | ||
| */ | ||
| public ApiFuture<ResponseT> futureCall(RequestT request, BatchedCallContext batchedCallContext) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can batchedCallContext just become a part of ApiCallContext? So we will not have to have two methods here, one of which (this one) does not even fit in the overall callables hierarchy (the idea is that callables have their implementation of futureCall(request, apiCallContext), but don't have their own special contexts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moving this to ApiCallContext is an option. There are 2 approaches:
- allowing ApiCallContext to generically store random data (similar to CallOptions). This will have a bit of overhead and makes this a lot more dynamically typed and I think harder to debug
- adding a specific getter/setter for batching context - this will make a generic construct like ApiContext have to know about specific things like batching that not relevant most of the time.
Localizing the interaction between a batcher and the next callable seems to be the least of the 3 evils
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having another futureCall method for a callable to actually do the sole thing why this callable exists does not really make it fit in the chain of responsibility pattern. Instead of seamlessly doing its custom thing, the callable must be called explicitly using the other method (the one with BatchedCallContext), which is not present in other callables in the same chain. This is basically the exact reason, why there is the instanceof statement (the one I commented about) in the BatcherImpl class.
In other words, the current implementation does make the job done, but it does not use the existing architecture in its idiomatic way. Since using callable must be done via calling futureCall(RequestT request, ApiCallContext context), if the custom implementaiton requires some extra information, it must be supplied as part of the generic ApiCallContext. This is basically why ApiCallContext is called in such generic way "context" meaning it has whatever may be needed in the call chain.
Or think about it the other way: if somebody decides to add another wrapper on top of TracedBatchedContextCallable and pass that in BatherImpl it will simply break the logic because instanceof will not work anymore. Note, an ability to add arbitrary number of wrappers (each with its features, which it adds to the chain of other features provided by other callables) is a must feature in this whole chain of wrappers, otherwise it just falls apart.
In other words, if you know how to make BatchedCallContext a part of ApiCallContext, please go ahead and do it.
| /** Calls the wrapped {@link UnaryCallable} within the context of a new trace. */ | ||
| @Override | ||
| public ApiFuture futureCall(RequestT request, ApiCallContext context) { | ||
| ApiCallContext mergedContext = baseCallContext.merge(context); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like there is a lot of duplication between these two futureCall methods. Can you please rewrite them such that the code is reused, instead of being duplicated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Refactored the methods.
vam-google
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Everything looks good except the instanceof and non-idiomatic potentially breaking usage of the chain-of-responsibility pattern looks good. Please address those comments first.
| * Creates an {@link ApiTracer} and annotates batching context data. Performs a call | ||
| * asynchronously. | ||
| */ | ||
| public ApiFuture<ResponseT> futureCall(RequestT request, BatchedCallContext batchedCallContext) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having another futureCall method for a callable to actually do the sole thing why this callable exists does not really make it fit in the chain of responsibility pattern. Instead of seamlessly doing its custom thing, the callable must be called explicitly using the other method (the one with BatchedCallContext), which is not present in other callables in the same chain. This is basically the exact reason, why there is the instanceof statement (the one I commented about) in the BatcherImpl class.
In other words, the current implementation does make the job done, but it does not use the existing architecture in its idiomatic way. Since using callable must be done via calling futureCall(RequestT request, ApiCallContext context), if the custom implementaiton requires some extra information, it must be supplied as part of the generic ApiCallContext. This is basically why ApiCallContext is called in such generic way "context" meaning it has whatever may be needed in the call chain.
Or think about it the other way: if somebody decides to add another wrapper on top of TracedBatchedContextCallable and pass that in BatherImpl it will simply break the logic because instanceof will not work anymore. Note, an ability to add arbitrary number of wrappers (each with its features, which it adds to the chain of other features provided by other callables) is a must feature in this whole chain of wrappers, otherwise it just falls apart.
In other words, if you know how to make BatchedCallContext a part of ApiCallContext, please go ahead and do it.
| final ApiFuture<ResponseT> batchResponse = | ||
| unaryCallable.futureCall(accumulatedBatch.builder.build()); | ||
| final ApiFuture<ResponseT> batchResponse; | ||
| if (unaryCallable instanceof TracedBatchedContextCallable) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please check the other comment about ApiCallContext vs BatcherCcallContext, but overall, the more I understand about this change the more I lean towards merging BatcherCallContext thing into ApiCallContext (it still may remain as an aggregated context ojbect with this exact BatcherCallContext name if you want)
The second approach ended up creating a lot less breakage and least amount of
Can you please clarify what is the extra overhead of the second approach over the first one?
Main issue with the first approach (currently implemented in this PR) is that it breaks chain of responsibility pattern, and if someone adds another wrapper on top of TracedBatchedContextCallable (which should be completely safe and typical thing to do in the chain of responsibility paradigm) it will break this code, because it makes an assumption that TracedBatchedCallContextCallable is always the root callable supplied to this BatcherImpl class (which it may be now, but somebody in the future may add some sort of AwesomeTracedBatchedCallContextCallable wrapping the TracedBatchedCallContextCallable).
| } | ||
|
|
||
| void add(ElementT element, SettableApiFuture<ElementResultT> result) { | ||
| void add(ElementT element, SettableApiFuture<ElementResultT> result, long throttledTime) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, yes, sorry, I did not realize it belonged to private Batch class, and thought it was part of BatcherImpl itself.
Add a
batchRequestThrottled(long throttledTimeMs)method inApiTracer. The total throttled time is calculated inBatcher.When a batch is sent,
Batcherwill create aBatchingCallContextwith total throttled time, which will get recorded inTracedBatchingContextCallable.