core: attach debug information about stream to DEADLINE_EXCEEDED#5892
core: attach debug information about stream to DEADLINE_EXCEEDED#5892zhangkun83 merged 14 commits intogrpc:masterfrom
Conversation
…XCEEDED
Most cases the message will include a dump of the transport
attributes, which gives important information such as server
addresses.
Example error message:
deadline exceeded after 9699625ns. Channel state: READY. Stream: {delegate=[NettyClientStream attrs={remote-addr=local:in-process-1, io.grpc.internal.GrpcAttributes.securityLevel=NONE, io.grpc.internal.GrpcAttributes.clientEagAttrs={}, local-addr=local:E:c9ab2b05}]
Example output:
deadline exceeded after 8112071ns. TimeoutDetails{channel_state=READY, server_addr=foo.test.google.fr/127.0.0.1:44749}
| } | ||
|
|
||
| @Override | ||
| public void appendTimeoutDetails(ToStringHelper toStringHelper) { |
There was a problem hiding this comment.
TBH, I think StringBuilder is a better fit.
public void appendTimeoutDetails(StringBuilder buffer) {
buffer.append(key).append('=').append(value).append(',')
}The code that finally does toString can trim the last character. Alternatively, this could just be a Map<String, Object>.
There was a problem hiding this comment.
StringBuilder just makes implementation strictly more complex, as I would have to care for formatting. I would be willing to submit to List<String>, but the generic is a hassle with mockito.
Since nothing is ideal, and most look like abuse to some extent, I just created a custom type InsightBuilder. This would give us more flexibility in terms of formatting, just in case we want to make changes in the future.
ejona86
left a comment
There was a problem hiding this comment.
Overall, looks fair. It's unfortunate it has to be thread-safe, but I understand why, and it doesn't seem to bother the implementations much. Only concern is bookkeeping for retriable stream. I don't care about ToStringHelper vs StringBuilder. I'm fine with either.
| private final AtomicBoolean noMoreTransparentRetry = new AtomicBoolean(); | ||
|
|
||
| /** Keeps track of all substreams for informative purposes. */ | ||
| private final ConcurrentLinkedQueue<Substream> allSubstreams = |
There was a problem hiding this comment.
I'm concerned about how long we'll keep these streams alive. We could potentially be retaining an entire transport for "no reason."
I think we should add a new Collection<Status> or Collection<Status.Code> to State for the completed streams. The stream's Status can be easily passed to State.substreamClosed(Substream), and so when the stream is removed from drainedSubstreams its results will be present in the other list. There's no need for the other stream data for completed streams (like address) since it isn't normally provided.
There was a problem hiding this comment.
I have updated this PR according to your suggestion. Now it includes the status codes from all closed substreams. @dapengzhang0, according to javadoc on the drainedSubstreams field, it doesn't seem to always include unfinished substreams. Please advice on how to proceed.
There was a problem hiding this comment.
Per offline discussion, I have updated the code and added comments.
| * information. | ||
| */ | ||
| public final class InsightBuilder { | ||
| private final ArrayList<String> buffer = new ArrayList<String>(); |
There was a problem hiding this comment.
I don't think it matters here... it would add one more import line.
|
PTAL |
| @Override | ||
| public Void answer(InvocationOnMock in) { | ||
| InsightBuilder insight = (InsightBuilder) in.getArguments()[0]; | ||
| insight.appendKeyValue("server_addr", "2.2.2.2:443"); |
There was a problem hiding this comment.
existing terminology in our code I think points to this being a "backend_addr", not a server_addr. (or possibly remote_addr)
There was a problem hiding this comment.
Changed to remote_addr.
In grpc#5892 getAttributes() is called without any regard of timing. Currently DelayedStream.getAttributes() wil throw if called before passThrough was set. Just to be safe, we are removing that restriction and making it clear on the javadoc. On the other hand, we intend to keep the timing restriction on ClientCall.getAttributes().
In grpc#5892 getAttributes() is called without any regard of timing. Currently DelayedStream.getAttributes() wil throw if called before passThrough was set. Just to be safe, we are removing that restriction and making it clear on the javadoc. On the other hand, we intend to keep the timing restriction on ClientCall.getAttributes().
| public void appendTimeoutInsight(InsightBuilder insight) { | ||
| State currentState; | ||
| synchronized (lock) { | ||
| insight.appendKeyValue("closed", closedSubstreamsInsight); |
There was a problem hiding this comment.
closedSubstreamsInsight.toString()
It is very important that the insight is stringified within the lock. The insight implementation could have chosen to store the Object in the List and serialize at the end.
(Note: this could be considered an argument against passing Objects. Although we deal with this problem with logging already.)
There was a problem hiding this comment.
InsightBuilder already does that and that's documented in the javadoc.
| InsightBuilder substreamInsight = new InsightBuilder(); | ||
| sub.stream.appendTimeoutInsight(substreamInsight); | ||
| openSubstreamsInsight.append(substreamInsight); | ||
| insight.appendKeyValue("open", openSubstreamsInsight); |
There was a problem hiding this comment.
Shouldn't this be outside of the for loop?
There was a problem hiding this comment.
Good catch! Fixed.
| insight.appendKeyValue("buffered_nanos", streamSetTimeNanos - startTimeNanos); | ||
| realStream.appendTimeoutInsight(insight); | ||
| } else { | ||
| insight.append("waiting_for_connection"); |
There was a problem hiding this comment.
Nit: it'd be nice if these cases were more symmetric in the output string, moving this to the next line.
In #5892 getAttributes() is called without any regard of timing. Currently DelayedStream.getAttributes() wil throw if called before passThrough was set. Just to be safe, we are removing that restriction and making it clear on the javadoc. On the other hand, we intend to keep the timing restriction on ClientCall.getAttributes().
Works for #4740
AbstractClientStreaminclude remote address in insight if available.DelayedStreamadds buffered time, and the insight of real stream if it's set.RetriableStreaminsights outputs of Substreams..ManagedChannelImpladds channel state. Other information, like last error status, would need API changes toLoadBalancerExample error message:
or
This is related to #4776 but taking a more usage-specific approach.