Skip to content

Fix race between activeStreams and bdp window size#5494

Merged
zasweq merged 9 commits intogrpc:masterfrom
zasweq:bug-fix
Jul 13, 2022
Merged

Fix race between activeStreams and bdp window size#5494
zasweq merged 9 commits intogrpc:masterfrom
zasweq:bug-fix

Conversation

@zasweq
Copy link
Copy Markdown
Contributor

@zasweq zasweq commented Jul 11, 2022

Fixes #5358.

The test case I added based off the issue test would successfully fail unexpected.EOF. This was caused by a race between when a stream was created and when updateFlowControl() was called. This fix makes sure any created streams has the correct window size on creation.

RELEASE NOTES:

  • Fixed dynamic BDP estimation causing unexpected EOF error by fixing race between activeStreams and bdp window size

@zasweq zasweq requested a review from dfawley July 11, 2022 17:27
@zasweq zasweq added this to the 1.49 Release milestone Jul 11, 2022
Comment thread internal/transport/http2_client.go Outdated
Comment on lines 751 to 757
if !checkForHeaderListSize(it) {
return false
}
if !checkForHeaderListSize(it) {
if !checkForStreamQuota(it) {
return false
}
return true
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can all be simply return checkForHeaderListSize(it) && checkForStreamQuota(it)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Switched.

Comment thread test/end2end_test.go Outdated
Comment on lines +8071 to +8072
if e, ok := status.FromError(err); ok {
if e.Code() != codes.ResourceExhausted {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is simpler:

if code := status.Code(err); code != codes.ResourceExhausted {
	...
}

Note also that your test fails to detect if the returned error from UnaryCall is a non-status error (this isn't ever expected, of course, but is technically possible).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Switched and wrote a comment about your note.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wrote a comment about your note.

I don't understand. If you switched to my method instead of yours, my note doesn't apply anymore.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh duh true lol.

Copy link
Copy Markdown
Contributor

@easwars easwars left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also probably need a better release note. Else, we will get back to this when we make the release.

Comment thread test/end2end_test.go Outdated
Comment on lines +8069 to +8070
_, err := ss.Client.UnaryCall(ctx, &testpb.SimpleRequest{ResponseSize: 4194304})
if err != nil {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: combine the assignment and the conditional statements.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment thread test/end2end_test.go Outdated
Comment on lines +8080 to +8081
_, err = ss.Client.UnaryCall(ctx, &testpb.SimpleRequest{ResponseSize: 275075})
if err != nil {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: combine the assignment and the conditional statements.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment thread test/end2end_test.go Outdated
// this is testing for is an unexpected EOF with a status code
// INTERNAL so this is fine.
if code := status.Code(err); code != codes.ResourceExhausted {
t.Fatalf("unexpected err in UnaryCall: %v", err)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you mention the received and expected codes in the error string.

	got, want := status.Code(err), codes.ResourceExhausted; got != want {
		t.Fatalf("UnaryCall RPC returned status code: %v, want %v", got, want)
	}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Printing the whole error is better than only printing the code, as there might be status message info that's useful:

		t.Fatalf("UnaryCall RPC returned error: %v, want status code %v", err, want)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Switched to this:
if code := status.Code(err); code != codes.ResourceExhausted { t.Fatalf("UnaryCall RPC returned error: %v, want status code %v", err, codes.ResourceExhausted) }

Comment thread test/end2end_test.go Outdated
// should work normally.
_, err = ss.Client.UnaryCall(ctx, &testpb.SimpleRequest{ResponseSize: 275075})
if err != nil {
t.Fatalf("unexpected err in UnaryCall: %v", err)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefer concise error messages:

t.Fatalf("UnaryCall RPC failed: %v", err)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Switched.

@easwars easwars assigned zasweq and unassigned dfawley Jul 12, 2022
Comment thread internal/transport/http2_client.go Outdated
s.id = h.streamID
s.fc = &inFlow{limit: uint32(t.initialWindowSize)}
t.mu.Lock()
if t.activeStreams != nil { // Can be niled from Close()
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if t.activeStreams == nil {
	return false
}
t.activeStreams[s.id] = s

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Switched.

Copy link
Copy Markdown
Contributor Author

@zasweq zasweq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the comments :D! I'm sure some tests will fail from this stream change though give it a second to look at again.

Comment thread test/end2end_test.go Outdated
// this is testing for is an unexpected EOF with a status code
// INTERNAL so this is fine.
if code := status.Code(err); code != codes.ResourceExhausted {
t.Fatalf("unexpected err in UnaryCall: %v", err)
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Switched to this:
if code := status.Code(err); code != codes.ResourceExhausted { t.Fatalf("UnaryCall RPC returned error: %v, want status code %v", err, codes.ResourceExhausted) }

Comment thread internal/transport/http2_client.go Outdated
s.id = h.streamID
s.fc = &inFlow{limit: uint32(t.initialWindowSize)}
t.mu.Lock()
if t.activeStreams != nil { // Can be niled from Close()
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Switched.

@zasweq zasweq assigned easwars and dfawley and unassigned zasweq Jul 12, 2022
Comment thread internal/transport/http2_client.go Outdated
t.setGoAwayReason(f)
close(t.goAway)
t.controlBuf.put(&incomingGoAway{})
defer t.controlBuf.put(&incomingGoAway{})
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a comment explaining that we defer because t.mu is held

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added "Defer as t.mu is currently held".

Comment thread test/end2end_test.go Outdated
Comment on lines +8070 to +8072
// This check doesn't fail on a non status error. However, the main
// this is testing for is an unexpected EOF with a status code
// INTERNAL so this is fine.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Delete now. It does fail on a non-status error since code would be Unknown in that case.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops forgot deleted.

Comment thread test/end2end_test.go Outdated
for i := 0; i < 10; i++ {
// exceeds grpc.DefaultMaxRecvMessageSize, this should error with
// RESOURCE_EXHAUSTED error.
if _, err := ss.Client.UnaryCall(ctx, &testpb.SimpleRequest{ResponseSize: 4194304}); err != nil {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_, err := ss.Client.UnaryCall()
if got := status.Code(err); got != codes.ResourceExhausted {
	t.Fatal
}

This way you catch the case where the call doesn't fail (code will be OK), unexpectedly.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah good point switched.

@dfawley dfawley assigned zasweq and unassigned dfawley Jul 12, 2022
Copy link
Copy Markdown
Contributor Author

@zasweq zasweq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the comments :D!

Comment thread test/end2end_test.go Outdated
Comment on lines +8070 to +8072
// This check doesn't fail on a non status error. However, the main
// this is testing for is an unexpected EOF with a status code
// INTERNAL so this is fine.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops forgot deleted.

Comment thread test/end2end_test.go Outdated
for i := 0; i < 10; i++ {
// exceeds grpc.DefaultMaxRecvMessageSize, this should error with
// RESOURCE_EXHAUSTED error.
if _, err := ss.Client.UnaryCall(ctx, &testpb.SimpleRequest{ResponseSize: 4194304}); err != nil {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah good point switched.

Comment thread internal/transport/http2_client.go Outdated
t.setGoAwayReason(f)
close(t.goAway)
t.controlBuf.put(&incomingGoAway{})
defer t.controlBuf.put(&incomingGoAway{})
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added "Defer as t.mu is currently held".

@zasweq zasweq assigned dfawley and unassigned zasweq Jul 12, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

gRPC connection closing with unexpected EOF on high-latency connections

3 participants