Skip to content

perf(spanner): Skip gRPC trailers for StreamingRead & ExecuteStreamingSql#11854

Merged
sakthivelmanii merged 4 commits intomainfrom
skip_grpc_trailers
Apr 14, 2025
Merged

perf(spanner): Skip gRPC trailers for StreamingRead & ExecuteStreamingSql#11854
sakthivelmanii merged 4 commits intomainfrom
skip_grpc_trailers

Conversation

@sakthivelmanii
Copy link
Copy Markdown
Contributor

@sakthivelmanii sakthivelmanii commented Mar 17, 2025

Description:
gRPC trailers are headers that are sent after all the data messages are sent by the server in the stream. Trailers cause an additional latency of 250-300 μs after the last message in the stream. From the server, we will receive a last flag that tells us whether a PartialResultSet is the last in the stream, and we no longer need to wait for trailers to assume that all messages have been received.

@sakthivelmanii sakthivelmanii requested review from a team March 17, 2025 13:44
@product-auto-label product-auto-label bot added the api: spanner Issues related to the Spanner API. label Mar 17, 2025
ResultSet *spannerpb.ResultSet
UpdateCount int64
ResumeTokens [][]byte
Last bool
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can we rename this field to something like SetLastFlag. Now it seems like this flag is to indicate that this StatementResult is the last, but it has a different meaning than that.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

t.Fatalf("ID Mismatch in the result")
}
}
if noOfRows != 1 {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can we make at least one of these tests a bit more interesting by having more than 1 row?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@sakthivelmanii sakthivelmanii force-pushed the skip_grpc_trailers branch 4 times, most recently from 1fecc1a to 1c32286 Compare March 17, 2025 14:39
Copy link
Copy Markdown
Contributor

@olavloite olavloite left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, with some nits on the test cases

@sakthivelmanii sakthivelmanii force-pushed the skip_grpc_trailers branch 2 times, most recently from d8e42ee to 53142d8 Compare March 17, 2025 17:20
@rahul2393 rahul2393 enabled auto-merge (squash) March 19, 2025 14:17
@rahul2393 rahul2393 disabled auto-merge March 19, 2025 14:19
@sakthivelmanii sakthivelmanii merged commit 10dc8b7 into main Apr 14, 2025
8 checks passed
@sakthivelmanii sakthivelmanii deleted the skip_grpc_trailers branch April 14, 2025 16:56
olavloite added a commit to googleapis/go-gorm-spanner that referenced this pull request Apr 28, 2025
Skip gRPC trailers at the end of successful queries.

See googleapis/google-cloud-go#11854
olavloite added a commit to googleapis/go-gorm-spanner that referenced this pull request Apr 28, 2025
Skip gRPC trailers at the end of successful queries.

See googleapis/google-cloud-go#11854
@yordis
Copy link
Copy Markdown

yordis commented Aug 1, 2025

Hey team, I am chasing an issue with Spanner library, where we get "context canceled" in the traces, and I tracked the situation to v1.80.0

It only happens for ExecuteStreamingSql

Any thoughts?

@rahul2393
Copy link
Copy Markdown
Contributor

@yordis Can you please try https://github.com/googleapis/google-cloud-go/releases/tag/spanner%2Fv1.81.1, it has fix for this case.

@yordis
Copy link
Copy Markdown

yordis commented Aug 1, 2025

I will try v1.81.1 but I was in v1.83.0 and it was failing, which is why I went back and did 1 release at a time.

@rahul2393
Copy link
Copy Markdown
Contributor

rahul2393 commented Aug 1, 2025

If it's happening in 1.83.0 then it means that issue is still happening.

Only possible case I can think of is, we cancel the context when doing iter.Stop()

r.cancel()

and then if we try to actually receive the message here
_, _ = s.Recv()
it will raise context cancel exception, basically it should depend on timing of when and what thread executes first.

@yordis in theory it should be happening for few ExecuteStreamingSql and not all the calls right?

@yordis
Copy link
Copy Markdown

yordis commented Aug 1, 2025

#12630 I just need help with the testing situation, I made it work and prove the brokeness, but still, I may need your help

@yordis
Copy link
Copy Markdown

yordis commented Aug 1, 2025

@yordis in theory it should be happening for few ExecuteStreamingSql and not all the calls right?

Yes, only ExecuteStreamingSql

If it's happening in 1.83.0 then it means that issue is still happening.

Yes

@yordis
Copy link
Copy Markdown

yordis commented Aug 1, 2025

Only possible case I can think of is, we cancel the context when doing iter.Stop()

I think we are doing defer iter.Stop() in the codebase, still, it was not reporting issues before this version, so the team lead decided to rollback

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

api: spanner Issues related to the Spanner API.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants