Skip to content
This repository was archived by the owner on Sep 30, 2024. It is now read-only.

[Backport 5.1] grpc: searcher: fix race condition with server.send message#54506

Merged
BolajiOlajide merged 1 commit into
5.1from
backport-54500-to-5.1
Jul 3, 2023
Merged

[Backport 5.1] grpc: searcher: fix race condition with server.send message#54506
BolajiOlajide merged 1 commit into
5.1from
backport-54500-to-5.1

Conversation

@github-actions

Copy link
Copy Markdown
Contributor

original issue

Slack link: https://sourcegraph.slack.com/archives/C04HCK4K3DL/p1687381299999559

>I'm getting this error running some searches using the internal search client
>
>
> > insightsSearchClient searcher/search.go:120 searchFilesInRepo failed {"error": "rpc error: code = Unknown desc = transport: received unexpected content-type \"application/octet-stream\"", "repo": "github.com/sourcegraph/sourcegraph"} >
>

background

The documentation for grpc.ServerStream.SendMsg() states:

>
> // SendMsg sends a message. On error, SendMsg aborts the stream and the
> // error is returned directly.
> // ...
> // It is safe to have a goroutine calling SendMsg and another goroutine
> // calling RecvMsg on the same stream at the same time, but it is not safe
> // to call SendMsg on the same stream in different goroutines
.
> // ...

The gRPC searcher.Search provides a callback to the search implementation that tells it to invoke grpc.SendMsg after it finds a given match:

https://github.com/sourcegraph/sourcegraph/blob/57932ebfd7056abf5a8158979cecd42b51267414/cmd/searcher/internal/search/search_grpc.go#L17-L27

The search impementation, delegates to regexSearch:

https://github.com/sourcegraph/sourcegraph/blob/0018a1223d531f3448d5119f80b7d2cfef29f02f/cmd/searcher/internal/search/search.go#L286-L288

And regexSearch ends up invoking the callback concurrently across a worker pool of goroutines:

https://github.com/sourcegraph/sourcegraph/blob/f10b37d2e2fc49e8a5eba3f40a88808420a6968d/cmd/searcher/internal/search/search_regex.go#L324-L362

Since SendMsg is now called across multiple goroutines, the Write() method in the HTTP2 gRPC stack now unexpectedly has two threadings running it:

https://github.com/grpc/grpc-go/blob/2997e84fd8d18ddb000ac6736129b48b3c9773ec/internal/transport/handler_server.go#L304-L315

This makes it possible for one thread to say that it's going to write the headers and park before it actually does so. The other thread continues executing and ends up using the default application/octet-stream content type since the other thread hasn't set the correct content type yet:

https://github.com/golang/net/blob/daac0cec0cf964a628a29bb4b82940c225b921ed/http2/server.go#L2580-L2582

this pr

This PR fixes the issue by wrapping a mutex around regexSearch's use of the callback - now only one thread can call grpc.SendMsg at a time.

This PR also introduces additional monitoring to flag the original transport message as an "internal error" (see https://github.com/sourcegraph/sourcegraph/pull/51749). While such race conditions won't always manifest in this way, seeing this error is a certain sign that something is wrong.

Test plan

CI

Backport ca5304f from #54500

@sourcegraph-bot

Copy link
Copy Markdown
Contributor

📖 Storybook live preview

@BolajiOlajide BolajiOlajide merged commit 3a0e0bd into 5.1 Jul 3, 2023
@BolajiOlajide BolajiOlajide deleted the backport-54500-to-5.1 branch July 3, 2023 11:44
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants