Skip to content
This repository was archived by the owner on Sep 30, 2024. It is now read-only.

[Backport 5.1] grpc: symbols: spread LocalCodeIntel response symbols across multiple messages#55292

Merged
BolajiOlajide merged 1 commit into
5.1from
backport-55242-to-5.1
Jul 26, 2023
Merged

[Backport 5.1] grpc: symbols: spread LocalCodeIntel response symbols across multiple messages#55292
BolajiOlajide merged 1 commit into
5.1from
backport-55242-to-5.1

Conversation

@github-actions

Copy link
Copy Markdown
Contributor

This PR changes the LocalCodeIntel symbols gRPC method to return a stream of chunks of symbols, rather than the entire result set in a single message.


gRPC has default limits on how large individual messages can be. This is because gRPC is a message-based framework, which means that messages have to be entirely loaded into memory for both sending and receiving. As such, large allocations can negatively impact server performance and stability. The go-grpc default limit is 4MB.

The original LocalCodeIntel RPC implementation (both gRPC and REST) returned the entire set of returned symbols in one message. For certain repositories / files, this message can contains thousands of symbols, which can add up to hundreds of MB or even gigabytes of memory.

This PR adjusts the server-side implementation of LocalCodeIntel to chunk up its full result set into smaller chunks (so that each individual gRPC message is smaller). It does this in a few steps.


Note that this PR doesn't "fix" the real issues with the underlying application itself. Notably:

  • The server is still calculating the entire result set in one shot (requiring it to hold all the symbols in a contiguous chunk of memory), as opposed to sending out incremental progress.
  • Is it necessary for the server to return thousands upon thousands of symbols, or should it support some sort of "limit" parameter?
  • The custom symbols client API still returns all these symbols in one giant slice (by joining all the received messages), so it still allocates a lot of memory - same as before as opposed to some sort of API where the result can be consumed incrementally.

However, we still aren't any worse off than we were before (the REST implementation still has the same memory allocation problems and sends the results in one giant JSON message). This PR does "work around" the gRPC-specific message size complaints while providing a building block to improve this in the future.

cc @sourcegraph/code-intel ^

Test plan

  1. CI
  2. Manual testing:
diff --git a/internal/grpc/chunk/chunker.go b/internal/grpc/chunk/chunker.go
index 947b721368..c954854fc6 100644
--- a/internal/grpc/chunk/chunker.go
+++ b/internal/grpc/chunk/chunker.go
@@ -42,7 +42,7 @@ type Chunker[T Message] struct {
}

// maxMessageSize is the maximum size per protobuf message
-const maxMessageSize = 1 * 1024 * 1024
+const maxMessageSize = 1 * 1024 * 1024 * 1024 * 1024
  • Nagivate to the above-mentioned repository and file, and hover over any token - see the expected log message:
 [       frontend] ERROR symbolsConnectionCache.gRPC.internal.error.reporter.streamingMethod.postMessageReceive internalerrs/logging.go:239 grpc: received message larger than max (708588568 vs. 94371840) {"grpcService": "symbols.v1.SymbolsService", "grpcMethod": "LocalCodeIntel", "grpcCode": "ResourceExhausted", "initialRequestJSON": "[omitted]"}

… messages (#55242)

Co-authored-by: Camden Cheek <camden@ccheek.com>
(cherry picked from commit 2a6c569)
@sourcegraph-bot

Copy link
Copy Markdown
Contributor

📖 Storybook live preview

@BolajiOlajide BolajiOlajide merged commit 276b100 into 5.1 Jul 26, 2023
@BolajiOlajide BolajiOlajide deleted the backport-55242-to-5.1 branch July 26, 2023 14:04
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants