bindings/go: Add memory usage test for streaming parser#181
Merged
Conversation
926abaa to
c3efd5a
Compare
Merged
2 tasks
b3a4ed6 to
f73a5d5
Compare
f73a5d5 to
b886c06
Compare
efritz
approved these changes
Jun 29, 2023
efritz
left a comment
Contributor
There was a problem hiding this comment.
Comments are just doc nits
efritz
approved these changes
Jun 29, 2023
2 tasks
coury-clark
pushed a commit
to sourcegraph/sourcegraph-public-snapshot
that referenced
this pull request
Jul 24, 2023
…55160) This should reduce memory usage significantly, as in the common case (no two documents share the same relative path), we will end up processing one document at a time. I've tested the new code with a 5.7GB Chromium index, and we're able to process it with even 100MB of memory (at the cost of increased GC pressure). We need to iterate over the index twice, first to get all external symbols, and then to process documents, as document processing requires access to the external symbols list. This means we need the ability to seek to the start again. I've implemented that as follows: - For small indexes, just read the index into a slice. - For large indexes, save the compressed index to a temporary file on disk, and rely on the GC and page cache to transparently drop pages earlier in the file when under memory pressure. I figured decompressing is cheap enough, that it doesn't make sense to have the extra I/O overhead of reading/writing the uncompressed index. Other changes: - Documents are no longer sorted by relative path during iteration The order of iteration is still deterministic though as it matches the order of documents in the index. Questions: - Should we add instrumentation see the memory usage at different stages of processing an index? - How do we add a memory usage test (either here or in the upstream scip bindings)? ([internal Slack discussion](https://sourcegraph.slack.com/archives/C3B3SDBMY/p1687751363185919)) ## Test plan - [x] Update existing tests - [x] Add low memory usage test -- added upstream. scip-code/scip#181 <br> Backport b98ca76 from #53828 Co-authored-by: Varun Gandhi <varun.gandhi@sourcegraph.com>
1 task
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
We rely on low memory usage in Sourcegraph. https://github.com/sourcegraph/sourcegraph/pull/53828
So add a test to avoid regressing memory usage.
Test plan
Added a new test