Skip to content
This repository was archived by the owner on Sep 30, 2024. It is now read-only.

fix: Check (uploadID, path) pairs in bulk#63485

Merged
varungandhi-src merged 8 commits into
mainfrom
vg/find-doc-ids
Jun 28, 2024
Merged

fix: Check (uploadID, path) pairs in bulk#63485
varungandhi-src merged 8 commits into
mainfrom
vg/find-doc-ids

Conversation

@varungandhi-src

Copy link
Copy Markdown
Contributor

Fixes https://linear.app/sourcegraph/issue/GRAPH-700

Previously, there were two code paths which were checking if certain
upload IDs containing a specific repo-relative path. One was using the
better GetPathExists API, the another one that I recently added was
much more stupid, it was deserializing the full SCIP Document.

However, using GetPathExists in a loop is also not great; we should
push the loop into the database. This patch replaces both of those usage
sites with a single function that handles bulk filtering, and replaces
the GetPathExists lsifstore method with FindDocumentIDs.

Q: Why are we returning document IDs if the caller doesn't need them?
A: Getting the document ID is essentially "free", so I thought it would be
better to return a mapping instead of a set.

However, I didn't generalize the API to support multiple paths within
the same upload, as that would complicate the method signature somewhat.
(the argument and return types would be both collections.Set[Pair[int, UploadRelPath]]).

I'm open to changing the method to work with sets instead and renaming
it to CheckPathsExist.

Test plan

Modified GetPathExists test to use the new FindDocumentIDs API.

Changelog

@cla-bot cla-bot Bot added the cla-signed label Jun 26, 2024
@github-actions github-actions Bot added team/graph Graph Team (previously Code Intel/Language Tools/Language Platform) team/product-platform labels Jun 26, 2024
AssociatedIndexID *int `json:"associatedIndex"`
}

var _ core.UploadLike = &CompletedUpload{}

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed this because the other code currently works with []CompletedUpload, and I wanted to use that as []U where U implements UploadLike.

// The skipDBCheck function avoids consulting the database for certain uploads.
//
// The order of the returned slice matches the order of candidates.
func filterUploadsImpl[U core.UploadLike](

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is generic because:

  1. Reading and typing uploadsshared.CompletedUpload in multiple places gets old quickly.
  2. There's actually no need for the other fields, we only really care about the ID and the Root.

@kritzcreek kritzcreek left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for looking into this. Have a few comments (mostly docs related)

Comment thread internal/codeintel/codenav/service.go Outdated
Comment thread internal/codeintel/codenav/internal/lsifstore/lsifstore_documents.go Outdated
Comment thread internal/codeintel/codenav/internal/lsifstore/lsifstore_documents.go Outdated
Comment thread internal/codeintel/codenav/service.go Outdated
Comment thread internal/codeintel/codenav/service.go
@varungandhi-src varungandhi-src merged commit c92dc0f into main Jun 28, 2024
@varungandhi-src varungandhi-src deleted the vg/find-doc-ids branch June 28, 2024 08:31
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

cla-signed team/graph Graph Team (previously Code Intel/Language Tools/Language Platform) team/product-platform

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants