Skip to content
This repository was archived by the owner on Sep 30, 2024. It is now read-only.

codeintel: Validate package names before insertion.#39444

Merged
varungandhi-src merged 2 commits into
mainfrom
vg/validate-before-insert
Jul 27, 2022
Merged

codeintel: Validate package names before insertion.#39444
varungandhi-src merged 2 commits into
mainfrom
vg/validate-before-insert

Conversation

@varungandhi-src

Copy link
Copy Markdown
Contributor

Fixes https://github.com/sourcegraph/sourcegraph/issues/33251, suggested by @mrnugget.

Test plan

Added test case.

@sourcegraph-bot

sourcegraph-bot commented Jul 26, 2022

Copy link
Copy Markdown
Contributor

Codenotify: Notifying subscribers in CODENOTIFY files for diff c664999...452e2dc.

Notify File(s)
@efritz enterprise/cmd/worker/internal/codeintel/indexing/dependency_sync_scheduler.go
enterprise/cmd/worker/internal/codeintel/indexing/dependency_sync_scheduler_test.go

// applying any normalization or necessary transformations that lsif uploads
// require for internal consistency.
func newPackage(pkg shared.Package) precise.Package {
func newPackage(pkg shared.Package) (*precise.Package, error) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think its fine to return precise.Package here, as we can see theres a bit of extra work with *precise.Package, unless you really want the out: nil in tests

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not about the out: nil in the test case; returning nil prevents a caller from accidentally using an invalid pkg if there is an error (and if it gets used, there is a crash sooner rather than multiple steps later). I know it is unlikely in this specific case, but it seemed safer to do that rather than not.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opinion piece: There's precisely one case in Go where you are able to read the non-error value on a non-nil error return (and that's use of io.Reader).

Also it's a bit odd for a NewX method to fail. Could I interest you in a softer alternative such as MakeX?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify: is your suggestion that I rename the method or is the suggestion that I return a meaningless value in the err case?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, those were two separate points. Returning precise.Package{}, err is idiomatic, and ALSO a NewX constructor returning an error catches me up (and I've heard others comment on it). Not sure if there is widespread hate but it's something that bleeps in my head each time I see it.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I'll change that in a follow-up PR.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(It's also fairly nit, I just wanted to communicate it)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, no take-backsies.

case dependencies.JVMPackagesScheme:
p.Name = strings.TrimPrefix(p.Name, "maven/")
p.Name = strings.ReplaceAll(p.Name, "/", ":")
case dependencies.NpmPackagesScheme:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why only do this for npm? Shouldn't we do it for all the package schemas? I assumed we could extend this table to have a name-validator function too and then validate every package we insert:

https://github.com/sourcegraph/sourcegraph/blob/9aa2dfcb86afe1562829d49c22975019cf6cf9cd/enterprise/cmd/worker/internal/codeintel/indexing/dependency_sync_scheduler.go#L25-L30

@varungandhi-src varungandhi-src Jul 26, 2022

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, I will fix this tomorrow, we can use the ParsePackageFromName API.

Unrelated to that: I also noticed that we have extra custom code here:

func (s *npmPackagesSyncer) ParsePackageFromName(name reposource.PackageName) (reposource.Package, error) {
	return s.ParsePackageFromRepoName(api.RepoName("npm/" + strings.TrimPrefix(string(name), "@")))
}

This should be calling one of the npm-specific functions directly instead of modifying the text...

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per Slack, I will do this cleanup after fixing https://github.com/sourcegraph/sourcegraph/issues/39489

@varungandhi-src varungandhi-src merged commit d8516e8 into main Jul 27, 2022
@varungandhi-src varungandhi-src deleted the vg/validate-before-insert branch July 27, 2022 13:22
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Fix illegal package names entering database

5 participants