Skip to content
This repository was archived by the owner on Sep 30, 2024. It is now read-only.

gitserver: Use git-diff-tree to simplify validation and improve security#62709

Merged
eseliger merged 1 commit into
mainfrom
es/05-16-gitserverusegit-diff-treetosimplifyvalidationandimprovesecurity
May 16, 2024
Merged

gitserver: Use git-diff-tree to simplify validation and improve security#62709
eseliger merged 1 commit into
mainfrom
es/05-16-gitserverusegit-diff-treetosimplifyvalidationandimprovesecurity

Conversation

@eseliger

@eseliger eseliger commented May 15, 2024

Copy link
Copy Markdown
Member

At some point in the past, we noticed that passing /etc/passwd and /dev/null to git diff allows to view the file contents, even though it's not part of the repo itself.
That allows to read arbitrary files on disk, so we had to add in-app validation for file paths, including relative paths.
That makes the validation much more complex and harder to reason about, and requires that we didn't make a mistake.

Instead, we switch to git diff-tree which validates paths are part of the tree properly.
By that, we can remove a bunch of complexity around git diff argument validation.

See here for the difference:

➜  sourcegraph git:(main) git diff-tree --find-renames --full-index --inter-hunk-context=3 --no-prefix -- /etc/hosts /dev/null
fatal: /etc/hosts: '/etc/hosts' is outside repository at '/Users/erik/Code/sourcegraph/sourcegraph'
➜  sourcegraph git:(main) git --no-pager diff --find-renames --full-index --inter-hunk-context=3 --no-prefix -- /etc/hosts /dev/null
diff --git etc/hosts etc/hosts
deleted file mode 100644
index deadbeef..0000000000000000000000000000000000000000
--- etc/hosts
+++ /dev/null
@@ -1,16 +0,0 @@
-##
-# Host Database
-#
-# localhost is used to configure the loopback interface
-# when the system is booting.  Do not change this entry.
-##
-127.0.0.1	localhost

This PR also skips over arguments past the -- boundary which in git means "everything after this will not be interpreted as an argument".
This previously caused git cat-file HEAD -- -file.txt to fail argument validation, although it's perfectly valid and not a risk.

Test plan:

Adjusted the tests for argument validation and existing tests for diff and integration and E2E tests are still passing.

@cla-bot cla-bot Bot added the cla-signed label May 15, 2024

Copy link
Copy Markdown
Member Author

This stack of pull requests is managed by Graphite. Learn more about stacking.

Join @eseliger and the rest of your teammates on Graphite Graphite

@github-actions github-actions Bot added team/product-platform team/source Tickets under the purview of Source - the one Source to graph it all labels May 15, 2024
@eseliger eseliger force-pushed the es/05-16-gitserverusegit-diff-treetosimplifyvalidationandimprovesecurity branch 3 times, most recently from 1da530f to 66a5b51 Compare May 15, 2024 23:47
@eseliger eseliger marked this pull request as ready for review May 15, 2024 23:58
@eseliger eseliger requested review from a team May 15, 2024 23:58

@evict evict left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice! 👍

Comment thread cmd/gitserver/internal/git/gitcli/diff.go Outdated

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔥

At some point in the past, we noticed that passing /dev/passwd and /dev/null to `git diff` allows to view the file contents, even though it's not part of the repo itself.
That allows to read arbitrary files on disk, so we had to add in-app validation for file paths, including relative paths.
That makes the validation much more complex and harder to reason about, and requires that we didn't make a mistake.

Instead, we switch to `git diff-tree` which validates paths are part of the tree properly.
By that, we can remove a bunch of complexity around git diff argument validation.

See here for the difference:

```
➜  sourcegraph git:(main) git diff-tree --find-renames --full-index --inter-hunk-context=3 --no-prefix -- /etc/hosts /dev/null
fatal: /etc/hosts: '/etc/hosts' is outside repository at '/Users/erik/Code/sourcegraph/sourcegraph'
➜  sourcegraph git:(main) git --no-pager diff --find-renames --full-index --inter-hunk-context=3 --no-prefix -- /etc/hosts /dev/null
diff --git etc/hosts etc/hosts
deleted file mode 100644
index deadbeef..0000000000000000000000000000000000000000
--- etc/hosts
+++ /dev/null
@@ -1,16 +0,0 @@
-##
-# Host Database
-#
-# localhost is used to configure the loopback interface
-# when the system is booting.  Do not change this entry.
-##
-127.0.0.1	localhost
```

This PR also skips over arguments past the `--` boundary which in git means "everything after this will not be interpreted as an argument".
This previously caused `git cat-file HEAD -- -file.txt` to fail argument validation, although it's perfectly valid and not a risk.

Test plan:

Adjusted the tests for argument validation and existing tests for diff and integration and E2E tests are still passing.
@eseliger eseliger force-pushed the es/05-16-gitserverusegit-diff-treetosimplifyvalidationandimprovesecurity branch from e32fc38 to d3fa027 Compare May 16, 2024 14:01
@eseliger eseliger merged commit 4932f0a into main May 16, 2024
@eseliger eseliger deleted the es/05-16-gitserverusegit-diff-treetosimplifyvalidationandimprovesecurity branch May 16, 2024 15:34
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

cla-signed team/product-platform team/source Tickets under the purview of Source - the one Source to graph it all

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants