git: worktree path hardening#2097
Conversation
d56a879 to
11b42cb
Compare
9f4e29f to
f114dc2
Compare
| if err := pathutil.ValidTreePath(filename); err != nil { | ||
| return err | ||
| } |
There was a problem hiding this comment.
Let's move this away from here and directly into index.Index. Combined index.Index and object.Tree should be able to restrict the points needed to enforce this across the board.
There was a problem hiding this comment.
Index.Add(path) now validates and returns (*Entry, error). As every add / move flow that records a path in the index funnels through here, the worktree-side chokepoint at addOrUpdateFileToIndex is now gone. Combined with the tree-side gates (FindEntry / TreeWalker.Next / TreeEntryFile), Index and Tree own this end-to-end.
| if err := pathutil.ValidTreePath(change.From.Name); err != nil { | ||
| return err | ||
| } |
There was a problem hiding this comment.
This could potentially be delegated at merkletrie/change.go. WDYT?
There was a problem hiding this comment.
I went one level up: TreeWalker.Next validates each leaf, so the Change reaching CherryPick is name-validated transitively.
Kept the gate at the walker since that's where tree entries surface. A malformed entry stops the walk before merkletrie ever sees it. Putting it in merkletrie/change.go would push tree-specific knowledge into the generic differ.
| from := ch.From.String() | ||
| if err := pathutil.ValidTreePath(from); err != nil { | ||
| return err | ||
| } |
There was a problem hiding this comment.
If check happens at Change, we remove the call here.
There was a problem hiding this comment.
Removed. Modify/Insert is covered by FindEntry one line down. Delete returns early via rmFileAndDirsIfEmpty against the tolerant wrapper, reached only from resetWorktree's filesystem-vs-index diff, where we want cleanup of legitimately-tracked shapes like submodule/.git to succeed rather than abort the whole reset.
Add a `parseConfigBool` helper that mirrors upstream Git's `git_parse_maybe_bool`: it accepts `true`/`yes`/`on`/`1` and `false`/`no`/`off`/`0` case-insensitively, and returns `OptBoolUnset` for empty or unrecognised values so the caller's platform default stays in place. `unmarshalCore` swaps `strconv.ParseBool` for `parseConfigBool` when reading `core.protectNTFS` and `core.protectHFS`. The previous `strconv.ParseBool` path silently misinterpreted user-friendly syntax: writing `protectNTFS = on` made `strconv.ParseBool` return an error, which left the field at its zero value (`OptBoolUnset`), so on Windows the platform default applied — but a user who wrote `protectNTFS = on` to deliberately enable the protection on Linux would have got the platform default (`false`) instead of the explicit `true` they intended. With the tolerant parser, all of `true` / `yes` / `on` / `1` (and their false counterparts) take effect where the user expects them to. Other booleans in this package keep the loose `== "true"` pattern; aligning them is out of scope. Only the security toggles are upgraded here, where silent misinterpretation has the highest cost. Reference: upstream Git `git_parse_maybe_bool_text` at `parse.c` L157-L173 and `git_parse_maybe_bool` at `parse.c` L174-L182 in tag `v2.54.0`[1]. `git_parse_maybe_bool` is the closer match, since it also accepts integer values via `git_parse_int`. [1]: https://github.com/git/git/blob/v2.54.0/parse.c#L157-L182 Assisted-by: Claude Opus 4.7 Signed-off-by: Hidde Beydals <hidde@hhh.computer>
Lift platform-specific dotgit-variant detection out of package `git` into a self-contained package so it can be reached from any caller — `config` (go-git#2079's submodule-name validation), `storage/filesystem/dotgit` (the `Module(name)` containment check), and the tree-side gates added in a follow-up commit — without those callers depending on the root `git` package. The package collects three layers: - HFS+ side: the ignored-codepoint table plus `IsHFSDot` / `IsHFSDotGit` / `IsHFSDotGitmodules` family (zero-width / case-folding aware). Implementations are unchanged from the previous `git`-package versions; only the package boundary moves. - NTFS / Windows side: `IsNTFSDot` (a port of canonical Git's `is_ntfs_dot_generic`), `IsNTFSDotGitmodules`, the `WindowsValidPath` predicate, and the reserved-name table (`CON`, `NUL`, `AUX`, …). The `dotGit = ".git"` constant is declared locally to avoid coupling pathutil to `git.GitDirName`. - Cross-platform helper: `IsDotGitName` matches `.git` and its 8.3 NTFS short alias `git~1` case-insensitively. On top of these primitives sits `ValidTreePath`, the strict validator applied at the boundary where attacker-controlled tree data leaves the trusted store. Where the wrapper-layer `validPath` in package `git` is intentionally tolerant of final-position `.git` (legitimate `submodule/.git` flows in submodule cleanup) and only consults HFS+/NTFS variants when the corresponding `core.protect*` flag is on, `ValidTreePath` is always-strict regardless of runtime config: tree paths are canonical UTF-8 with no zero-width characters or 8.3 short- name forms, so an entry that looks like one is suspicious anywhere. It rejects control characters, empty / `.` / `..` components, Windows volume-name prefixes, `.git` and its HFS+/NTFS variants at every position, and reserved device names. Mirrors upstream Git's `verify_path_internal` at `read-cache.c` L987-L1048 in tag `v2.54.0`[1], stripped of its runtime `protect_hfs` / `protect_ntfs` gating because the pathutil layer is consulted from the strict tree boundary, not from application paths. [1]: https://github.com/git/git/blob/v2.54.0/read-cache.c#L987-L1048 Assisted-by: Claude Opus 4.7 Signed-off-by: Hidde Beydals <hidde@hhh.computer>
Apply `pathutil.ValidTreePath` at the chokepoints where tree
data crosses out of the trusted object store and where
application-supplied paths cross into the index. Layered on
top of the existing tolerant wrapper `validPath` in package
`git`, this gives the worktree two layers of protection:
strict validation at the boundary, tolerant validation at
the filesystem edge for legitimate flows (`submodule/.git`
Stat / Remove during submodule cleanup).
Read-side chokepoints in `plumbing/object`:
- `(*Tree).FindEntry` — most callers funnel through here:
`(*Tree).File`, `(*Tree).Tree`, `(*Tree).Size`, and the
`checkoutChange` Modify/Insert branch. A dangerous tree-
derived path is refused at the lookup boundary before
anything materialises.
- `TreeWalker.Next` — drives `transformChildren` (which
feeds `merkletrie.DiffTree`), `FileIter`, and the archive
writers in `internal/archive`. Each leaf entry name is
validated as it surfaces; a malformed entry stops the
walk with the validator's error rather than skipping
silently. Inspection-only callers that need raw access
can still read `Tree.Entries` directly.
- `(*Tree).TreeEntryFile` — boundary where a `*File` whose
Name a caller can hand to filesystem ops leaves the
store.
Write-side chokepoint in `plumbing/format/index`:
- `Index.Add(path)` now validates and returns
`(*Entry, error)`. Every Add and Move flow that records
a path in the index funnels through here, mirroring
upstream Git's `verify_path_internal` invocation from
`make_cache_entry` on the index-addition side.
`Index.Add`'s signature change is a v6 API break.
Application-side gates in package `git`:
- The wrapper-level `validPath` continues to gate
filesystem writes; HFS+/NTFS-aware rejection of
`.gitmodules` symlink targets is now driven by the same
`pathutil` predicates so the wrapper and the strict
validator stay aligned. The control-character loop is
byte-oriented for upstream parity with
`verify_path_internal`.
- `Submodule.Repository`'s `Chroot` validates the
submodule's tree-stored Path before scoping the
repository, refusing embedded `.git` / HFS+ / NTFS
variants regardless of `core.protectHFS` /
`core.protectNTFS`.
- `Worktree.checkoutChange`'s Delete branch is reached
only from `resetWorktree`'s filesystem-vs-index
merkletrie diff (`resetWorktreeToTree`'s tree-derived
deletes call `rmFileAndDirsIfEmpty` directly). The path
source is the local worktree filesystem, where the
tolerant wrapper is the right fit: cleanup of
legitimately-tracked shapes like `submodule/.git`
should not abort the whole reset.
The root-level `hfs.go` / `ntfs.go` files held only the
3-line `defaultProtectHFS` / `defaultProtectNTFS` runtime-
policy helpers after the `pathutil` extraction; they fold
into `worktree_fs.go` next to the wrapper that consumes
them. The wrapper-integration tests (`TestValidPath*`,
`TestWorktreeFilesystem*` for HFS / NTFS variants) move
next to the wrapper they exercise in `worktree_fs_test.go`.
Tests pin every chokepoint against the catalogue of attacker
shapes used elsewhere in this branch — final-position `.git`
(`submodule/.git`), NTFS trailing-space / trailing-dot / ADS
(`.git ` / `.git.` / `.git::$INDEX_ALLOCATION`), HFS+ zero-
width `.git` (e.g. `.git`), Windows reserved `CON` /
`NUL`, the 8.3 short alias `git~1`, dot-dot traversal,
control chars, and the empty path. `Index.Add`'s
`TestIndexAdd` is updated for the new signature, and a new
`TestIndexAddRejectsDangerousPaths` pins the index gate.
Assisted-by: Claude Opus 4.7
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
012b2df to
5cfae1f
Compare
Restrict `ValidTreePath`'s NTFS gating to the disguised-`.git` family, dropping the always-on Windows reserved-name check that made go-git refuse trees upstream Git happily reads. In upstream's `verify_path_internal`[1], `is_ntfs_dotgit` runs under `protect_ntfs` (defaulting to 1 on every platform) but `is_valid_win32_path` is compile-time gated to Windows-native and Cygwin builds. Names such as `lib/con.go` are well-formed on non-Windows, so a go-git client on Linux must be able to read trees containing them. Lift `is_ntfs_dotgit` out as `IsNTFSDotGit` rather than keeping the disguise logic fused into `WindowsValidPath`. As a side-effect this closes a gap in the previous implementation: it only recognised the `.git` prefix, so `git~1 ` (trailing space), `git~1.`, and `git~1::ads` slipped past, even though upstream's `is_ntfs_dotgit`[2] also matches the `git~1` short-name prefix. `WindowsValidPath` now composes `IsNTFSDotGit` with the reserved-name table, retaining its existing wrapper-layer contract: bare `.git` and `git~1` are allowed, position-checked by callers. Defence in depth is preserved at the materialisation boundary: `worktreeFilesystem.validPath` still enforces both checks under `core.protectNTFS`, so reserved-name and disguise rejection remain in place when a path is about to hit disk on Windows. Tests in `plumbing/format/index`, `plumbing/object`, and the worktree strict-gate suite are updated to reflect the new contract: reserved names move out of the always-on tree gate, and `git~1` disguise variants are added alongside their `.git` counterparts. [1]: https://github.com/git/git/blob/v2.54.0/read-cache.c#L987-L1048 [2]: https://github.com/git/git/blob/v2.54.0/path.c#L1415-L1449 Assisted-by: Claude Opus 4.7 Signed-off-by: Hidde Beydals <hidde@hhh.computer>
Mirror upstream Git's `PROTECT_NTFS_DEFAULT`, which has been `1` unconditionally since 9102f958ee5 (CVE-2019-1353)[1]. Until now go-git gated the default on `runtime.GOOS == "windows"`, leaving Linux and macOS users without the wrapper-layer `is_ntfs_dotgit` and reserved-name checks unless they explicitly set `core.protectNTFS=true`. The motivating scenario is unchanged from upstream's: WSL mounts Windows drives under `/mnt/`, so a Linux process can reach an NTFS-backed worktree where the `.git` directory is also resolvable as `git~1` (or `.git ` / `.git::$DATA`). Gating the guard on the runtime OS skips that class of attack on the very system where it is reachable. Tree-side gates already catch disguised `.git` regardless of this default — `pathutil.ValidTreePath` is always-on per ce4cca1. This commit closes the parallel gap at the wrapper layer: `worktreeFilesystem.validPath` and `validSymlinkName` now enforce the NTFS rules on non-Windows by default, matching upstream's protect-by-default posture. `PROTECT_HFS_DEFAULT` is left untouched. Upstream chose not to flip that default in 9102f958ee5 (the cost in the cited benchmark was non-trivial and the WSL-equivalent scenario for HFS+ is not realistic), and `defaultProtectHFS` already mirrors that decision via its Darwin-only return. [1]: git/git@9102f958ee5 Assisted-by: Claude Opus 4.7 Signed-off-by: Hidde Beydals <hidde@hhh.computer>
Add a `parseConfigBool` helper that mirrors upstream Git's `git_parse_maybe_bool`: it accepts `true`/`yes`/`on`/`1` and `false`/`no`/`off`/`0` case-insensitively, and returns `OptBoolUnset` for empty or unrecognised values so the caller's platform default stays in place. `unmarshalCore` swaps `strconv.ParseBool` for `parseConfigBool` when reading `core.protectNTFS` and `core.protectHFS`. The previous `strconv.ParseBool` path silently misinterpreted user-friendly syntax: writing `protectNTFS = on` made `strconv.ParseBool` return an error, which left the field at its zero value (`OptBoolUnset`), so on Windows the platform default applied — but a user who wrote `protectNTFS = on` to deliberately enable the protection on Linux would have got the platform default (`false`) instead of the explicit `true` they intended. With the tolerant parser, all of `true` / `yes` / `on` / `1` (and their false counterparts) take effect where the user expects them to. Other booleans in this package keep the loose `== "true"` pattern; aligning them is out of scope. Only the security toggles are upgraded here, where silent misinterpretation has the highest cost. Reference: upstream Git `git_parse_maybe_bool_text` at `parse.c` L157-L173 and `git_parse_maybe_bool` at `parse.c` L174-L182 in tag `v2.54.0`[1]. `git_parse_maybe_bool` is the closer match, since it also accepts integer values via `git_parse_int`. [1]: https://github.com/git/git/blob/v2.54.0/parse.c#L157-L182 Backport of #2097. Assisted-by: Claude Opus 4.7 Signed-off-by: Hidde Beydals <hidde@hhh.computer>
Lift platform-specific dotgit-variant detection out of package `git` into a self-contained package so it can be reached from any caller — `config` (#2079's submodule-name validation), `storage/filesystem/dotgit` (the `Module(name)` containment check), and the tree-side gates added in a follow-up commit — without those callers depending on the root `git` package. The package collects three layers: - HFS+ side: the ignored-codepoint table plus `IsHFSDot` / `IsHFSDotGit` / `IsHFSDotGitmodules` family (zero-width / case-folding aware). Implementations are unchanged from the previous `git`-package versions; only the package boundary moves. - NTFS / Windows side: `IsNTFSDot` (a port of canonical Git's `is_ntfs_dot_generic`), `IsNTFSDotGitmodules`, the `WindowsValidPath` predicate, and the reserved-name table (`CON`, `NUL`, `AUX`, …). The `dotGit = ".git"` constant is declared locally to avoid coupling pathutil to `git.GitDirName`. - Cross-platform helper: `IsDotGitName` matches `.git` and its 8.3 NTFS short alias `git~1` case-insensitively. On top of these primitives sits `ValidTreePath`, the strict validator applied at the boundary where attacker-controlled tree data leaves the trusted store. Where the wrapper-layer `validPath` in package `git` is intentionally tolerant of final-position `.git` (legitimate `submodule/.git` flows in submodule cleanup) and only consults HFS+/NTFS variants when the corresponding `core.protect*` flag is on, `ValidTreePath` is always-strict regardless of runtime config: tree paths are canonical UTF-8 with no zero-width characters or 8.3 short- name forms, so an entry that looks like one is suspicious anywhere. It rejects control characters, empty / `.` / `..` components, Windows volume-name prefixes, `.git` and its HFS+/NTFS variants at every position, and reserved device names. Mirrors upstream Git's `verify_path_internal` at `read-cache.c` L987-L1048 in tag `v2.54.0`[1], stripped of its runtime `protect_hfs` / `protect_ntfs` gating because the pathutil layer is consulted from the strict tree boundary, not from application paths. [1]: https://github.com/git/git/blob/v2.54.0/read-cache.c#L987-L1048 Backport of #2097. Assisted-by: Claude Opus 4.7 Signed-off-by: Hidde Beydals <hidde@hhh.computer>
Apply `pathutil.ValidTreePath` at the chokepoints where tree
data crosses out of the trusted object store and where
application-supplied paths cross into the index. Layered on
top of the existing tolerant wrapper `validPath` in package
`git`, this gives the worktree two layers of protection:
strict validation at the boundary, tolerant validation at
the filesystem edge for legitimate flows (`submodule/.git`
Stat / Remove during submodule cleanup).
Read-side chokepoints in `plumbing/object`:
- `(*Tree).FindEntry` — most callers funnel through here:
`(*Tree).File`, `(*Tree).Tree`, `(*Tree).Size`, and the
`checkoutChange` Modify/Insert branch. A dangerous tree-
derived path is refused at the lookup boundary before
anything materialises.
- `TreeWalker.Next` — drives `transformChildren` (which
feeds `merkletrie.DiffTree`), `FileIter`, and the archive
writers. Each leaf entry name is validated as it
surfaces; a malformed entry stops the walk with the
validator's error rather than skipping silently.
Inspection-only callers that need raw access can still
read `Tree.Entries` directly.
- `(*Tree).TreeEntryFile` — boundary where a `*File` whose
Name a caller can hand to filesystem ops leaves the
store.
Write-side chokepoint in `worktree_status`:
- `doAddFileToIndex` validates the path via
`pathutil.ValidTreePath` before calling `Index.Add`.
Mirrors upstream Git's `verify_path_internal` invocation
from `make_cache_entry` on the index-addition side.
Diverges from upstream by keeping `Index.Add`'s existing
`(*Entry)` signature for v5 API compatibility — the gate
moves to the worktree caller, which is the only in-tree
`Index.Add(path)` call site.
Application-side gates in package `git`:
- The wrapper-level `validPath` continues to gate
filesystem writes; HFS+/NTFS-aware rejection of
`.gitmodules` symlink targets is now driven by the same
`pathutil` predicates so the wrapper and the strict
validator stay aligned. The control-character loop is
byte-oriented for upstream parity with
`verify_path_internal`.
- `Submodule.Repository`'s `Chroot` validates the
submodule's tree-stored Path before scoping the
repository, refusing embedded `.git` / HFS+ / NTFS
variants regardless of `core.protectHFS` /
`core.protectNTFS`.
- `Worktree.checkoutFileSymlink` no longer performs its
own `gitmodulesFile` check — `validSymlinkName` on the
wrapper covers it (and its NTFS / HFS variants).
The root-level `hfs.go` / `ntfs.go` files held only the
3-line `defaultProtectHFS` / `defaultProtectNTFS` runtime-
policy helpers after the `pathutil` extraction; they fold
into `worktree_fs.go` next to the wrapper that consumes
them. The local `windowsValidPath` test in `worktree_test.go`
is dropped — `pathutil.WindowsValidPath` has equivalent
coverage in `internal/pathutil/ntfs_test.go`.
Backport of #2097.
Bundled here are test fixes that surface the new gates:
`change_adaptor_test.go` sets `TreeEntry.Name` so
`TreeEntryFile`'s gate accepts the synthetic entries;
`submodule_test.go` sets `Path` on the synthetic submodules
that previously left it empty, and bypasses the wrapper
when planting the malicious `.gitmodules` symlink (the
read-side detection in `Submodules()` is the layer being
exercised, not the write-side gate).
Assisted-by: Claude Opus 4.7
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
Restrict `ValidTreePath`'s NTFS gating to the disguised-`.git` family, dropping the always-on Windows reserved-name check that made go-git refuse trees upstream Git happily reads. In upstream's `verify_path_internal`[1], `is_ntfs_dotgit` runs under `protect_ntfs` (defaulting to 1 on every platform) but `is_valid_win32_path` is compile-time gated to Windows-native and Cygwin builds. Names such as `lib/con.go` are well-formed on non-Windows, so a go-git client on Linux must be able to read trees containing them. Lift `is_ntfs_dotgit` out as `IsNTFSDotGit` rather than keeping the disguise logic fused into `WindowsValidPath`. As a side-effect this closes a gap in the previous implementation: it only recognised the `.git` prefix, so `git~1 ` (trailing space), `git~1.`, and `git~1::ads` slipped past, even though upstream's `is_ntfs_dotgit`[2] also matches the `git~1` short-name prefix. `WindowsValidPath` now composes `IsNTFSDotGit` with the reserved-name table, retaining its existing wrapper-layer contract: bare `.git` and `git~1` are allowed, position-checked by callers. Defence in depth is preserved at the materialisation boundary: `worktreeFilesystem.validPath` still enforces both checks under `core.protectNTFS`, so reserved-name and disguise rejection remain in place when a path is about to hit disk on Windows. [1]: https://github.com/git/git/blob/v2.54.0/read-cache.c#L987-L1048 [2]: https://github.com/git/git/blob/v2.54.0/path.c#L1415-L1449 Backport of #2097. Assisted-by: Claude Opus 4.7 Signed-off-by: Hidde Beydals <hidde@hhh.computer>
Mirror upstream Git's `PROTECT_NTFS_DEFAULT`, which has been `1` unconditionally since 9102f958ee5 (CVE-2019-1353)[1]. Until now go-git gated the default on `runtime.GOOS == "windows"`, leaving Linux and macOS users without the wrapper-layer `is_ntfs_dotgit` and reserved-name checks unless they explicitly set `core.protectNTFS=true`. The motivating scenario is unchanged from upstream's: WSL mounts Windows drives under `/mnt/`, so a Linux process can reach an NTFS-backed worktree where the `.git` directory is also resolvable as `git~1` (or `.git ` / `.git::$DATA`). Gating the guard on the runtime OS skips that class of attack on the very system where it is reachable. Tree-side gates already catch disguised `.git` regardless of this default — `pathutil.ValidTreePath` is always-on per ce4cca1 (the prior commit). This commit closes the parallel gap at the wrapper layer: `worktreeFilesystem.validPath` and `validSymlinkName` now enforce the NTFS rules on non-Windows by default, matching upstream's protect-by-default posture. `PROTECT_HFS_DEFAULT` is left untouched. Upstream chose not to flip that default in 9102f958ee5 (the cost in the cited benchmark was non-trivial and the WSL-equivalent scenario for HFS+ is not realistic), and `defaultProtectHFS` already mirrors that decision via its Darwin-only return. [1]: git/git@9102f958ee5 Backport of #2097. Assisted-by: Claude Opus 4.7 Signed-off-by: Hidde Beydals <hidde@hhh.computer>
No description provided.