plumbing: format/idxfile, Validate offset64 indices#2083
Merged
Conversation
When a 32-bit offset entry has its MSB set, the lower 31 bits index into the `Offset64` table. Neither `MemoryIndex.getOffset` nor `LazyIndex.offset` validated that the resulting position fell inside the decoded table; an idx with an MSB-set entry whose lower bits exceed the number of decoded 64-bit entries would index past the end of the table. Bounds-check both lookups against the decoded table size and return `ErrMalformedIdxFile` when the index is out of range. `MemoryIndex`'s unexported `getOffset` now returns `(uint64, error)`; the change propagates through `FindOffset`, `genOffsetHash`, and the `Entries` iterator. `LazyIndex` caches the 64-bit-entry count discovered during `init` and uses it for the same check before issuing the `ReadAt`. Assisted-by: Claude Opus 4.7 Signed-off-by: Hidde Beydals <hidde@hhh.computer>
hiddeco
added a commit
to hiddeco/go-git
that referenced
this pull request
May 7, 2026
When a 32-bit offset entry has its MSB set, the lower 31 bits index into the `Offset64` table. `MemoryIndex.getOffset` did not validate that the resulting position fell inside the decoded table; an idx with an MSB-set entry whose lower bits exceed the number of decoded 64-bit entries would index past the end of the table. Bounds-check the lookup against the decoded table size and return `ErrMalformedIdxFile` when the index is out of range. The unexported `getOffset` now returns `(uint64, error)`; the change propagates through `FindOffset`, `genOffsetHash`, and the `Entries` iterator. Backport of go-git#2083. Assisted-by: Claude Opus 4.7 Signed-off-by: Hidde Beydals <hidde@hhh.computer>
hiddeco
added a commit
to hiddeco/go-git
that referenced
this pull request
May 7, 2026
The previous bounds check `offset+8 > uint64(len(idx.Offset64))` was correct because masking and the `uint32` source type bound `offset` to ~2^34, leaving uint64 headroom. That safety relies on invariants in surrounding code (mask shape and source type) that aren't visible at the comparison itself. Reformulate as `len < 8 || offset > len-8`, which holds without relying on the upstream bound. Behaviour is unchanged. Backport of go-git#2083. Assisted-by: Claude Opus 4.7 <noreply@anthropic.com> Signed-off-by: Hidde Beydals <hidde@hhh.computer>
`FuzzLazyIndex` covers one of the two parallel idx implementations. The companion `MemoryIndex` (decoder + post-decode `Index` methods + both iterators) had no fuzz coverage, which is why the offset64 bounds gap fixed in e429f8e was not surfaced earlier. `FuzzMemoryIndex` mirrors the structure of `FuzzLazyIndex`: seed the corpus with structurally valid prefixes (using `buildMinimalIdx`), the empty input, and a malformed-but-decodable idx from `buildOOBOffset64Idx`; the fuzz function then runs `Decode` and exercises every `Index` method plus both iterators on inputs that decode successfully. `buildOOBOffset64Idx` lives in `fuzz_helpers.go` (alongside `buildMinimalIdx`) rather than `_test.go`, because the OSS-Fuzz harness extracts fuzz targets into a non-test build context where sibling `_test.go` symbols are not resolved. A 30s local run produced ~2.25M executions and 99 corpus inputs without failure on top of the validated tree. Assisted-by: Claude Opus 4.7 Signed-off-by: Hidde Beydals <hidde@hhh.computer>
59adfbd to
9a2328a
Compare
The previous bounds check `offset+8 > uint64(len(idx.Offset64))` was correct because masking and the `uint32` source type bound `offset` to ~2^34, leaving uint64 headroom. That safety relies on invariants in surrounding code (mask shape and source type) that aren't visible at the comparison itself. Reformulate as `len < 8 || offset > len-8`, which holds without relying on the upstream bound. Behaviour is unchanged. `LazyIndex.offset` already uses an index-vs-count comparison (`loIndex >= s.count64`) with no addition, so it needs no analogous change. Assisted-by: Claude Opus 4.7 Signed-off-by: Hidde Beydals <hidde@hhh.computer>
9a2328a to
d40332d
Compare
hiddeco
added a commit
to hiddeco/go-git
that referenced
this pull request
May 7, 2026
The previous bounds check `offset+8 > uint64(len(idx.Offset64))` was correct because masking and the `uint32` source type bound `offset` to ~2^34, leaving uint64 headroom. That safety relies on invariants in surrounding code (mask shape and source type) that aren't visible at the comparison itself. Reformulate as `len < 8 || offset > len-8`, which holds without relying on the upstream bound. Behaviour is unchanged. Backport of go-git#2083. Assisted-by: Claude Opus 4.7 <noreply@anthropic.com> Signed-off-by: Hidde Beydals <hidde@hhh.computer>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
When a 32-bit offset entry has its MSB set, the lower 31 bits index into the
Offset64table. NeitherMemoryIndex.getOffsetnorLazyIndex.offsetvalidated that the resulting position fell inside the decoded table; an idx with an MSB-set entry whose lower bits exceed the number of decoded 64-bit entries would index past the end of the table.Bounds-check both lookups against the decoded table size and return
ErrMalformedIdxFilewhen the index is out of range.MemoryIndex's unexportedgetOffsetnow returns(uint64, error); the change propagates throughFindOffset,genOffsetHash, and theEntriesiterator.LazyIndexcaches the 64-bit-entry count discovered during init and uses it for the same check before issuing theReadAt.