Skip to content

plumbing: format/idxfile, Validate offset64 indices#2083

Merged
pjbgf merged 3 commits into
go-git:mainfrom
hiddeco:idxfile-validate-offset64
May 7, 2026
Merged

plumbing: format/idxfile, Validate offset64 indices#2083
pjbgf merged 3 commits into
go-git:mainfrom
hiddeco:idxfile-validate-offset64

Conversation

@hiddeco

@hiddeco hiddeco commented May 7, 2026

Copy link
Copy Markdown
Member

When a 32-bit offset entry has its MSB set, the lower 31 bits index into the Offset64 table. Neither MemoryIndex.getOffset nor LazyIndex.offset validated that the resulting position fell inside the decoded table; an idx with an MSB-set entry whose lower bits exceed the number of decoded 64-bit entries would index past the end of the table.

Bounds-check both lookups against the decoded table size and return ErrMalformedIdxFile when the index is out of range. MemoryIndex's unexported getOffset now returns (uint64, error); the change propagates through FindOffset, genOffsetHash, and the Entries iterator. LazyIndex caches the 64-bit-entry count discovered during init and uses it for the same check before issuing the ReadAt.

When a 32-bit offset entry has its MSB set, the lower 31 bits index
into the `Offset64` table. Neither `MemoryIndex.getOffset` nor
`LazyIndex.offset` validated that the resulting position fell inside
the decoded table; an idx with an MSB-set entry whose lower bits
exceed the number of decoded 64-bit entries would index past the
end of the table.

Bounds-check both lookups against the decoded table size and return
`ErrMalformedIdxFile` when the index is out of range. `MemoryIndex`'s
unexported `getOffset` now returns `(uint64, error)`; the change
propagates through `FindOffset`, `genOffsetHash`, and the `Entries`
iterator. `LazyIndex` caches the 64-bit-entry count discovered during
`init` and uses it for the same check before issuing the `ReadAt`.

Assisted-by: Claude Opus 4.7
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
hiddeco added a commit to hiddeco/go-git that referenced this pull request May 7, 2026
When a 32-bit offset entry has its MSB set, the lower 31 bits index
into the `Offset64` table. `MemoryIndex.getOffset` did not validate
that the resulting position fell inside the decoded table; an idx
with an MSB-set entry whose lower bits exceed the number of decoded
64-bit entries would index past the end of the table.

Bounds-check the lookup against the decoded table size and return
`ErrMalformedIdxFile` when the index is out of range. The unexported
`getOffset` now returns `(uint64, error)`; the change propagates
through `FindOffset`, `genOffsetHash`, and the `Entries` iterator.

Backport of go-git#2083.

Assisted-by: Claude Opus 4.7
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
hiddeco added a commit to hiddeco/go-git that referenced this pull request May 7, 2026
The previous bounds check `offset+8 > uint64(len(idx.Offset64))`
was correct because masking and the `uint32` source type bound
`offset` to ~2^34, leaving uint64 headroom. That safety relies on
invariants in surrounding code (mask shape and source type) that
aren't visible at the comparison itself.

Reformulate as `len < 8 || offset > len-8`, which holds without
relying on the upstream bound. Behaviour is unchanged.

Backport of go-git#2083.

Assisted-by: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
`FuzzLazyIndex` covers one of the two parallel idx implementations.
The companion `MemoryIndex` (decoder + post-decode `Index` methods +
both iterators) had no fuzz coverage, which is why the offset64
bounds gap fixed in e429f8e was not surfaced earlier.

`FuzzMemoryIndex` mirrors the structure of `FuzzLazyIndex`: seed the
corpus with structurally valid prefixes (using `buildMinimalIdx`),
the empty input, and a malformed-but-decodable idx from
`buildOOBOffset64Idx`; the fuzz function then runs `Decode` and
exercises every `Index` method plus both iterators on inputs that
decode successfully.

`buildOOBOffset64Idx` lives in `fuzz_helpers.go` (alongside
`buildMinimalIdx`) rather than `_test.go`, because the OSS-Fuzz
harness extracts fuzz targets into a non-test build context where
sibling `_test.go` symbols are not resolved.

A 30s local run produced ~2.25M executions and 99 corpus inputs
without failure on top of the validated tree.

Assisted-by: Claude Opus 4.7
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
@hiddeco hiddeco force-pushed the idxfile-validate-offset64 branch from 59adfbd to 9a2328a Compare May 7, 2026 16:28
The previous bounds check `offset+8 > uint64(len(idx.Offset64))`
was correct because masking and the `uint32` source type bound
`offset` to ~2^34, leaving uint64 headroom. That safety relies on
invariants in surrounding code (mask shape and source type) that
aren't visible at the comparison itself.

Reformulate as `len < 8 || offset > len-8`, which holds without
relying on the upstream bound. Behaviour is unchanged.

`LazyIndex.offset` already uses an index-vs-count comparison
(`loIndex >= s.count64`) with no addition, so it needs no
analogous change.

Assisted-by: Claude Opus 4.7
Signed-off-by: Hidde Beydals <hidde@hhh.computer>
@hiddeco hiddeco force-pushed the idxfile-validate-offset64 branch from 9a2328a to d40332d Compare May 7, 2026 16:31
hiddeco added a commit to hiddeco/go-git that referenced this pull request May 7, 2026
The previous bounds check `offset+8 > uint64(len(idx.Offset64))`
was correct because masking and the `uint32` source type bound
`offset` to ~2^34, leaving uint64 headroom. That safety relies on
invariants in surrounding code (mask shape and source type) that
aren't visible at the comparison itself.

Reformulate as `len < 8 || offset > len-8`, which holds without
relying on the upstream bound. Behaviour is unchanged.

Backport of go-git#2083.

Assisted-by: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Hidde Beydals <hidde@hhh.computer>

@pjbgf pjbgf left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hiddeco thanks for working on this. 🙇

@pjbgf pjbgf merged commit 9a42882 into go-git:main May 7, 2026
17 checks passed
@hiddeco hiddeco deleted the idxfile-validate-offset64 branch May 7, 2026 20:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants