fix: add packfile integrity validation (fixes #2288)#2289
Conversation
Add SHA-1 checksum verification when reading objects from packfiles to prevent accepting corrupted or tampered packfiles. Changes: - Fast check: Verify packfile trailer matches .idx record - Deep check: Calculate SHA-1 of packfile payload - Use subarray() for zero-copy memory efficiency - Use _checksumVerified flag to verify only once per packfile Fixes vulnerability where corrupted packfiles were accepted without detecting checksum mismatches.
📝 WalkthroughWalkthroughAdds packfile integrity verification (trailer quick-check and payload deep-check) to pack reads with a per-pack Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant readObjectPacked as readObjectPacked()
participant Packfile as Packfile(storage)
participant Shasum as shasum
Client->>readObjectPacked: request object by OID
readObjectPacked->>Packfile: open packfile (shared cache)
readObjectPacked->>Packfile: read trailer (last 20 bytes)
Packfile-->>readObjectPacked: trailer bytes
readObjectPacked->>Shasum: compute hex SHA of trailer
Shasum-->>readObjectPacked: trailer SHA
alt trailer SHA matches index
readObjectPacked->>Packfile: read payload (excluding trailer)
readObjectPacked->>Shasum: compute payload SHA
Shasum-->>readObjectPacked: payload SHA
alt payload SHA matches index
readObjectPacked->>Packfile: mark _checksumVerified = true
readObjectPacked->>Packfile: proceed to read object data
Packfile-->>readObjectPacked: object bytes
readObjectPacked-->>Client: return object
else payload mismatch
readObjectPacked-->>Client: throw InternalError("Packfile payload corrupted")
end
else trailer mismatch
readObjectPacked-->>Client: throw InternalError("Packfile trailer mismatch")
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~18 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
src/storage/readObjectPacked.js (1)
48-51:InternalErroris semantically misleading for data-integrity failures.
InternalError's boilerplate message ("If you're not a developer, report the bug to the developers of the application you're using … please file a bug report at isomorphic-git/issues") describes a library defect, not external data corruption or tampering. A user whose packfile is genuinely corrupted would receive a confusing directive to file a bug against isomorphic-git.This follows the existing pattern at Line 26 (
if (p.error) throw new InternalError(p.error)), so changing it is low-priority. Longer term, a dedicatedPackfileIntegrityError(or similar) would allow callers to distinguish a library bug from user-data integrity failures.Also applies to: 59-62
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/storage/readObjectPacked.js` around lines 48 - 51, The code currently throws InternalError for packfile integrity failures which is misleading; define a new PackfileIntegrityError (e.g., class PackfileIntegrityError extends Error) and export it, then replace the two throws of new InternalError(...) in readObjectPacked.js (the pack trailer mismatch throw and the similar throw around lines 59-62) with throws of new PackfileIntegrityError(...) preserving the original message text; also update any local imports/exports so callers can catch PackfileIntegrityError specifically.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/storage/readObjectPacked.js`:
- Around line 36-66: Add unit tests exercising the new integrity verification in
readObjectPacked.js: write tests that (1) build a valid pack buffer so the block
passes and p._checksumVerified becomes true, (2) create a pack buffer with a
modified 20-byte trailer to trigger the fast-check InternalError (packTrailer
mismatch), (3) create a buffer with a modified payload but intact trailer to
trigger the deep-check InternalError (actualPayloadSha mismatch), and (4)
simulate a second read of the same pack to assert p._checksumVerified prevents
re-running shasum (mock or spy shasum to ensure it is not called on the second
call). In each test assert the exact InternalError messages from the thrown
errors (or error type) and reference the symbols p._checksumVerified,
packfileSha, pack.subarray and shasum when locating the code to exercise and to
mock.
- Around line 38-39: Remove the unnecessary null guard around p.packfileSha in
readObjectPacked.js (the check that tests for packfileSha being null before
using it) because GitPackIndex.fromIdx() and GitPackIndex.fromPack() both always
set packfileSha; simply reference p.packfileSha directly in the checksum
verification block (the fast-check/deep-check logic inside readObjectPacked.js)
and drop the defensive branch. Additionally, add unit tests exercising the
checksum verification logic: one test with a valid packfile that passes both
fast and deep checks, one that corrupts the packfile trailer to trigger the
fast-check failure, and one that corrupts the payload to trigger the deep-check
failure, referencing the same pack/index test fixtures used by GitPackIndex to
ensure packfileSha is present.
---
Nitpick comments:
In `@src/storage/readObjectPacked.js`:
- Around line 48-51: The code currently throws InternalError for packfile
integrity failures which is misleading; define a new PackfileIntegrityError
(e.g., class PackfileIntegrityError extends Error) and export it, then replace
the two throws of new InternalError(...) in readObjectPacked.js (the pack
trailer mismatch throw and the similar throw around lines 59-62) with throws of
new PackfileIntegrityError(...) preserving the original message text; also
update any local imports/exports so callers can catch PackfileIntegrityError
specifically.
| // === Packfile Integrity Verification === | ||
| // Performance optimization: use _checksumVerified flag to verify only once per packfile | ||
| if (!p._checksumVerified) { | ||
| const expectedShaFromIndex = p.packfileSha | ||
|
|
||
| // 1. Fast Check: Verify packfile trailer matches index record | ||
| // Use subarray instead of slice to avoid memory copy (zero-copy for large packfiles) | ||
| const packTrailer = pack.subarray(-20) | ||
| const packTrailerSha = Array.from(packTrailer) | ||
| .map(b => b.toString(16).padStart(2, '0')) | ||
| .join('') | ||
| if (packTrailerSha !== expectedShaFromIndex) { | ||
| throw new InternalError( | ||
| `Packfile trailer mismatch: expected ${expectedShaFromIndex}, got ${packTrailerSha}. The packfile may be corrupted.` | ||
| ) | ||
| } | ||
|
|
||
| // 2. Deep Integrity Check: Calculate actual SHA-1 of packfile payload | ||
| // This ensures true data integrity by verifying the entire packfile content | ||
| // Use subarray for zero-copy reading of large files | ||
| const payload = pack.subarray(0, -20) | ||
| const actualPayloadSha = await shasum(payload) | ||
| if (actualPayloadSha !== expectedShaFromIndex) { | ||
| throw new InternalError( | ||
| `Packfile payload corrupted: calculated ${actualPayloadSha} but expected ${expectedShaFromIndex}. The packfile may have been tampered with.` | ||
| ) | ||
| } | ||
|
|
||
| // Mark as verified to prevent performance regression on subsequent reads | ||
| p._checksumVerified = true | ||
| } |
There was a problem hiding this comment.
No automated tests added for the integrity verification logic.
The verification block covers two distinct failure modes (tampered trailer, corrupted payload) and one optimization path (_checksumVerified). These are not exercised by any test file in this PR. For a security-sensitive change, tests that construct:
- A valid packfile → verification passes.
- A packfile with a modified trailer → fast-check failure.
- A packfile with modified payload but intact trailer → deep-check failure.
- A second read of the same pack → verifies
_checksumVerifiedskips re-hashing.
would be strongly recommended before merging.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/storage/readObjectPacked.js` around lines 36 - 66, Add unit tests
exercising the new integrity verification in readObjectPacked.js: write tests
that (1) build a valid pack buffer so the block passes and p._checksumVerified
becomes true, (2) create a pack buffer with a modified 20-byte trailer to
trigger the fast-check InternalError (packTrailer mismatch), (3) create a buffer
with a modified payload but intact trailer to trigger the deep-check
InternalError (actualPayloadSha mismatch), and (4) simulate a second read of the
same pack to assert p._checksumVerified prevents re-running shasum (mock or spy
shasum to ensure it is not called on the second call). In each test assert the
exact InternalError messages from the thrown errors (or error type) and
reference the symbols p._checksumVerified, packfileSha, pack.subarray and shasum
when locating the code to exercise and to mock.
| if (!p._checksumVerified) { | ||
| const expectedShaFromIndex = p.packfileSha |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, let's find and examine the readPackIndex implementation
find . -name "readPackIndex.js" -o -name "*readPackIndex*" | head -20Repository: isomorphic-git/isomorphic-git
Length of output: 103
🏁 Script executed:
# Check the shasum utility implementation
find . -name "shasum.js" -o -path "*/utils/shasum*" | head -20Repository: isomorphic-git/isomorphic-git
Length of output: 94
🏁 Script executed:
# Search for how packfileSha is assigned throughout the codebase
rg -n "packfileSha\s*=" --type=js -B 2 -A 2Repository: isomorphic-git/isomorphic-git
Length of output: 1523
🏁 Script executed:
# Look for test files related to readObjectPacked
find . -name "*test*" -o -name "*spec*" | xargs grep -l "readObjectPacked\|packfileSha" 2>/dev/null | head -10Repository: isomorphic-git/isomorphic-git
Length of output: 228
🏁 Script executed:
# Examine readPackIndex to understand how packfileSha is populated
cat -n src/storage/readPackIndex.jsRepository: isomorphic-git/isomorphic-git
Length of output: 1122
🏁 Script executed:
# Examine shasum implementation
cat -n src/utils/shasum.jsRepository: isomorphic-git/isomorphic-git
Length of output: 1381
🏁 Script executed:
# Check GitPackIndex.js to understand structure
cat -n src/models/GitPackIndex.jsRepository: isomorphic-git/isomorphic-git
Length of output: 12954
🏁 Script executed:
# Now let's look at the actual readObjectPacked.js file where the issue is
cat -n src/storage/readObjectPacked.jsRepository: isomorphic-git/isomorphic-git
Length of output: 3453
🏁 Script executed:
# Let's also check if there are any tests specifically for readObjectPacked verification
rg -n "readObjectPacked\|_checksumVerified\|packfileSha" __tests__/ -A 3 -B 3Repository: isomorphic-git/isomorphic-git
Length of output: 55
Remove the packfileSha null guard — it's unnecessary; packfileSha is always populated by GitPackIndex.fromIdx() and GitPackIndex.fromPack().
Both code paths in GitPackIndex (lines 84 and 108 of src/models/GitPackIndex.js) unconditionally assign packfileSha before constructing the instance. If the index object reaches line 39 of readObjectPacked.js, the property is guaranteed to exist. The defensive guard is not needed.
Consider adding test coverage for the new checksum verification logic.
The fast-check and deep-check integrity verification (lines 38–66) is security-critical but has no accompanying tests. Add test cases covering:
- Valid packfile that passes both checks
- Trailer corruption detected by fast check
- Payload corruption detected by deep check
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/storage/readObjectPacked.js` around lines 38 - 39, Remove the unnecessary
null guard around p.packfileSha in readObjectPacked.js (the check that tests for
packfileSha being null before using it) because GitPackIndex.fromIdx() and
GitPackIndex.fromPack() both always set packfileSha; simply reference
p.packfileSha directly in the checksum verification block (the
fast-check/deep-check logic inside readObjectPacked.js) and drop the defensive
branch. Additionally, add unit tests exercising the checksum verification logic:
one test with a valid packfile that passes both fast and deep checks, one that
corrupts the packfile trailer to trigger the fast-check failure, and one that
corrupts the payload to trigger the deep-check failure, referencing the same
pack/index test fixtures used by GitPackIndex to ensure packfileSha is present.
|
As CodeRabbit said, please add some unit tests to verify that this works and throw an error when the pack file is not valid. |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
__tests__/test-packfileIntegrity.js (1)
109-136: Test doesn't actually assert single-verification behaviorThe test name and inline comment claim to verify that
_checksumVerifiedprevents re-verification, but the body only confirms that two consecutive reads both succeed and return the sameformat. It passes even if verification runs twice (or zero times). To make this meaningful, the test should inspect the_checksumVerifiedflag on the cached pack entry directly after the first read, or spy on the underlying SHA computation.♻️ Suggested approach to directly assert the flag
// First read - should verify and set the flag const obj1 = await readObjectPacked({ fs, cache, gitdir, oid }) expect(obj1).not.toBeNull() + // Confirm the flag was set after the first read + const packEntries = Object.values(cache) + const packEntry = packEntries.find(p => p._checksumVerified !== undefined) + expect(packEntry).toBeDefined() + expect(packEntry._checksumVerified).toBe(true) // Second read - should use cached verification const obj2 = await readObjectPacked({ fs, cache, gitdir, oid }) expect(obj2).not.toBeNull() expect(obj2.format).toBe(obj1.format)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@__tests__/test-packfileIntegrity.js` around lines 109 - 136, The test currently only checks that two reads succeed but doesn't assert that the pack entry's _checksumVerified flag was set after the first read; update the test to inspect the cache/pack index entry after the first read and assert that its _checksumVerified (or equivalent) property is true (e.g., locate the cached pack entry via the same cache structure used by readObjectPacked and assert entry._checksumVerified === true), then perform the second read to ensure it still returns the same object; alternatively, spy/mock the SHA verification function that readObjectPacked uses and assert it was called only once.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@__tests__/test-packfileIntegrity.js`:
- Around line 43-44: Guard against files.find returning undefined: after
computing packFile via files.find(f => f.endsWith('.pack')), check if packFile
is falsy and throw or assert with a clear message (e.g., "No .pack file found in
packDir") before building packPath or calling fs.read/fs.write; update the same
pattern for the second occurrence around the code using packFile/packPath at the
lines referenced (81–83) so the test fails with a clear setup error instead of
constructing a path with "undefined".
---
Nitpick comments:
In `@__tests__/test-packfileIntegrity.js`:
- Around line 109-136: The test currently only checks that two reads succeed but
doesn't assert that the pack entry's _checksumVerified flag was set after the
first read; update the test to inspect the cache/pack index entry after the
first read and assert that its _checksumVerified (or equivalent) property is
true (e.g., locate the cached pack entry via the same cache structure used by
readObjectPacked and assert entry._checksumVerified === true), then perform the
second read to ensure it still returns the same object; alternatively, spy/mock
the SHA verification function that readObjectPacked uses and assert it was
called only once.
Throw clear error when no .pack file is found instead of producing cryptic 'undefined' path errors. Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
There was a problem hiding this comment.
🧹 Nitpick comments (2)
__tests__/test-packfileIntegrity.js (2)
115-142: Test doesn't actually assert the "only once" invariantThe test description claims to verify that
_checksumVerifiedprevents re-verification, but the assertions (obj !== null,obj2.format === obj1.format) would pass even if checksum verification ran on every read. The test only confirms two reads succeed — not that the second read skips the checksum.To make this test meaningful, inspect the cached index entry's flag or spy on
shasum:♻️ Suggested stronger assertion (inspect cache flag)
+ // Reach into the cache to confirm the _checksumVerified flag was set + // readPackIndex caches by filename; find the entry that has the flag + const cacheEntry = Object.values(cache).find( + entry => entry && typeof entry._checksumVerified !== 'undefined' + ) + expect(cacheEntry).toBeDefined() + expect(cacheEntry._checksumVerified).toBe(true)Alternatively, wrap
shasumwith a spy and assert it was called exactly once across both reads.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@__tests__/test-packfileIntegrity.js` around lines 115 - 142, The test for readObjectPacked falsely only checks that two reads succeed instead of asserting that checksum verification ran only once; update the test to either (A) inspect the pack index/cache entry after the first read and assert the `_checksumVerified` flag (or equivalent flag) is set and still set after the second read, referencing the same oid used in readObjectPacked, or (B) spy/mock the hashing function (e.g., shasum) used by readObjectPacked and assert it was invoked exactly once across both reads; modify the test around the two readObjectPacked calls to check the `_checksumVerified` flag on the cached index entry or add a spy on shasum and assert call count equals 1.
55-69: Consider Jest's.rejectsmatcher for cleaner error assertionsThe manual
try/catch+error = eaccumulation pattern works but is more verbose than necessary. Since the same structure repeats in Test 3 (lines 98–112), a small helper or Jest's native matchers would reduce noise.♻️ Suggested alternative using
expect(...).rejects- let error = null - try { - await readObjectPacked({ - fs, - cache, - gitdir, - oid, - }) - } catch (e) { - error = e - } - - expect(error).not.toBeNull() - expect(error instanceof Errors.InternalError).toBe(true) - expect(error.data.message).toContain('Packfile trailer mismatch') + const promise = readObjectPacked({ fs, cache, gitdir, oid }) + await expect(promise).rejects.toBeInstanceOf(Errors.InternalError) + await expect(promise).rejects.toMatchObject({ + data: { message: expect.stringContaining('Packfile trailer mismatch') }, + })Note:
promisewould be evaluated twice (both.rejectscalls), so the secondawait expect(promise)re-uses the same rejected promise — this is safe in JavaScript as a rejectedPromisestays rejected.Apply the same refactor to lines 98–112.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@__tests__/test-packfileIntegrity.js` around lines 55 - 69, Replace the manual try/catch pattern around readObjectPacked({fs, cache, gitdir, oid}) with Jest's promise matchers: call await expect(readObjectPacked({fs, cache, gitdir, oid})).rejects.toBeInstanceOf(Errors.InternalError) and then await expect(readObjectPacked({fs, cache, gitdir, oid})).rejects.toMatchObject({ data: { message: expect.stringContaining('Packfile trailer mismatch') } }); apply the same change to the second occurrence of the try/catch in the file (the block covering the second test). Ensure you import or reference Errors and readObjectPacked the same way they are currently used.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@__tests__/test-packfileIntegrity.js`:
- Around line 115-142: The test for readObjectPacked falsely only checks that
two reads succeed instead of asserting that checksum verification ran only once;
update the test to either (A) inspect the pack index/cache entry after the first
read and assert the `_checksumVerified` flag (or equivalent flag) is set and
still set after the second read, referencing the same oid used in
readObjectPacked, or (B) spy/mock the hashing function (e.g., shasum) used by
readObjectPacked and assert it was invoked exactly once across both reads;
modify the test around the two readObjectPacked calls to check the
`_checksumVerified` flag on the cached index entry or add a spy on shasum and
assert call count equals 1.
- Around line 55-69: Replace the manual try/catch pattern around
readObjectPacked({fs, cache, gitdir, oid}) with Jest's promise matchers: call
await expect(readObjectPacked({fs, cache, gitdir,
oid})).rejects.toBeInstanceOf(Errors.InternalError) and then await
expect(readObjectPacked({fs, cache, gitdir, oid})).rejects.toMatchObject({ data:
{ message: expect.stringContaining('Packfile trailer mismatch') } }); apply the
same change to the second occurrence of the try/catch in the file (the block
covering the second test). Ensure you import or reference Errors and
readObjectPacked the same way they are currently used.
|
🎉 This PR is included in version 1.37.2 🎉 The release is available on: Your semantic-release bot 📦🚀 |
Fixes #2288
Description
Added SHA-1 checksum verification when reading objects from packfiles to prevent accepting corrupted or tampered packfiles. This ensures parity with
git-nativebehavior.Implementation Details:
.idxrecord..subarray()instead of.slice()for memory efficiency to avoid OOM on large packfiles.Array.from().map()for Hex conversion to strictly maintain browser compatibility (noBufferdependency)._checksumVerifiedflag to verify only once per packfile lifecycle.Click to see local verification results (vs git-native)
I'm fixing a bug or typo
npm run add-contributorand follow the prompts to add yourself to the READMESummary by CodeRabbit
New Features
Tests
Chores