Skip to content

fix: reduce pack verification memory use on Node#2299

Merged
jcubic merged 1 commit into
isomorphic-git:mainfrom
liamhess:liam/pack-memory-usage-5n2
Apr 6, 2026
Merged

fix: reduce pack verification memory use on Node#2299
jcubic merged 1 commit into
isomorphic-git:mainfrom
liamhess:liam/pack-memory-usage-5n2

Conversation

@liamhess

@liamhess liamhess commented Apr 1, 2026

Copy link
Copy Markdown
Contributor

Fixes #2298 by using chunked hashing in node envs.

This would be one idea on how to fix the memory regression for our usage, another idea would be to add a flag in methods like statusMatrix to just 'trust' the repo and then not go down the pack integrity validation code path.

The regression comes from the verification path added in readObjectPacked():

const payload = pack.subarray(0, -20)
await shasum(payload)

That means we load the pack into memory and then hash the entire thing. On very large packs this is much more expensive than it was before 1.37.2, both in runtime and peak RSS.
This PR keeps the same verification behavior, but changes the hashing implementation used for the pack check in Node builds: Node builds use crypto.createHash('sha1') and the payload is fed incrementally in 8 MiB chunks.

What do you think @jcubic ?

Summary by CodeRabbit

  • Performance Improvements

    • Faster, chunked SHA-1 checksum calculation for large payloads in Node.js, improving integrity verification speed and memory usage.
    • Node-targeted builds now use the optimized checksum implementation when running in Node environments.
  • New Features

    • Added a range-capable checksum utility to compute digests over specified slices of binary data.

@coderabbitai

coderabbitai Bot commented Apr 1, 2026

Copy link
Copy Markdown

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 7e86035e-b5e6-4dae-a23b-074418711b39

📥 Commits

Reviewing files that changed from the base of the PR and between 8e30814 and e805ed9.

📒 Files selected for processing (4)
  • rollup.config.js
  • src/storage/readObjectPacked.js
  • src/utils/shasumRange.js
  • src/utils/shasumRange.node.js
✅ Files skipped from review due to trivial changes (2)
  • src/utils/shasumRange.js
  • src/utils/shasumRange.node.js
🚧 Files skipped from review as they are similar to previous changes (2)
  • rollup.config.js
  • src/storage/readObjectPacked.js

📝 Walkthrough

Walkthrough

Replaces direct full-buffer SHA-1 hashing with a range-based, chunked hashing utility and wires a Node-specific implementation into the Node Rollup build via an alias; updates packfile integrity check to use the chunked hash over the payload range.

Changes

Cohort / File(s) Summary
Build Configuration
rollup.config.js
Added an alias plugin and registered alias(nodeAliases) for the Node build to map src/utils/shasumRange.jssrc/utils/shasumRange.node.js.
Utils: shasumRange
src/utils/shasumRange.js, src/utils/shasumRange.node.js
Added shasumRange(buffer, {start,end}). Generic version calls shasum on a slice; Node variant hashes the buffer in 8 MiB chunks and returns hex digest.
Storage integration
src/storage/readObjectPacked.js
Replaced shasum(pack.subarray(0, -20)) with shasumRange(pack, { start: 0, end: pack.length - 20 }) for the pack payload integrity check.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

Suggested labels

released

Poem

🐰 I hop through bytes in tiny bands,
Chomping chunks with careful hands.
No more bloat, no memory fright,
I munch in eights and sleep at night. ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'fix: reduce pack verification memory use on Node' clearly describes the main change—implementing a Node-specific memory optimization for pack verification using chunked hashing instead of loading the entire pack into memory.
Linked Issues check ✅ Passed The PR directly addresses issue #2298 by implementing chunked hashing in Node environments, reducing memory usage during pack verification from ~2.49 GB to restored levels, matching the stated objective.
Out of Scope Changes check ✅ Passed All changes are directly scoped to reducing pack verification memory use: Rollup alias plugin for Node builds, new shasumRange utilities, and updated readObjectPacked to use the chunked approach.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
rollup.config.js (1)

50-50: Consider applying the alias to ecmaConfig for Node ES module users.

The alias is only applied to nodeConfig (CJS). Users consuming the ES module build (index.js) on Node.js won't benefit from the chunked hashing—they'll get the generic implementation that doesn't chunk.

If Node ES module usage is common, consider also adding plugins: [alias(nodeAliases)] to ecmaConfig, or documenting that the memory optimization only applies to the CJS entry point.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rollup.config.js` at line 50, The rollup config currently applies
alias(nodeAliases) only to nodeConfig, so the ecmaConfig build doesn't get the
alias and ES module consumers won't get the chunked hashing optimization; update
ecmaConfig to include plugins: [alias(nodeAliases)] (same as nodeConfig) so
aliasing is applied to the ES module entry, and verify the alias import behavior
for ecmaConfig; alternatively, if you prefer not to change builds, add a note in
the documentation clarifying that the memory/chunking optimization only applies
to nodeConfig (CJS).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@rollup.config.js`:
- Line 50: The rollup config currently applies alias(nodeAliases) only to
nodeConfig, so the ecmaConfig build doesn't get the alias and ES module
consumers won't get the chunked hashing optimization; update ecmaConfig to
include plugins: [alias(nodeAliases)] (same as nodeConfig) so aliasing is
applied to the ES module entry, and verify the alias import behavior for
ecmaConfig; alternatively, if you prefer not to change builds, add a note in the
documentation clarifying that the memory/chunking optimization only applies to
nodeConfig (CJS).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: ebf0f4e8-3a6e-4ad0-b4c4-509c957c6c25

📥 Commits

Reviewing files that changed from the base of the PR and between ecaf4d4 and 8e30814.

📒 Files selected for processing (4)
  • rollup.config.js
  • src/storage/readObjectPacked.js
  • src/utils/shasumRange.js
  • src/utils/shasumRange.node.js

@jcubic

jcubic commented Apr 1, 2026

Copy link
Copy Markdown
Member

Can you explain exactly what is causing the memory increase and how this solves the issue?

@jcubic

jcubic commented Apr 1, 2026

Copy link
Copy Markdown
Member

I'm not sure what happened, but the tests failed.

Azure_logs.txt

@liamhess

liamhess commented Apr 2, 2026

Copy link
Copy Markdown
Contributor Author

Can you explain exactly what is causing the memory increase and how this solves the issue?

The pack verification code does this:

const payload = pack.subarray(0, -20)
await shasum(payload)

Which keeps the whole pack in memory and hashes it. On large packs that is super expensive .

This PR changes the hashing for node builds where we can use node's crypto to incrementally hash 8 MiB chunks instead of the entire thing at once.

Behavior stays the same but is much more performant.

@liamhess

liamhess commented Apr 2, 2026

Copy link
Copy Markdown
Contributor Author

I'm not sure what happened, but the tests failed.

Azure_logs.txt

Tbh not sure either, 401 in remotes looks unrelated to my changes and maybe an issue with the creds?

FAIL __tests__/test-hosting-providers.js
1712  ● Hosting Providers › GitHub › push
1713
1714    HttpError: HTTP Error: 401 Unauthorized
1715
1716      142 |     if (res.
1717statusCode !== 200) {
1718      143 |       const { response } = await stringifyBody(res)
1719    > 144 |       throw new HttpError(res.statusCode, res.statusMessage, response)
1720          |             ^
1721      145 |     }
1722      146 |     // Git "smart" HTTP servers should respond with the correct Content-Type header.
1723      147 |     if (
1724
1725      at Function.discover (src/managers/GitRemoteHTTP.js:144:13)
1726      at _push (src/commands/push.js:106:22)
1727      at push (src/api/push.js:86:12)
1728      at Object.<anonymous> (__tests__/test-hosting-providers.js:175:19)

@jcubic jcubic force-pushed the liam/pack-memory-usage-5n2 branch from 8e30814 to e805ed9 Compare April 6, 2026 20:46
@jcubic jcubic merged commit 846a9f9 into isomorphic-git:main Apr 6, 2026
4 checks passed
@jcubic

jcubic commented Apr 6, 2026

Copy link
Copy Markdown
Member

🎉 This PR is included in version 1.37.5 🎉

The release is available on:

Your semantic-release bot 📦🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Memory usage regression in v.1.37.2

2 participants