This repository was archived by the owner on Feb 12, 2024. It is now read-only.
fix: Do not load all of a DAG into memory when pinning#2372
Merged
Conversation
Given a `CID`, the `dag._recursiveGet` method returns a list of all descendants of the node with the passed `CID`. This can cause enormous memory usage when importing large datasets. Where this method is invoked, the results are either a) discarded or b) used to recalculate the `CID`s of the nodes, which is bad for memory *and* CPU usage. This PR removes the buffering and `CID` recalculation for a nice speedup when adding large datasets. fixes #2310
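The change described above can be sketched as a streaming traversal: walk the DAG breadth-first and act on each node as it is visited, instead of buffering every descendant into one big list. This is an illustrative sketch only, with `getLinks` and `onNode` as stand-ins, not the real js-ipfs API:

```javascript
// A minimal sketch of the streaming idea in this PR: visit each DAG
// node as it is dequeued instead of accumulating all descendants.
// `getLinks` and `onNode` are illustrative stand-ins.
function walkDag (root, getLinks, onNode) {
  const queue = [root]
  const seen = new Set()

  while (queue.length > 0) {
    const cid = queue.shift()
    if (seen.has(cid)) continue // DAG nodes can be linked more than once
    seen.add(cid)
    onNode(cid) // handle the node now; nothing accumulates
    for (const link of getLinks(cid)) {
      queue.push(link)
    }
  }
}
```

With this shape, memory stays proportional to the traversal frontier plus the `seen` set of keys, rather than to every node in the DAG.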
achingbrain
commented
Aug 20, 2019
Member
Holy wow 3s faster than go-ipfs!
alanshaw
suggested changes
Aug 21, 2019
.aegir.js
Outdated
  module.exports = {
-   bundlesize: { maxSize: '689kB' },
+   bundlesize: { maxSize: '756KB' },
Member
I'm not sure this change is needed. On this branch I'm getting the following:
⨎ npx aegir build --bundlesize
Child
1466 modules
Child
1466 modules
PASS ./dist/index.min.js: 682.3KB < maxSize 756KB (gzip)
Member
Author
Will revert since we're disabling it anyway
  - stage: check
    script:
-   - npx aegir build --bundlesize
+   # - npx aegir build --bundlesize
Member
Note to self: we need a sister PR that re-enables this after it is merged.
src/core/components/pin.js
Outdated
  q.push({ cid })
}

function getIndirectKeys (callback) {
Member
You need to pass the `preload` option from `pin.ls` here and then pass it to `walkDag`.
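The suggestion above is about threading an option through nested calls rather than dropping it. A hedged sketch of that shape, where all names (`walkDag`, `getIndirectKeys`, the root CID) are assumptions rather than the actual js-ipfs internals:

```javascript
// Illustrative sketch of the review suggestion: forward the `preload`
// option from the public call down to the DAG walk instead of letting
// an inner default swallow it. Names are assumptions.
function walkDag (cid, { preload = false } = {}) {
  // a real implementation would fetch and traverse blocks here,
  // preloading them when asked to
  return { cid, preload }
}

function getIndirectKeys (options = {}) {
  // forward the caller's option rather than silently using a default
  return walkDag('QmRootCid', { preload: options.preload })
}
```

Without the forwarding line, a caller's `preload: true` would never reach the traversal.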
Member
Author
For certain datasets under certain conditions. Not cracking out the 🍾 just yet...
achingbrain added a commit that referenced this pull request on Aug 22, 2019: Port of #2372 into gc branch to ease merging
achingbrain added a commit that referenced this pull request on Aug 22, 2019: Port of #2372 into gc branch to ease merging
achingbrain added a commit that referenced this pull request on Aug 22, 2019: Port of #2372 into gc branch to ease merging
alanshaw pushed a commit that referenced this pull request on Aug 23, 2019: Port of #2372 into gc branch to ease merging
alanshaw pushed a commit that referenced this pull request on Aug 26, 2019: Port of #2372 into gc branch to ease merging
In my (non-representative, may need all the other unfinished async/iterator stuff) testing, importing a folder of 4MB files totalling about 5GB with content from `/dev/urandom` into a fresh repo with a daemon running in the background is now:

[benchmark screenshot]

Which is nice.