Pinning content#1509
Conversation
acud
left a comment
There was a problem hiding this comment.
@jmozah a few notes about this PR:
- the delta is huge (61 files so far). we'd have to break this down to smaller units
- there's no test coverage whatsoever for the new functionality
- the necessity of
pinCounter(which is always set to1in the api) is probably arguable pinCounterdefinitely should not beuint8. Imagine that within a certain pin you limit yourself to 255 instances of the same chunk, which is very low (just imagine all of the.DS_Storefiles and other possible examples of just 32 byte repetitions)- i'm not sure i understand why we would need to store information about whether to traverse a manifest as part of the localstore index. it seems like to much of high-level information to store somewhere
i think that a good next step would be to discuss the correct test vectors for this functionality and proceed from there
| // 1) Whether the file is a RAW file or not | ||
| // 2) Size of the pinned file or collection | ||
| // 3) the number of times that particular file or collection is pinned. | ||
| func (p *API) ListPinFiles() (map[string]FileInfo, error) { |
There was a problem hiding this comment.
If this function is used only in tests, maybe not to export it?
There was a problem hiding this comment.
No.. This is exposed outside too. This is the only way through which the node owner will know what files are pinned , their size and their type.
There was a problem hiding this comment.
OK. In general, exposing such listing functionality should be done with some form of pagination, to prevent very large responses. Maybe that can be considered later if needed.
There was a problem hiding this comment.
Is there any other command in swarm done with pagination? Just to see and understand. I agree that we should do a pagination... this will be especially useful for dapps which are pinning and pinning service operators like us. But as you said.. we should take it later.. I want to push this core functionality first.
There was a problem hiding this comment.
I do not think that there is in classical way. But manifests implement a tree that can be considered as a way not to represent all available data at once.
zelig
left a comment
There was a problem hiding this comment.
well done. this PR defo holds the record for most comments ever :)
janos
left a comment
There was a problem hiding this comment.
I thought that I have approved it, but I've only submitted a comment. Sorry, my last comment was intended to be an approval.
|
@jmozah thanks for your infinite patience and endless iterations on this. this is commendable and admirable. and once more thank you for the great work on this PR. it has made great distance and I think that the end-product is great. |
|
@acud Thanks for your persistence and nit picking :-). The end result looks great because of you. |
* 'master' of github.com:ethersphere/swarm: (54 commits) api, chunk, cmd, shed, storage: add support for pinning content (ethersphere#1509) docs/swarm-guide: cleanup (ethersphere#1620) travis: split jobs into different stages (ethersphere#1615) simulation: retry if we hit a collision on tcp/udp ports (ethersphere#1616) api, chunk: rename Tag.New to Tag.Create (ethersphere#1614) pss: instrumentation and refactor (ethersphere#1580) api, cmd, network: add --disable-auto-connect flag (ethersphere#1576) changelog: fix typo (ethersphere#1605) version: update to v0.4.4 unstable (ethersphere#1603) swarm: release v0.4.3 (ethersphere#1602) network/retrieve: add bzz-retrieve protocol (ethersphere#1589) PoC: Network simulation framework (ethersphere#1555) network: structured output for kademlia table (ethersphere#1586) client: add bzz client, update smoke tests (ethersphere#1582) swarm-smoke: fix check max prox hosts for pull/push sync modes (ethersphere#1578) cmd/swarm: allow using a network interface by name for nat purposes (ethersphere#1557) pss: disable TestForwardBasic (ethersphere#1544) api, network: count chunk deliveries per peer (ethersphere#1534) network/newstream: new stream! protocol base implementation (ethersphere#1500) swarm: fix bzz_info.port when using dynamic port allocation (ethersphere#1537) ...
Useful Pinning Links:
ethersphere/user-stories#10
https://swarmresear.ch/t/pinned-content-in-swarm/18
#1274
Pin during upload
Pin after upload
- swarm pin <top level manifest/file hash>
- Increase the pin counter for all the chunks in the merkle tree
Take care of different types of upload
Unpin pinned content
List pinned files
- display all the top level hashes that are pinned
- Also display their pin counters and size
During GC, don't collect pinned chunks