This repository was archived by the owner on Aug 2, 2021. It is now read-only.
I think we should discuss how and when we are going to tackle a refactor of Kademlia. We already have more than five known issues that the current implementation does not address:
Kademlia suggests peers when a node retrieves remote chunks without knowing how utilised those peers are, which results in suboptimal recommendations when we have a set of N peers that are equally distant from a given chunk - Kademlia peers are not utilised adequately during retrieval #1533
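As a rough sketch of what utilisation-aware suggestion could look like: among candidates with the same proximity to the chunk, prefer the one with the fewest outstanding requests. The `peer` struct, its fields and `suggestForRetrieval` are hypothetical names for illustration, not the existing Kademlia API.

```go
package main

import "fmt"

// peer is a hypothetical view of a connected peer; the field names are
// illustrative and are not taken from the Swarm code base.
type peer struct {
	id       string
	po       int // proximity order to the requested chunk address
	inFlight int // outstanding retrieve requests sent to this peer
}

// suggestForRetrieval returns the candidate with the highest proximity order
// and, among candidates with equal proximity, the one with the fewest
// in-flight requests. It returns false if there are no candidates.
func suggestForRetrieval(candidates []peer) (peer, bool) {
	if len(candidates) == 0 {
		return peer{}, false
	}
	best := candidates[0]
	for _, p := range candidates[1:] {
		if p.po > best.po || (p.po == best.po && p.inFlight < best.inFlight) {
			best = p
		}
	}
	return best, true
}

func main() {
	peers := []peer{
		{id: "a", po: 3, inFlight: 12},
		{id: "b", po: 3, inFlight: 2},
		{id: "c", po: 1, inFlight: 0},
	}
	if p, ok := suggestForRetrieval(peers); ok {
		fmt.Println(p.id) // prints "b": same proximity as "a", but less loaded
	}
}
```

In-flight request count is only one possible measure of utilisation; bandwidth or response latency could be plugged into the same tie-break.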
The number of connections per bin is inconsistent - sometimes it is 2, sometimes it is 20 - so we need a more deterministic way to build up a Kademlia table - investigate kademlia connectivity and suggest peer functionality #1436
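One way to make this more deterministic might be to check every bin against a target size range and report which bins need more connections and which have too many. The sketch below assumes such a range exists; `checkBins`, `binReport`, `minSize` and `maxSize` are made-up names and values, not current Kademlia parameters.

```go
package main

import "fmt"

// binReport describes a bin whose size falls outside the target range.
type binReport struct {
	bin     int
	size    int
	missing int // connections needed to reach minSize
	excess  int // connections above maxSize
}

// checkBins counts connections per bin from their proximity orders and
// reports every bin whose size is outside [minSize, maxSize].
// Proximity orders outside [0, bins) are ignored in this sketch.
func checkBins(pos []int, bins, minSize, maxSize int) []binReport {
	counts := make([]int, bins)
	for _, po := range pos {
		if po >= 0 && po < bins {
			counts[po]++
		}
	}
	var out []binReport
	for bin, n := range counts {
		switch {
		case n < minSize:
			out = append(out, binReport{bin: bin, size: n, missing: minSize - n})
		case n > maxSize:
			out = append(out, binReport{bin: bin, size: n, excess: n - maxSize})
		}
	}
	return out
}

func main() {
	// Proximity orders of eight connected peers: five in bin 0, two in bin 1, one in bin 2.
	pos := []int{0, 0, 0, 0, 0, 1, 1, 2}
	for _, r := range checkBins(pos, 3, 2, 4) {
		fmt.Printf("bin %d: size=%d missing=%d excess=%d\n", r.bin, r.size, r.missing, r.excess)
	}
}
```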
Currently a node pull syncs with all peers in a given bin. With push sync we might want to disable syncing of all bins < depth (or not?) and only sync with all our peers within our depth (>= depth). If we still decide to keep pull sync on lower bins (0, 1, etc. < depth), we should definitely not sync with all our peers within bin 0, but only a few. Basically there needs to be a distinction between peers - some peers should be available for retrieve requests, some peers should be available for syncing, and this should be more explicit. Right now in the Kademlia impl. we have a single container with conns, so we should think about how we want to design this.
Light nodes - they need to have connections with other peers and have a Kademlia table so that they issue retrieve requests properly, but ideally they should not appear in the Kademlia table of full nodes, as we don't want Kademlia to suggest them for syncing or for other caps that they don't have. However, it makes sense for Light nodes to share their view of the network with Full nodes, so it seems like there is benefit in them partly running the hive protocol?
Kademlia connectivity state saving and restoring - should be more deterministic - investigate restarting of networks and traffic incurred #1396. If we restart our node, we should prefer nodes that we were recently connected to, so that we don't incur syncing costs... (FYI our smoke tests suffer from this if you just restart a deployment and nodes connect to new peers and start historic syncing).
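For the restart case, a minimal sketch of "prefer recently connected peers" could be to persist a last-connected timestamp per known peer and dial in that order after a restart. `knownPeer`, `lastConnected` and `dialOrder` are hypothetical names; the sketch assumes such a timestamp would be stored, which is not something the issue confirms exists today.

```go
package main

import (
	"fmt"
	"sort"
)

// knownPeer is a hypothetical persisted record of a peer.
type knownPeer struct {
	addr          string
	lastConnected int64 // Unix seconds of the last connection; 0 means never connected
}

// dialOrder returns a copy of peers sorted so that the most recently
// connected peers come first and never-connected peers come last.
// The input slice is left unchanged.
func dialOrder(peers []knownPeer) []knownPeer {
	ordered := make([]knownPeer, len(peers))
	copy(ordered, peers)
	sort.SliceStable(ordered, func(i, j int) bool {
		return ordered[i].lastConnected > ordered[j].lastConnected
	})
	return ordered
}

func main() {
	peers := []knownPeer{
		{addr: "a", lastConnected: 100},
		{addr: "b", lastConnected: 200},
		{addr: "c"}, // known but never connected
	}
	for _, p := range dialOrder(peers) {
		fmt.Println(p.addr)
	}
}
```

A real implementation would still have to respect bin saturation, so recency would more likely be a tie-break within a bin than a global ordering.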
Visibility over Kademlia (some connections and known peers are hidden) and usage of peers can be improved - improve kademlia table output #1403 - currently we don't have a good dashboard à la torrent client, where we can see how many chunks we have sent to or received from a peer and how many are in flight. It'd be nice to have this so that we can increase the throughput of Swarm in general.
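The per-peer numbers for such a dashboard could come from a small counter table like the one sketched below. All names here (`statsTable`, `peerStats`, `RequestSent`, `ChunkReceived`, `ChunkSent`, `Snapshot`) are invented for illustration and do not refer to existing Swarm types.

```go
package main

import (
	"fmt"
	"sync"
)

// peerStats holds the counters a dashboard could display for one peer.
type peerStats struct {
	Sent     int // chunks we delivered to the peer
	Received int // chunks the peer delivered to us
	InFlight int // retrieve requests sent to the peer and not yet answered
}

// statsTable is a concurrency-safe map from peer ID to its counters.
type statsTable struct {
	mu    sync.Mutex
	peers map[string]*peerStats
}

func newStatsTable() *statsTable {
	return &statsTable{peers: make(map[string]*peerStats)}
}

// entry returns the counters for id, creating them if needed.
// The caller must hold t.mu.
func (t *statsTable) entry(id string) *peerStats {
	s, ok := t.peers[id]
	if !ok {
		s = &peerStats{}
		t.peers[id] = s
	}
	return s
}

// RequestSent records a retrieve request sent to the peer.
func (t *statsTable) RequestSent(id string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.entry(id).InFlight++
}

// ChunkReceived records a chunk delivered by the peer and settles one
// in-flight request if there is one.
func (t *statsTable) ChunkReceived(id string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	s := t.entry(id)
	s.Received++
	if s.InFlight > 0 {
		s.InFlight--
	}
}

// ChunkSent records a chunk we delivered to the peer.
func (t *statsTable) ChunkSent(id string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.entry(id).Sent++
}

// Snapshot returns a copy of the counters for the peer.
func (t *statsTable) Snapshot(id string) peerStats {
	t.mu.Lock()
	defer t.mu.Unlock()
	return *t.entry(id)
}

func main() {
	tbl := newStatsTable()
	tbl.RequestSent("peer-1")
	tbl.RequestSent("peer-1")
	tbl.ChunkReceived("peer-1")
	tbl.ChunkSent("peer-1")
	fmt.Printf("%+v\n", tbl.Snapshot("peer-1"))
}
```

The same in-flight counter could also serve as the utilisation signal when suggesting peers for retrieval.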
Move loading and storing of Kademlia known and connected peers outside of the hive protocol?
I suggest we discuss these soon and decide how and when to tackle them.