Related to: #1396
The way we build our Kademlia table right now is not well defined: in a given deployment we might end up with a lot of peers in bin 0, or with very few. For example:
7 connections in bin 0 (network of 20 nodes)
```
➜ ~ kubectl -n tony exec -ti swarm-private-9 -- /geth --exec="console.log(bzz.hive)" attach /root/.ethereum/bzzd.ipc
Defaulting container name to swarm.
Use 'kubectl describe pod/swarm-private-9 -n tony' to see all of the containers in this pod.
INFO [06-08|20:08:15.464] Bumping default cache on mainnet provided=1024 updated=4096
=========================================================================
commit hash: 53457a1fd
Sat Jun 8 20:08:15 UTC 2019 KΛÐΞMLIΛ hive: queen's address: efb323de97afdaf1d59cff99f239552d0b86864afac696a1d8a765bdbe0850ec
population: 15 (20), NeighbourhoodSize: 2, MinBinSize: 2, MaxBinSize: 4
000 7 2d5f 6006 7408 7090 | 12 2d5f (0) 1d0e (0) 117f (0) 0f53 (0)
001 6 8185 84a6 982f a0e9 | 6 8185 (0) 84a6 (0) 982f (0) a0e9 (0)
============ DEPTH: 2 ==========================================
002 0 | 0
003 1 f4a2 | 1 f4a2 (0)
004 0 | 0
005 0 | 0
006 0 | 0
007 1 eefd | 1 eefd (0)
008 0 | 0
009 0 | 0
010 0 | 0
011 0 | 0
012 0 | 0
013 0 | 0
014 0 | 0
015 0 | 0
=========================================================================
```
vs
2 connections in bin 0 (same network of 20 nodes)
```
➜ ~ kubectl -n tony exec -ti swarm-private-10 -- /geth --exec="console.log(bzz.hive)" attach /root/.ethereum/bzzd.ipc
Defaulting container name to swarm.
Use 'kubectl describe pod/swarm-private-10 -n tony' to see all of the containers in this pod.
INFO [06-08|20:09:15.434] Bumping default cache on mainnet provided=1024 updated=4096
=========================================================================
commit hash: 53457a1fd
Sat Jun 8 20:09:15 UTC 2019 KΛÐΞMLIΛ hive: queen's address: eefdfe73b0f47187312c1fdcea5377b7668d6f6cbee33c4e4c344475a53e7edd
population: 8 (20), NeighbourhoodSize: 2, MinBinSize: 2, MaxBinSize: 4
000 2 2d5f 5ab8 | 12 2d5f (0) 1d0e (0) 117f (0) 0f53 (0)
001 4 8185 a0e9 b373 b8ec | 6 b373 (0) b8ec (0) a0e9 (0) 982f (0)
============ DEPTH: 2 ==========================================
002 0 | 0
003 1 f4a2 | 1 f4a2 (0)
004 0 | 0
005 0 | 0
006 0 | 0
007 1 efb3 | 1 efb3 (0)
008 0 | 0
009 0 | 0
010 0 | 0
011 0 | 0
012 0 | 0
013 0 | 0
014 0 | 0
015 0 | 0
=========================================================================
undefined
```
The number of connections per bin is supposed to stay within the range `MinBinSize: 2`, `MaxBinSize: 4`, but this doesn't hold, and we've known that for a while.
This is problematic, because half of the chunks for a given upload end up in bin 0, so once we trigger retrieve requests, we will be requesting 1/2 of all chunks from our peers in bin 0. If we want to support larger files in Swarm (100MB ... 100GB), as defined in our active Epics, this won't be feasible with only a small number of peers there.
At the same time, our current implementation establishes syncing with all peers, meaning that if we have a lot of peers in bin 0, we would be syncing 1/2 of our chunks with all of them, which is wasteful. We should make an effort to sync content in Swarm without spamming the whole network.
Because of the issues above, I suggest we review our Kademlia functionality and its current implementation, and discuss (and possibly implement) something along the lines of:
- Make sure we have a lot of peers in bin 0, as they are responsible for retrieve requests for half of the chunks of any given download, and we should spread fetch requests across all of them. For example, if we are supposed to fetch a 4GB file in a timely fashion, then 1/2 * 4GB == 2GB worth of chunks will be retrieved from bin 0 peers, so we need at least ~10 peers in bin 0 in order to deliver an adequate user experience.
- Make sure we don't establish syncing with all our peers in a given bin (including bin 0), but only with a small number K of them, where K is around 3-4 (the exact value is TBD).
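To make the second point concrete, here is a minimal sketch of capping syncing at K peers per bin instead of syncing with every connected peer. All names (`chooseSyncPeers`, the random selection policy) are hypothetical and not part of the current swarm codebase; the real policy might prefer the longest-connected or most reliable peers rather than a random subset:

```go
package main

import (
	"fmt"
	"math/rand"
)

// chooseSyncPeers returns at most k peers from a bin to establish
// syncing with. If the bin has k peers or fewer, all of them are used;
// otherwise a random subset of size k is picked. Hypothetical sketch,
// not the actual swarm implementation.
func chooseSyncPeers(binPeers []string, k int) []string {
	if len(binPeers) <= k {
		return binPeers
	}
	idx := rand.Perm(len(binPeers))[:k]
	out := make([]string, 0, k)
	for _, i := range idx {
		out = append(out, binPeers[i])
	}
	return out
}

func main() {
	// bin 0 of swarm-private-9 above had 7 connected peers
	bin0 := []string{"2d5f", "6006", "7408", "7090", "8185", "84a6", "982f"}
	fmt.Println(chooseSyncPeers(bin0, 3)) // sync with only 3 of the 7
}
```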
In general, I think that if we have N peers in bin 0, then N should be around 1/2 of all our peers, and probably 1/2 of the maxpeers setting. The number of peers in bins 1..16 should probably be 1/2^(bin_number+1) * maxpeers.
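The proposed distribution can be sketched as follows. The function name `targetBinSize` and the integer rounding are my own assumptions; the formula is simply 1/2^(bin+1) * maxpeers from above, which also covers bin 0 (1/2 of maxpeers):

```go
package main

import "fmt"

// targetBinSize computes the proposed number of peers for a bin:
// maxpeers / 2^(bin+1), i.e. half of maxpeers for bin 0, a quarter
// for bin 1, and so on. Hypothetical sketch, rounding TBD.
func targetBinSize(bin, maxpeers int) int {
	return maxpeers >> uint(bin+1)
}

func main() {
	maxpeers := 32
	for bin := 0; bin < 5; bin++ {
		fmt.Printf("bin %d: target %d peers\n", bin, targetBinSize(bin, maxpeers))
	}
	// with maxpeers=32 this yields 16, 8, 4, 2, 1 for bins 0..4
}
```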
cc @acud @zelig @homotopycolimit