Improving Latency Measurement #166
Description
#165 outlines latency measurement in Bitswap. The latency of the connection to a peer is maintained per-session, by measuring the time between a request for a CID and receipt of the corresponding block.
There are a few issues with measuring latency in this fashion:
1. Multiple sessions may concurrently request the same block. For example:
   - session A requests CID1
   - `<1 second>` passes
   - session B requests CID1
   - `<100 ms>` passes
   - the block for CID1 arrives (from the request by session A)
   - session A records latency as `<1s + 100ms>`
   - session B records latency as `<100ms>`
2. Incoming blocks are processed one at a time. The latency measurement is adjusted for each block in a message (a message may contain many blocks).
3. If a peer doesn't have the block, it simply doesn't respond. For broadcast requests, we ignore timeouts. For targeted requests (requests sent to specific peers) we calculate latency based on the timeout. This isn't really accurate, as the peer may not have responded simply because it didn't have the block.
4. Ideally we would measure the "bandwidth delay product". The bandwidth delay product is the `<bandwidth of the connection> x <the latency>`. It measures the amount of data that can fit in the pipe, and can be used to ensure that the pipe is always as full as possible.
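As a rough illustration of the bandwidth delay product from the last item, here is a minimal Python sketch of the arithmetic. The function names, the 10 MB/s bandwidth, the 100 ms latency, and the 256 KiB block size are all assumptions chosen for the example; none of them come from Bitswap.

```python
import math


def bandwidth_delay_product(bandwidth_bytes_per_s: int, latency_ms: int) -> int:
    """Bytes that fit "in the pipe": bandwidth x latency (integer arithmetic)."""
    return bandwidth_bytes_per_s * latency_ms // 1000


def requests_to_fill_pipe(bdp_bytes: int, block_size_bytes: int) -> int:
    """Blocks to keep in flight so the pipe stays full (rounded up, at least 1)."""
    return max(1, math.ceil(bdp_bytes / block_size_bytes))


# Assumed figures: a 10 MB/s connection with 100 ms latency.
bdp = bandwidth_delay_product(10_000_000, 100)
print(bdp)  # 1000000 bytes in flight

# Assumed block size of 256 KiB.
print(requests_to_fill_pipe(bdp, 256 * 1024))  # 4
```

Under these assumed numbers, roughly 1 MB must be outstanding at any time, which works out to about four 256 KiB blocks requested ahead of what has already arrived.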
Issues 1 & 2 can be addressed by measuring latency per-peer instead of per-session. This would also likely improve the accuracy of latency measurement, as each peer's estimate would draw on a larger sample size.
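A minimal Python sketch of what per-peer measurement could look like follows. `PeerLatencyTracker` and its methods are hypothetical names, not part of Bitswap. The smoothing (an exponentially weighted moving average) and the choice to take the shortest elapsed time as the single sample for a message are assumptions made for illustration.

```python
class PeerLatencyTracker:
    """Hypothetical per-peer latency tracker (not the Bitswap implementation)."""

    def __init__(self, alpha: float = 0.5):
        self.alpha = alpha   # weight given to each new sample (assumed EWMA smoothing)
        self.sent_at = {}    # (peer, cid) -> time the first outstanding request was sent
        self.latency = {}    # peer -> smoothed latency in seconds

    def request_sent(self, peer: str, cid: str, now: float) -> None:
        # Keep only the first request time, so a later request for the same CID
        # from another session does not produce a second, shorter measurement.
        self.sent_at.setdefault((peer, cid), now)

    def message_received(self, peer: str, cids: list, now: float) -> None:
        # Take at most one sample per message, however many blocks it carries.
        samples = [
            now - self.sent_at.pop((peer, cid))
            for cid in cids
            if (peer, cid) in self.sent_at
        ]
        if not samples:
            return  # nothing outstanding for these CIDs; no sample
        sample = min(samples)  # assumption: use the most recently sent request
        if peer in self.latency:
            old = self.latency[peer]
            self.latency[peer] = self.alpha * sample + (1 - self.alpha) * old
        else:
            self.latency[peer] = sample


# The scenario from issue 1: two sessions request CID1 one second apart.
tracker = PeerLatencyTracker()
tracker.request_sent("peerX", "CID1", now=0.0)   # session A
tracker.request_sent("peerX", "CID1", now=1.0)   # session B
tracker.message_received("peerX", ["CID1"], now=1.1)
print(tracker.latency["peerX"])  # 1.1
```

Because the request time is keyed by peer and CID rather than by session, the block's arrival yields a single 1.1 s sample instead of one 1.1 s sample and one misleading 100 ms sample.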
Issue 3 can be addressed by either
- changing the protocol such that if a peer doesn't have a CID it sends back a `NOT_FOUND` message
- ignoring timeouts in latency calculations (this would be much simpler)
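Both options amount to the same rule: a timeout carries no latency information, while any actual reply does. A minimal Python sketch of that rule follows. The `Outcome` and `RequestResult` types and the `latency_sample` function are hypothetical, and `NOT_FOUND` stands for the proposed message, which the protocol as described here does not have.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Outcome(Enum):
    BLOCK = "block"          # the peer sent the block
    NOT_FOUND = "not_found"  # proposed reply; an assumption, not in the protocol as described
    TIMEOUT = "timeout"      # no reply before the deadline


@dataclass
class RequestResult:
    outcome: Outcome
    elapsed: float  # seconds between sending the request and the outcome


def latency_sample(result: RequestResult) -> Optional[float]:
    """Return a latency sample, or None if the result says nothing about latency."""
    if result.outcome is Outcome.TIMEOUT:
        # The peer may simply not have the block, so the timeout is not a measurement.
        return None
    # Any reply, whether a block or NOT_FOUND, is a real measurement.
    return result.elapsed


results = [
    RequestResult(Outcome.BLOCK, 0.12),
    RequestResult(Outcome.TIMEOUT, 10.0),
    RequestResult(Outcome.NOT_FOUND, 0.09),
]
samples = [s for s in map(latency_sample, results) if s is not None]
print(samples)  # [0.12, 0.09]
```

With only the second option in place, the `NOT_FOUND` case never occurs and the function reduces to dropping timeouts.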
Issue 4 is more complex, and needs to consider transport-specific nuances, such as TCP slow start. Latency may be a reasonable stand-in for now.