Avoid duplicate getheaders requests #6821
Conversation
If new blocks are announced while a node is still downloading headers, it starts requesting the same headers multiple times. See bitcoin#6755. With this patch, the client tries to guess whether it is still in the initial headers sync and avoids starting the duplicate requests if it is.
Thanks for trying to address this issue; I agree it is suboptimal that we could be in a situation where we're requesting the same headers from the same peer multiple times. However, I believe that sending a getheaders when a new block is announced is currently the only protection a node has against having chosen a bad peer for its initial headers sync. That is, if the peer a node chooses for initial headers sync never responds to the getheaders message, or never sends headers to get it caught up enough to start sending getheaders messages to other peers, the node will never sync up with the network. So my initial reaction is that this is perhaps not the best approach. I'd be inclined to solve this by trying to keep better per-node state (like the time of the last getheaders message, to avoid having multiple getheaders messages in flight to the same peer; and if a headers message comes back that doesn't include some announced block and isn't full, we could send another getheaders message, etc.). Definitely some edge cases here that need thought...
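As a rough sketch, the per-peer state suggested above might look like the following. All names here (CNodeHeaderState, MaybeSendGetHeaders, the cooldown value) are hypothetical and do not come from this PR or the codebase:

```cpp
// Hypothetical sketch of per-peer getheaders state, in the spirit of the
// suggestion above; none of these identifiers exist in the actual code.
#include <chrono>

struct CNodeHeaderState {
    // When we last sent getheaders to this peer; default-constructed if never.
    std::chrono::steady_clock::time_point m_last_getheaders_time{};
};

// Only send a new getheaders if none has been in flight "recently", so a
// block announcement cannot spawn a second parallel request chain.
bool MaybeSendGetHeaders(CNodeHeaderState& state,
                         std::chrono::seconds cooldown = std::chrono::seconds{120})
{
    const auto now = std::chrono::steady_clock::now();
    if (state.m_last_getheaders_time != std::chrono::steady_clock::time_point{} &&
        now - state.m_last_getheaders_time < cooldown) {
        return false; // a request to this peer is presumably still in flight
    }
    state.m_last_getheaders_time = now;
    return true; // the caller pushes the actual getheaders message
}
```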
I thought about doing something like this per-node as well; but in that case, couldn't the node still start one redundant "getheaders request chain" per peer when it receives inv messages? That still seems like a lot of waste. I understand your point about selecting a bad peer, though. My impression, however, is that this is a related but otherwise independent issue, no? In that case, wouldn't it make more sense to introduce a "timeout" after which the node chooses a different peer for requesting headers? This would solve the issue at the root (and is somewhat similar to the current protection with inv messages, except that it would be more predictable and less hackish).
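A rough, self-contained illustration of that timeout idea follows; every name in it is invented for this sketch, and the peer-selection logic is left to a caller-supplied function:

```cpp
// Hypothetical sketch of the timeout idea: if the chosen headers-sync peer
// has not completed headers sync within some deadline, pick another peer.
#include <chrono>
#include <functional>

struct HeadersSyncState {
    int m_sync_peer_id{-1};  // peer currently asked for headers; -1 if none
    std::chrono::steady_clock::time_point m_sync_started{};
};

// If the current sync peer has stalled past the deadline, ask the
// caller-supplied selector for a different peer and restart the clock.
void CheckHeadersSyncTimeout(HeadersSyncState& sync,
                             const std::function<int()>& pick_new_peer,
                             std::chrono::minutes deadline = std::chrono::minutes{10})
{
    const auto now = std::chrono::steady_clock::now();
    if (sync.m_sync_peer_id != -1 && now - sync.m_sync_started > deadline) {
        sync.m_sync_peer_id = pick_new_peer(); // restart headers sync elsewhere
        sync.m_sync_started = now;
    }
}
```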
I don't think there's agreement on this, and development seems to be stuck. @sdaftuar @domob1812 how to proceed here? |
I think that this issue should be resolved - while it usually is not a big problem for Bitcoin, it is simply wrong (and could lead to instability) to send duplicate getheaders requests. I think (obviously) that my patch improves the situation, but if there is a better solution, I'm all in favour of it as well. |
No objection to solving the underlying issue, but I don't think we should adopt the approach taken in this PR. This is a relatively minor problem (triggered only when your sync peer announces a new block during initial headers synchronization), and I don't think it makes sense to patch up a minor problem in a way that could create new problems. In addition to opening up a node to never syncing with the network if a bad initial sync peer is chosen, as I mentioned above, I think this PR has other issues as well, such as this: Imagine you have a [...]. The existing behavior would just send the [...]. I think any attempt to solve the issue here needs to be done more carefully, in a way that doesn't introduce new bugs (perhaps by trying an approach along the lines of what I wrote before about keeping more per-peer state).
Closing this PR, then. I agree it would be nice to have this fixed, but it seems too minor an issue to justify a quick fix that potentially breaks syncing.
The current logic for syncing headers may lead to lots of duplicate getheaders requests being sent: if a new block arrives while the node is in headers sync, it will send getheaders in response to the block announcement. When the headers arrive, the message will be of maximum size, so a follow-up request will be sent, all of that in addition to the existing headers sync. This creates a second "chain" of getheaders requests; if more blocks arrive, it may even lead to arbitrarily many parallel chains of redundant requests. This patch changes the behaviour to only request more headers after a maximum-sized message when it contained at least one unknown header. This avoids sustaining parallel chains of redundant requests. Note that this patch avoids the issues raised in the discussion of bitcoin#6821: there is no risk of the node being permanently blocked, since at the latest a newly arriving block will trigger a new getheaders request and restart syncing.
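The core of that changed continuation rule can be sketched as follows. MAX_HEADERS_RESULTS is the real 2,000-header protocol limit; the function name and its parameters are made up for illustration and are not the literal patch:

```cpp
// Sketch of the changed follow-up rule; not the literal diff.
static const unsigned int MAX_HEADERS_RESULTS = 2000; // protocol cap per headers message

// nReceived: headers in the message we just processed.
// nNewHeaders: how many of them were not yet in our block index.
bool ShouldRequestMoreHeaders(unsigned int nReceived, unsigned int nNewHeaders)
{
    // Old behaviour: any full-sized message triggered another getheaders,
    // so a redundant request chain kept itself alive indefinitely.
    // New behaviour: only follow up when the full message contained at least
    // one unknown header.
    return nReceived == MAX_HEADERS_RESULTS && nNewHeaders > 0;
}
```

The point of the condition is that a redundant chain only re-downloads headers the node already has, so its messages contain no unknown headers and the chain dies out after one round, while the genuine sync chain keeps satisfying the condition and makes progress.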
If new blocks are announced while a node is still downloading headers, it starts requesting the same headers multiple times. See #6755.
With this patch, the client tries to guess whether it is still in the initial headers sync and avoids starting duplicate requests if it is. I explained in code comments why this is safe. Here's a summary:
In general, I tried to be "conservative" when deciding to prevent a request. The node only considers itself to be in the initial headers sync when it last received a headers message of full size (2,000 headers) that built on top of its currently known best header and contained at least 12 previously unknown headers. If an attacker wants to fake this condition for a node that is actually up to date (to prevent it from getting a newly announced block), he or she needs to construct 12 headers with valid PoW; this seems like a lot of effort for a not-so-critical effect.
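As a sketch, that "still in initial headers sync" guess might look roughly like the following. The identifiers are invented for this example; only the 2,000-header message limit and the 12-header threshold come from the description above:

```cpp
// Illustrative sketch of the heuristic described above; identifiers are
// made up for this example and are not the PR's actual names.
static const unsigned int MAX_HEADERS_RESULTS = 2000; // a "full" headers message
static const unsigned int MIN_UNKNOWN_HEADERS = 12;   // PoW cost to fake the state

struct HeadersSyncGuess {
    unsigned int nLastMsgSize{0};    // headers in the last headers message
    unsigned int nLastMsgUnknown{0}; // how many of them were previously unknown
    bool fBuiltOnBestHeader{false};  // did it extend our current best header?
};

// Conservative guess: only assume "still in initial headers sync" when the
// last headers message was full, extended our best known header, and
// contained at least 12 previously unknown headers.
bool LikelyInInitialHeadersSync(const HeadersSyncGuess& g)
{
    return g.nLastMsgSize == MAX_HEADERS_RESULTS &&
           g.fBuiltOnBestHeader &&
           g.nLastMsgUnknown >= MIN_UNKNOWN_HEADERS;
}
```

The third condition is what makes faking the state expensive: to keep an up-to-date node believing it is still in initial sync, an attacker must mine at least 12 headers with valid proof-of-work.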