-
Notifications
You must be signed in to change notification settings - Fork 38.7k
Don't query all DNS seeds at once #15558
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsNo conflicts as of last run. |
70d24bd to
9f36b04
Compare
|
Wouldn't this mean if one DNS seed is compromised, it could potentially feed sybil addresses to nodes and its victims wouldn't have any honest nodes (eg, from other DNS seeds)? Does the issue it intends to address actually exist? I would think DNS queries would get cached and proxied by ISP DNS servers sufficiently to deny any information to the real DNS servers, no? |
|
3 seems like an OK number to query for at once, but a reasonable thing to bikeshed a bit. Code looks good to me. |
|
Yeah, my comment in IRC was that one creates an immediate eclipse vulnerability. Three sounds reasonable. We also really should look at why this is triggered as much as it is, locally I see it run on every restart these days. |
|
Concept ACK.
But only for new nodes, I suppose? If it already has gossiped peers before, then 'query one DNS seed' seems more reasonable. But okay, 3 is a safer option in any case.
My guess would be that the peers it tries initially to connect to might not be selected for connectability. Maybe 11 seconds is also too little. I think it would be fine to increase this to say, 30 or even 60 seconds. |
|
It may just be that we need to try connections faster initially. ... I dunno I really have the impression that it was frequently not using dns seeds in the past but is now, but I don't have hard data to support that. In theory feelers should mean that our tried table is better than ever before, but it might just be that there has been a lot of node churn lately. |
|
When the code was added circa 2014 in 2e7009d, 11 seconds was completely arbitrary. It was always the intent to adjust that number based on field experience. It sounds like it needs to increase. The driving theory is to make DNS seeds a fallback and "give it a good try" to avoid querying DNS seeds at all. The code already skips the delay for an empty AddrMan -- ie. fresh bitcoind install -- thus narrowing the user impact of a longer delay to existing nodes that presumably have addrman data to sort through. It boils down to a question of how much dependence should a node have on DNS seeds, in general? 300 seconds is a reasonable answer for the new delay, if we want to give AddrMan time to "warm up" and preserve privacy a bit more. |
IMO nodes should only consult DNS seeds at first run, and then never again. |
|
Concept ACK |
|
Concept ACK with > 1 seed per round. utACK 9f36b04fa039d4ddbd9a4b559a62a5d6e1a873e5 (3 seeds) The frequent DNS queries recently have hopefully been addressed with #15486.
I like that idea as well, but for another PR, as it requires much more thought than this change.
As a GUI use I'd get antsy if "nothing happens" for 15+ seconds, but as long as there's some indicator that we're trying various new peers that should be fine. In that case we can wait much longer before trying the DNS again. |
luke-jr
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
utACK, with a couple of minor suggestions
|
@naumenkogs Did you want to take a look here? This has a number of Concept ACKs, have tagged with |
|
Concept ACK |
|
First of all, could we update the initial message of the PR so that it's clear that it suggests connecting to 3 (not one) without reading full discussion? Maybe strike-through would work. The choice of 3 seeds and the general idea seems reasonable to me. |
Instead, when necessary, query 3. If that leads to a sufficient number of connects, stop. If not, query 3 more, and so on.
9f36b04 to
6170ec5
Compare
|
Addressed the comments above. |
|
ACK 6170ec5 With When I restart it sometimes fetches (3) seeds, about 10 seconds after When I delete |
|
One high-level concern with this is the monoculture of DNS seed software we have, if you select three at random you're almost certain to get three seeds serving from sipa's seeder implementation, whereas we should really be trying to diversify a little bit (not to mention get things like dnssec going, which I dont believe sipa's seeder support). Otherwise, Concept ACK. |
|
@naumenkogs @luke-jr Did you want to re-review here? @sdaftuar This might also interest you? |
|
ACK 6170ec5 |
|
utACK 6170ec5 - I think the risk of a single seeder codebase is orthogonal to this PR. Such risks could also be interpreted differently (diversity could also increase the risk based on the threat model). |
fanquake
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK 6170ec5 - Agree with the reasoning behind the change. Did some testing with and without -forcednsseed and/or a peers.dat and monitored the DNS activity.
src/bitcoind -debug=net | grep -i -E 'dns|net|p2p|peers'
2019-09-23T04:36:24Z Loaded 0 banned node ips/subnets from banlist.dat 0ms
2019-09-23T04:36:24Z net: setting try another outbound peer=false
2019-09-23T04:36:29Z init message: Loading P2P addresses...
2019-09-23T04:36:29Z Loaded 6640 addresses from peers.dat 18ms
2019-09-23T04:36:29Z init message: Starting network threads...
2019-09-23T04:36:29Z net thread start
2019-09-23T04:36:29Z dnsseed thread start
2019-09-23T04:36:40Z P2P peers available. Skipped DNS seeding.
2019-09-23T04:36:40Z dnsseed thread exitsrc/bitcoind -debug=net | grep -i -E 'dns|net|p2p|peers'
2019-09-23T04:20:17Z Loaded 0 banned node ips/subnets from banlist.dat 0ms
2019-09-23T04:20:17Z net: setting try another outbound peer=false
2019-09-23T04:20:22Z init message: Loading P2P addresses...
2019-09-23T04:20:22Z Loaded 2863 addresses from peers.dat 7ms
2019-09-23T04:20:22Z init message: Starting network threads...
2019-09-23T04:20:22Z net thread start
2019-09-23T04:20:22Z dnsseed thread start
2019-09-23T04:20:33Z Loading addresses from DNS seed seed.btc.petertodd.org
2019-09-23T04:20:33Z Loading addresses from DNS seed dnsseed.emzy.de
2019-09-23T04:20:33Z Loading addresses from DNS seed dnsseed.bluematt.me
2019-09-23T04:20:44Z P2P peers available. Skipped DNS seeding.
2019-09-23T04:20:44Z dnsseed thread exitsrc/bitcoind -forcednsseed -debug=net | grep -i -E 'dns|net|p2p|peers'
2019-09-23T04:09:52Z Loaded 0 banned node ips/subnets from banlist.dat 0ms
2019-09-23T04:09:52Z net: setting try another outbound peer=false
2019-09-23T04:09:58Z init message: Loading P2P addresses...
2019-09-23T04:09:58Z Loaded 9738 addresses from peers.dat 23ms
2019-09-23T04:09:58Z init message: Starting network threads...
2019-09-23T04:09:58Z net thread start
2019-09-23T04:09:58Z dnsseed thread start
2019-09-23T04:09:58Z Loading addresses from DNS seed seed.bitcoin.sipa.be
2019-09-23T04:09:58Z Loading addresses from DNS seed seed.btc.petertodd.org
2019-09-23T04:09:58Z Loading addresses from DNS seed seed.bitcoinstats.com
2019-09-23T04:09:58Z Loading addresses from DNS seed dnsseed.bitcoin.dashjr.org
2019-09-23T04:09:58Z Loading addresses from DNS seed seed.bitcoin.jonasschnelli.ch
2019-09-23T04:09:58Z Loading addresses from DNS seed seed.bitcoin.sprovoost.nl
2019-09-23T04:09:58Z Loading addresses from DNS seed dnsseed.bluematt.me
2019-09-23T04:09:59Z Loading addresses from DNS seed dnsseed.emzy.de
2019-09-23T04:09:59Z 294 addresses found from DNS seeds
2019-09-23T04:09:59Z dnsseed thread exitsrc/bitcoind -forcednsseed -debug=net | grep -i -E 'dns|net|p2p|peers'
2019-09-23T04:11:34Z Loaded 0 banned node ips/subnets from banlist.dat 0ms
2019-09-23T04:11:34Z net: setting try another outbound peer=false
2019-09-23T04:11:39Z init message: Loading P2P addresses...
2019-09-23T04:11:39Z ERROR: DeserializeFileDB: Failed to open file /Users/michael/Library/Application Support/Bitcoin/peers.dat
2019-09-23T04:11:39Z Invalid or missing peers.dat; recreating
2019-09-23T04:11:39Z Flushed 0 addresses to peers.dat 4ms
2019-09-23T04:11:39Z init message: Starting network threads...
2019-09-23T04:11:39Z dnsseed thread start
2019-09-23T04:11:39Z net thread start
2019-09-23T04:11:39Z Loading addresses from DNS seed seed.bitcoin.sprovoost.nl
2019-09-23T04:11:39Z Loading addresses from DNS seed dnsseed.emzy.de
2019-09-23T04:11:39Z Loading addresses from DNS seed seed.bitcoin.jonasschnelli.ch
2019-09-23T04:11:39Z Loading addresses from DNS seed seed.bitcoin.sipa.be
2019-09-23T04:11:39Z Loading addresses from DNS seed seed.bitcoinstats.com
2019-09-23T04:11:40Z Loading addresses from DNS seed dnsseed.bluematt.me
2019-09-23T04:11:40Z Loading addresses from DNS seed dnsseed.bitcoin.dashjr.org
2019-09-23T04:11:40Z Loading addresses from DNS seed seed.btc.petertodd.org
2019-09-23T04:11:40Z 294 addresses found from DNS seeds
2019-09-23T04:11:40Z dnsseed thread exit| const std::vector<std::string> &vSeeds = Params().DNSSeeds(); | ||
| int found = 0; | ||
| for (const std::string& seed : seeds) { | ||
| // goal: only query DNS seed if address need is acute |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
goal: only query DNS seed if address need is acute
From my understanding, "acute" here is essentially on first run, or if you've got a peers.dat with low quality/unlikely to connect quickly peers?
Opened #16938 if anyone wants to brainstorm / discuss DNS seeder software diversity. |
6170ec5 Do not query all DNS seed at once (Pieter Wuille) Pull request description: Before this PR, when we don't have enough connections after 11 seconds, we proceed to query all DNS seeds in a fixed order, loading responses from all of them. Change this to to only query three randomly-selected DNS seed. If 11 seconds later we still don't have enough connections, try again with another one, and so on. This reduces the amount of information DNS seeds can observe about the requesters by spreading the load over all of them. ACKs for top commit: Sjors: ACK 6170ec5 sdaftuar: ACK 6170ec5 jonasschnelli: utACK 6170ec5 - I think the risk of a single seeder codebase is orthogonal to this PR. Such risks could also be interpreted differently (diversity could also increase the risk based on the threat model). fanquake: ACK 6170ec5 - Agree with the reasoning behind the change. Did some testing with and without `-forcednsseed` and/or a `peers.dat` and monitored the DNS activity. Tree-SHA512: 33f6be5f924a85d312303ce272aa8f8d5e04cb616b4b492be98832e3ff37558d13d2b16ede68644ad399aff2bf5ff0ad33844e55eb40b7f8e3fddf9ae43add57
I guess this should say "try again with another three" |
6170ec5 Do not query all DNS seed at once (Pieter Wuille) Pull request description: Before this PR, when we don't have enough connections after 11 seconds, we proceed to query all DNS seeds in a fixed order, loading responses from all of them. Change this to to only query three randomly-selected DNS seed. If 11 seconds later we still don't have enough connections, try again with another one, and so on. This reduces the amount of information DNS seeds can observe about the requesters by spreading the load over all of them. ACKs for top commit: Sjors: ACK 6170ec5 sdaftuar: ACK 6170ec5 jonasschnelli: utACK 6170ec5 - I think the risk of a single seeder codebase is orthogonal to this PR. Such risks could also be interpreted differently (diversity could also increase the risk based on the threat model). fanquake: ACK 6170ec5 - Agree with the reasoning behind the change. Did some testing with and without `-forcednsseed` and/or a `peers.dat` and monitored the DNS activity. Tree-SHA512: 33f6be5f924a85d312303ce272aa8f8d5e04cb616b4b492be98832e3ff37558d13d2b16ede68644ad399aff2bf5ff0ad33844e55eb40b7f8e3fddf9ae43add57
| // Avoiding DNS seeds when we don't need them improves user privacy by | ||
| // creating fewer identifying DNS requests, reduces trust by giving seeds | ||
| // less influence on the network topology, and reduces traffic to the seeds. | ||
| if (addrman.size() > 0 && seeds_right_now == 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Post-merge review: I think there's a bug here: if this loop is hit with addrman.size()==0 and seeds_right_now==0 it will then query the first seed, and set seeds_right_now==-1, at which point this condition will keep failing on future loops, preventing a delay between batches of 3 seeds, and preventing early exit. I think this means that new nodes with no peers.dat will still always immediately query all dns seeds. In #16939 I've made the logic if (addrman==0 && seeds_right_now==0) { current code } else if (seeds_right_now == 0) { seeds_right_now += 3; } which should give more sensible behaviour.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding was that this was intentional behavior (eg @Sjors points this out in #15558 (comment)), but perhaps this should be better commented.
It made sense to me that we would query all dns seeds when starting up for the first time, which is consistent with prior behavior, and that the purpose of this PR was to limit dns seed usage only in the case of nodes that have been online before.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems surprising to me. If it's intentional, could simplify the logic a bit, and make it:
if (forcednsseed || addrman.size() == 0) seeds_right_now = seeds.size();
for (seed : seeds) {
if (seeds_right_now == 0) { sleep(); check(); seeds_right_now += 3; }
lookup();
--seeds_right_now;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aligning with the code comments that are quoted above + @luke-jr 's comments in related PR, this seems like an adequate summary?
- Newly initialized nodes with no addrman data should query seeds immediately
- Nodes with healthy addrman should not query seeds at all
The latter point seems to maximally align with the code comment & user privacy
Summary: > Before this PR, when we don't have enough connections after 11 seconds, we proceed to query all DNS seeds in a fixed order, loading responses from all of them. > > Change this to to only query three randomly-selected DNS seed. If 11 seconds later we still don't have enough connections, try again with another one, and so on. > > This reduces the amount of information DNS seeds can observe about the requesters by spreading the load over all of them. This is a backport of Core [[bitcoin/bitcoin#15558 | PR15558]] Test Plan: `ninja all check-all` Reviewers: O1 Bitcoin ABC, #bitcoin_abc, jasonbcox Reviewed By: O1 Bitcoin ABC, #bitcoin_abc, jasonbcox Subscribers: jasonbcox Differential Revision: https://reviews.bitcoinabc.org/D8057
6170ec5 Do not query all DNS seed at once (Pieter Wuille) Pull request description: Before this PR, when we don't have enough connections after 11 seconds, we proceed to query all DNS seeds in a fixed order, loading responses from all of them. Change this to to only query three randomly-selected DNS seed. If 11 seconds later we still don't have enough connections, try again with another one, and so on. This reduces the amount of information DNS seeds can observe about the requesters by spreading the load over all of them. ACKs for top commit: Sjors: ACK 6170ec5 sdaftuar: ACK 6170ec5 jonasschnelli: utACK 6170ec5 - I think the risk of a single seeder codebase is orthogonal to this PR. Such risks could also be interpreted differently (diversity could also increase the risk based on the threat model). fanquake: ACK 6170ec5 - Agree with the reasoning behind the change. Did some testing with and without `-forcednsseed` and/or a `peers.dat` and monitored the DNS activity. Tree-SHA512: 33f6be5f924a85d312303ce272aa8f8d5e04cb616b4b492be98832e3ff37558d13d2b16ede68644ad399aff2bf5ff0ad33844e55eb40b7f8e3fddf9ae43add57
6170ec5 Do not query all DNS seed at once (Pieter Wuille) Pull request description: Before this PR, when we don't have enough connections after 11 seconds, we proceed to query all DNS seeds in a fixed order, loading responses from all of them. Change this to to only query three randomly-selected DNS seed. If 11 seconds later we still don't have enough connections, try again with another one, and so on. This reduces the amount of information DNS seeds can observe about the requesters by spreading the load over all of them. ACKs for top commit: Sjors: ACK 6170ec5 sdaftuar: ACK 6170ec5 jonasschnelli: utACK 6170ec5 - I think the risk of a single seeder codebase is orthogonal to this PR. Such risks could also be interpreted differently (diversity could also increase the risk based on the threat model). fanquake: ACK 6170ec5 - Agree with the reasoning behind the change. Did some testing with and without `-forcednsseed` and/or a `peers.dat` and monitored the DNS activity. Tree-SHA512: 33f6be5f924a85d312303ce272aa8f8d5e04cb616b4b492be98832e3ff37558d13d2b16ede68644ad399aff2bf5ff0ad33844e55eb40b7f8e3fddf9ae43add57
Before this PR, when we don't have enough connections after 11 seconds, we proceed to query all DNS seeds in a fixed order, loading responses from all of them.
Change this to to only query three randomly-selected DNS seed. If 11 seconds later we still don't have enough connections, try again with another one, and so on.
This reduces the amount of information DNS seeds can observe about the requesters by spreading the load over all of them.