Skip to content

Conversation

@brunoerg
Copy link
Contributor

@brunoerg brunoerg commented Jan 26, 2024

This PR adds a new flag -cleanupaddrman. When initializing a node with this flag, it will delete addresses from addrman (both new and tried tables) that do not belong to the supported networks (e.g. -onlynet). Deleting addresses that do not belong to the supported networks can avoid a lot of unnecessary addrman Select calls (especially when turning from clearnet + Tor/I2P/CJDNS to Tor/I2P/CJDNS).

  1. This flag is not enabled by default, if user intends to go back to the previous network, just do not set it.
  2. Addresses from non-supported networks will naturely be replaced in addrman since we only store addresses from networks we support (cleaning up them on init is a way to avoid spending resources on it).
  3. Avoiding these addresses in addrman can avoid exceeding the maximum number of tries in ThreadOpenConnections.
// If we didn't find an appropriate destination after trying 100 addresses fetched from addrman,
// stop this loop, and let the outer loop run again (which sleeps, adds seed nodes, recalculates
// already-connected network ranges, ...) before trying new addrman addresses.
nTries++;
if (nTries > 100)
    break;
  1. Specially turning from clearnet + Tor/I2P/CJDNS to Tor/I2P/CJDNS, this feature it will ensure we will relay more addresses from these networks to other peers.

@DrahtBot
Copy link
Contributor

DrahtBot commented Jan 26, 2024

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage

For detailed information about the code coverage, see the test coverage report.

Reviews

See the guideline for information on the review process.

Type Reviewers
Concept ACK 1440000bytes

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #29415 (Broadcast own transactions only via short-lived Tor or I2P connections by vasild)
  • #28960 (kernel: Remove dependency on CScheduler by TheCharlatan)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@DrahtBot DrahtBot added the P2P label Jan 26, 2024
@brunoerg
Copy link
Contributor Author

cc: @mzumsande

@brunoerg brunoerg force-pushed the 2024-01-addrman-delete branch from c4dec94 to cf44d43 Compare January 26, 2024 21:20
@DrahtBot
Copy link
Contributor

🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the
documentation.

Possibly this is due to a silent merge conflict (the changes in this pull request being
incompatible with the current code in the target branch). If so, make sure to rebase on the latest
commit of the target branch.

Leave a comment here, if you need help tracking down a confusing failure.

Debug: https://github.com/bitcoin/bitcoin/runs/20916324600

This function deletes addresses from
addrman that are not from the specified
networks.
This flag controls whether the user wants to
delete addresses from addrman that don't belong
to the supported networks in initialization.
@brunoerg brunoerg force-pushed the 2024-01-addrman-delete branch from cf44d43 to 9019237 Compare January 26, 2024 22:27
@naumenkogs
Copy link
Contributor

Having these records around might be useful too, if you go back to those networks.

Do we have any practical problems with these Select calls? Too much CPU time, orz disk reads, or memory use?

Not only this would motivate this code change and review use better, but also it would help us understand under which circumstances a node operator might actually want to do this....And then maybe the right thing would be to trigger this cleaning automatically rather than thinking a user will trigger it.

Or, what i think is even more likely, we may rather change Addrman architecture to facilitate this better in the first place.

@amitiuttarwar
Copy link
Contributor

+1 to the questions asked by @naumenkogs. I agree with the statement in the OP that this change would "avoid a lot of unnecessary addrman Select calls (especially when...", but would like additional context as to how that would tangibly impact the node operations or end user.

in general I am hesitant to add init params since it increases the combinatoric space of startup possibilities & they are hard to deprecate over time. maybe one idea (if we want to support this feature) could be to add a param to -onlynet instead of a whole new startup option.

@1440000bytes

This comment was marked as abuse.

@brunoerg
Copy link
Contributor Author

brunoerg commented Feb 2, 2024

@naumenkogs and @amitiuttarwar I just updated the description with more infos. Thanks!


Or, what i think is even more likely, we may rather change Addrman architecture to facilitate this better in the first place.

@naumenkogs: What do you mean by changing addrman architecture?

@mzumsande
Copy link
Contributor

Deleting addresses that do not belong to the supported networks can avoid a lot of unnecessary addrman Select calls

It would be good to have some intution for how big this effect is. For example, if you have an addrman with mostly IPv4/IPv6 addresses and a few onion ones then restart with -onlynet=onion, will it take noticeably longer to find peers? The worst case scenario would be if you had just a single onion address, but that doesn't seem realistic, so I'd be more interested about the case where you have between 5% and 10% onion addresses.

@vasild
Copy link
Contributor

vasild commented Feb 11, 2024

What about setting preferred_net

std::tie(addr, addr_last_try) = addrman.Select(false, preferred_net);

if -onlynet is used? Surely it is a waste to select an unreachable address:

bitcoin/src/net.cpp

Lines 2695 to 2697 in 7143d43

if (!g_reachable_nets.Contains(addr)) {
continue;
}

That would skew the outcome though. If addrman has 60% IPv4, 25% Tor and 15% I2P addresses right now it is more likely to select Tor over I2P. If we run with -onlynet=tor -onlynet=i2p and set preferred_net to Tor or I2P randomly (50% chance) to avoid the IPv4 addresses, then it will not be more likely to pick Tor. It would be relatively easy to extend that parameter to preferred_nets -- meaning to return an address from one of the given networks.

  1. Specially turning from clearnet + Tor/I2P/CJDNS to Tor/I2P/CJDNS, this feature it will ensure we will relay more addresses from these networks to other peers.

clearnet addresses might be useful to Tor peers if they are multi-homed. I think it is good that we send any address type to any peer. Instead of e.g. only Tor addresses to Tor peers.

@brunoerg
Copy link
Contributor Author

What about setting preferred_net

Yes, I previously discussed this options with @mzumsande. It might be an alternative, sure.

That would skew the outcome though. If addrman has 60% IPv4, 25% Tor and 15% I2P addresses right now it is more likely to select Tor over I2P. If we run with -onlynet=tor -onlynet=i2p and set preferred_net to Tor or I2P randomly (50% chance) to avoid the IPv4 addresses, then it will not be more likely to pick Tor. It would be relatively easy to extend that parameter to preferred_nets -- meaning to return an address from one of the given networks.

Sure, we can extend addrman's Select to deal with more than one network and return an address from one of the given networks.

clearnet addresses might be useful to Tor peers if they are multi-homed. I think it is good that we send any address type to any peer. Instead of e.g. only Tor addresses to Tor peers.

Sure, this is something we can dicuss because not ensuring we're sending Tor addresses to Tor peers is also not positive. But I agree on relaying clearnet addresses for those nodes. I do not have this data, but I suppose that most Tor peers also run on clearnet?

@vasild
Copy link
Contributor

vasild commented Feb 13, 2024

not ensuring we're sending Tor addresses to Tor peers is also not positive

Right, if we sent solely IPv4 addresses to a Tor peer while we have Tor addresses, that seems like something that can be improved.

@brunoerg
Copy link
Contributor Author

I'll close this PR to work on a "less aggressive" approach (extending Select to support multiple networks). I will provide my benchmarks on that new PR.

Thanks, @vasild @mzumsande @naumenkogs @amitiuttarwar.

achow101 added a commit that referenced this pull request Sep 16, 2024
…enConnections`

e4e3b44 net: call `Select` with reachable networks in `ThreadOpenConnections` (brunoerg)
829becd addrman: change `Select` to support multiple networks (brunoerg)
f698636 net: add `All()` in `ReachableNets` (brunoerg)

Pull request description:

  This PR changes addrman's `Select` to support multiple networks and change `ThreadOpenConnections` to call it with reachable networks. It can avoid unnecessary `Select` calls and avoid exceeding the max number of tries (100), especially when turning a clearnet + Tor/I2P/CJDNS node to Tor/I2P/CJDNS. Compared to #29330, this approach is "less aggresive". It does not add a new init flag and does not impact address relay.

  I did an experiment of calling `Select` without passing a network until it finds an address from a network that compose 20% ~ 25% of the addrman (limited to 100 tries).

  ![Screenshot 2024-02-14 at 14 37 58](https://github.com/bitcoin/bitcoin/assets/19480819/7b6863a5-d7a6-40b6-87d5-01667c2de66a)

ACKs for top commit:
  achow101:
    ACK e4e3b44
  vasild:
    ACK e4e3b44
  naumenkogs:
    ACK e4e3b44

Tree-SHA512: e8466b72b85bbc2ad8bfb14471eb27d2c50d4e84218f5ede2c15a6fa3653af61b488cde492dbd398f7502bd847e95bfee1abb7e01092daba2236d3ce3d6d2268
@bitcoin bitcoin locked and limited conversation to collaborators Feb 12, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants