Skip to content

Conversation

@waketraindev
Copy link
Contributor

@waketraindev waketraindev commented Oct 20, 2025

This PR introduces AddrMan::AddrPolicy, a predicate that allows callers to
exclude addresses during selection. The policy returns true to skip a given
entry and is evaluated while holding AddrMan::cs.

The mechanism is used in CConnman::GetAddressesUnsafe() to filter out
banned and discouraged peers during address selection instead of removing
them afterward. This prevents P2P responses (GETADDR) and RPCs (getnodeaddresses) from
returning fewer results than requested when large portions of the address space are filtered.

Additional improvements:

  • Logs the number of filtered entries for diagnostics
  • Avoids redundant filtering passes in higher-level callers
  • Keeps locking behavior unchanged (the policy runs under AddrMan::cs)

Testing

  • Repro: ban or discourage a large set of addresses, then call
    getnodeaddresses 1, previously underfilled.
  • After this change, the RPC returns up to the requested count as expected.

No behavior change for unaffected callers.

@DrahtBot
Copy link
Contributor

DrahtBot commented Oct 20, 2025

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage & Benchmarks

For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/33663.

Reviews

See the guideline for information on the review process.

Type Reviewers
Concept ACK frankomosh
Stale ACK mzumsande

If your review is incorrectly listed, please copy-paste <!--meta-tag:bot-skip--> into the comment that the bot should ignore.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #34302 (fuzz: Restore SendMessages coverage in process_message(s) fuzz targets by maflcko)
  • #34162 (net: Avoid undershooting in GetAddressesUnsafe by fjahr)
  • #31650 (refactor: Avoid copies by using const references or by move-construction by maflcko)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

LLM Linter (✨ experimental)

Possible places where named args for integral literals may be used (e.g. func(x, /*named_arg=*/0) in C++, and func(x, named_arg=0) in Python):

  • addrman->GetAddr(/max_addresses=/0, /max_pct=/0, /network=/std::nullopt, /filtered=/false) in src/test/addrman_tests.cpp
  • addrman->GetAddr(/max_addresses=/0, /max_pct=/0, /network=/std::nullopt, /filtered=/false, /policy=/policyNoSkip) in src/test/addrman_tests.cpp
  • addrman->GetAddr(/max_addresses=/0, /max_pct=/0, /network=/std::nullopt, /filtered=/false, /policy=/policySkipAll) in src/test/addrman_tests.cpp
  • addrman->GetAddr(/max_addresses=/0, /max_pct=/0, /network=/std::nullopt, /filtered=/false, /policy=/policyPortPolicy) in src/test/addrman_tests.cpp

2025-12-28

@waketraindev waketraindev marked this pull request as ready for review October 20, 2025 18:17
Copy link
Contributor

@mzumsande mzumsande left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK

The main function of this is p2p (answering GetAddr requests from peers), RPCs like getnodeaddresses are of secondary importance.
Since the p2p use is also affected by this, the PR description should be changed to reflect this.

That said, it seems good to always try to return 1000 addresses in a GetAddr answer, so that peers don't know whether we have a long list of banned addresses or not.

@waketraindev waketraindev force-pushed the 2025-10-getaddressesunsafe-underfill branch from c8cb2ea to 70e2204 Compare October 22, 2025 10:55
@waketraindev
Copy link
Contributor Author

  • updated PR description and commit message to highlight P2p getaddr responses
  • corrected formatting

@waketraindev waketraindev force-pushed the 2025-10-getaddressesunsafe-underfill branch from 70e2204 to 934a781 Compare October 28, 2025 06:51
@waketraindev
Copy link
Contributor Author

Added AddrPolicy coverage in addrman_tests

@waketraindev waketraindev force-pushed the 2025-10-getaddressesunsafe-underfill branch from 934a781 to 1a89761 Compare October 28, 2025 07:01
@waketraindev waketraindev requested a review from brunoerg October 28, 2025 07:02
@waketraindev waketraindev force-pushed the 2025-10-getaddressesunsafe-underfill branch from 1a89761 to d7c5cbb Compare October 28, 2025 10:07
@waketraindev waketraindev force-pushed the 2025-10-getaddressesunsafe-underfill branch from d7c5cbb to 4f39b92 Compare October 28, 2025 14:34
@waketraindev
Copy link
Contributor Author

Added test verifying only expected addresses are returned and filtered ones excluded.

@waketraindev waketraindev requested a review from brunoerg October 29, 2025 11:34
@waketraindev waketraindev changed the title addrman, net: filter during address selection via AddrPolicy to avoid underfill addrman, net: Filter during address selection via AddrPolicy to avoid underfill Nov 3, 2025
@waketraindev waketraindev changed the title addrman, net: Filter during address selection via AddrPolicy to avoid underfill net: Filter during address selection via AddrPolicy to avoid underfill Nov 8, 2025
@DrahtBot DrahtBot added the P2P label Nov 8, 2025
@waketraindev waketraindev changed the title net: Filter during address selection via AddrPolicy to avoid underfill net: Filter addrman during address selection via AddrPolicy to avoid underfill Nov 8, 2025
Copy link
Contributor

@frankomosh frankomosh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK

@waketraindev waketraindev force-pushed the 2025-10-getaddressesunsafe-underfill branch from 4f39b92 to fb3efdb Compare November 17, 2025 11:19
Copy link
Contributor

@mzumsande mzumsande left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review ACK fb3efdb

As far as I can see, the reason for creating and passing an AddrPolicy is that the alternative (having addrman reach into banman to request if an addr is banned) would introduce coupling between the modules, while the current approach only introduces an implied lock order, which is somewhat of a potential footgun - but I don't really see a better solution.
For example, we probably wouldn't want to remove entries from addrman when they are banned, because bans are often temporary.

@DrahtBot DrahtBot requested a review from frankomosh December 22, 2025 23:00
Copy link
Contributor

@fjahr fjahr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While leaving some nit-ish comments I have realized that if we want to add such a generalized concept for filtering, we should implement the current filtering with it as well. E.g. network and filtered params go away and we only have AddrPolicy as param to GetAddr. Then use AddrPolicy to implement what network and filtered were doing. That's a bigger refactor but having this current mix with this redundancy is kind of ugly and I don't really think it's worth it alone.

src/addrman.h Outdated
* @param[in] max_pct Maximum percentage of addresses to return (0 = all). Value must be from 0 to 100.
* @param[in] network Select only addresses of this network (nullopt = all).
* @param[in] filtered Select only addresses that are considered good quality (false = all).
* @param[in] policy Optional predicate to exclude candidates during selection (true = exclude).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, the true = exclude part doesn't have the same meaning as above because this is not a bool parameter. Maybe update this to be more precise. The filtered option doesn't say that it's optional so you might add that while you touch this.

const auto now{Now<NodeSeconds>()};
std::vector<CAddress> addresses;
addresses.reserve(nNodes);
size_t filtered_addr_net = 0, filtered_addr_quality = 0, filtered_addr_policy = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A general problem with the separate tracking like this is that it's order dependent and so the values may actually be somewhat misleading. I.e. if there are some addrs filtered by policy but they are already caught by the net filter they won't be included here. Doing this in a cleaner way may be overengineering but I wanted to mention that this isn't ideal.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding to that: the non-optional filter is now in between two optional filters. Feels to me like the optional checks should go after the mandatory but maybe there are arguments for doing it the other way around.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If my suggestion from the top level comment is taken this is kind of obsolete but if not i think rather than trying to count each filter individually I would rather just log the total filtered count and which filters were active. I think that's simpler and least confusing.

}
LogDebug(BCLog::ADDRMAN, "GetAddr returned %d random addresses\n", addresses.size());
LogDebug(BCLog::ADDRMAN, "GetAddr returned %zu random addresses from %s; %zu filtered "
"(%zu network, %zu quality, %zu policy)\n",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be generally nice if the options that weren't used are also not logged.

@waketraindev
Copy link
Contributor Author

While leaving some nit-ish comments I have realized that if we want to add such a generalized concept for filtering, we should implement the current filtering with it as well. E.g. network and filtered params go away and we only have AddrPolicy as param to GetAddr. Then use AddrPolicy to implement what network and filtered were doing. That's a bigger refactor but having this current mix with this redundancy is kind of ugly and I don't really think it's worth it alone.

Thanks for the suggestion! network and filtered serve distinct, standard purposes (network filters by network, filtered disables quality filtering), while AddrPolicy is specifically for custom filters that skip entries without reducing the returned count. Using AddrPolicy also avoids coupling AddrMan to other modules (like BanMan) for filtering.

@fjahr
Copy link
Contributor

fjahr commented Dec 27, 2025

network and filtered serve distinct, standard purposes (network filters by network, filtered disables quality filtering), while AddrPolicy is specifically for custom filters

What distinguishes a "standard" from a "custom" filter? Is that documented somewhere? And if we need to add another filter that is "standard" we would add another param to the function rather than using AddrPolicy?

that skip entries without reducing the returned count

Huh? If they don't reduce the count how are they filtering anything? This is not what the code is doing, the code is filtering and reducing the returned count with it as well of course. Also demonstrated by the tests. Am I misunderstanding what you mean with count?

Using AddrPolicy also avoids coupling AddrMan to other modules (like BanMan) for filtering.

I don't see how that is an argument against using it for network and quality filters.

@waketraindev
Copy link
Contributor Author

network and filtered serve distinct, standard purposes (network filters by network, filtered disables quality filtering), while AddrPolicy is specifically for custom filters

What distinguishes a "standard" from a "custom" filter? Is that documented somewhere? And if we need to add another filter that is "standard" we would add another param to the function rather than using AddrPolicy?

By "standard" I mean filters that are intrinsic to AddrMan and operate on AddrInfo (for example network classification and quality filtering via ai.IsTerrible(now)), and whose logic does not depend on external subsystems. By "custom" I mean filters that depend on external state or policy decisions outside of AddrMan (for example BanMan), which is what AddrPolicy is intended to support.
If a future filter is intrinsic to AddrMan and depends on AddrInfo, adding it as a separate parameter would be consistent with the existing design. If it depends on external state, AddrPolicy is the appropriate mechanism.

that skip entries without reducing the returned count

Huh? If they don't reduce the count how are they filtering anything? This is not what the code is doing, the code is filtering and reducing the returned count with it as well of course. Also demonstrated by the tests. Am I misunderstanding what you mean with count?

Candidates are filtered and skipped during selection. The change here is when filtering happens. Previously, addresses were selected up to the target count and then banned or discouraged addresses were removed afterward, which caused the final CConnman::GetAddressesUnsafe result to be underfilled. With the current approach, filtering happens during selection, and the loop continues until the requested number of unfiltered addresses is reached or the address space is exhausted. So filtering still occurs, but it no longer causes CConnman::GetAddressesUnsafe to be underfilled relative to the request.

Using AddrPolicy also avoids coupling AddrMan to other modules (like BanMan) for filtering.

I don't see how that is an argument against using it for network and quality filters.

The point is not that AddrPolicy cannot express network or quality filters, but that doing so would conflate concerns and duplicate logic. Network and quality filtering are AddrMan responsibilities and require access to AddrInfo. Expressing them via AddrPolicy would either broaden the policy contract or require reimplementing intrinsic AddrMan logic in call sites, without addressing a concrete issue in this PR. This change keeps the PR scoped to fixing underfilling and avoids introducing additional coupling or refactors.


To keep this PR focused: its purpose is to fix the underfilling issue by moving ban and discourage filtering into the selection phase. The current design achieves that with minimal surface area change and without refactoring existing AddrMan responsibilities or introducing new abstractions. Broader refactors (for example unifying all filters behind AddrPolicy or introducing builders/factories) can be explored separately if there is concrete motivation, but are out of scope for this change. Unless there are functional issues with the current approach, I intend to keep the implementation as is.

Copy link
Contributor

@fjahr fjahr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By "standard" I mean filters that are intrinsic to AddrMan and operate on AddrInfo (for example network classification and quality filtering via ai.IsTerrible(now)), and whose logic does not depend on external subsystems. By "custom" I mean filters that depend on external state or policy decisions outside of AddrMan (for example BanMan), which is what AddrPolicy is intended to support.
If a future filter is intrinsic to AddrMan and depends on AddrInfo, adding it as a separate parameter would be consistent with the existing design. If it depends on external state, AddrPolicy is the appropriate mechanism.

The docs don't mention any of this but instead the naming AddrPolicy and the docs suggest that it's a generalized concept and could be used for any kind of filtering. I still think there is no reason to restrict usage of AddrPolicy in this way when it appears to be possible to just do both.

To keep this PR focused: its purpose is to fix the underfilling issue by moving ban and discourage filtering into the selection phase. The current design achieves that with minimal surface area change and without refactoring existing AddrMan responsibilities or introducing new abstractions.

If preventing underfilling is the only goal of this PR I would suggest a simpler approach: For example request more addresses than max_addresses and then trim at the end. Could be done smartly with using m_banned.size() to make sure we only overshoot as much as necessary. But simplest would be to just request with max_addresses = 0. The current approach doesn't seem minimal and if there is no plan to use it more broadly it seems like overengineering. At least the conceptual limitations and "policy contract" that you are describing would need to be documented clearly in the code as well.

@waketraindev waketraindev force-pushed the 2025-10-getaddressesunsafe-underfill branch from fb3efdb to 33e7b66 Compare December 28, 2025 13:16
@waketraindev
Copy link
Contributor Author

added fuzz coverage, fixed identation, removed hints, ordered includes, cleared nits

CallOneOf(
fuzzed_data_provider,
[&] {
(void)const_addr_man.GetAddr(max_addresses, max_pct, network, filtered);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would you call the same line as above again? This line is completely redundant.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oopsie

Add AddrMan::AddrPolicy, a predicate returning true to exclude an address
during selection. This allows callers to apply custom filtering directly in
AddrMan rather than post-processing results.

Use the new policy in CConnman::GetAddressesUnsafe to exclude banned and
discouraged peers while filling address requests. This fixes underfilled
results observed when banned peers were removed after selection, causing
P2P responses from `GETADDR` to return fewer entries than requested.

Also log the number of filtered entries for debugging and diagnostics.
@fjahr
Copy link
Contributor

fjahr commented Dec 28, 2025

It appears you aren't interested in addressing the rest of my comments @waketraindev so I have implemented my suggested approach in #34162 instead. I prefer this approach due to its simplicity, as stated above. It fixes the overshooting problem in 2 lines and adds a functional test case for the behavior change. I will leave it to the other reviewers to decide which approach they prefer.

@Bicaru20
Copy link

I think @fjahr has a good point. If the main goal is simply to always return 1000 addresses in GetAddr, introducing a predicate feels like adding unnecessary complexity. I believe his solution is cleaner.

While leaving some nit-ish comments I have realized that if we want to add such a generalized concept for filtering, we should implement the current filtering with it as well. E.g. network and filtered params go away and we only have AddrPolicy as param to GetAddr. Then use AddrPolicy to implement what network and filtered were doing. That's a bigger refactor but having this current mix with this redundancy is kind of ugly and I don't really think it's worth it alone.

I agree. If a generalized filtering concept is added, it should replace the current filtering mechanism. This would make it easier to add new filters in the future. However, for the problem trying to be solved, I think it’s complicating things too much.

@waketraindev
Copy link
Contributor Author

I think @fjahr has a good point. If the main goal is simply to always return 1000 addresses in GetAddr, introducing a predicate feels like adding unnecessary complexity. I believe his solution is cleaner.

Regarding the "over-engineering" concerns, the lambda(the predicate) here is inline, concise, and used only once.
It encapsulates a simple banned/discouraged filter applied during selection to ensure max_addresses are correctly returned.
This approach is both functionally necessary and efficient, and does not introduce unnecessary complexity, so I do not consider it over-engineering.

On the other hand, introducing a Builder to craft the policy, include other "generalized" filters, and export AddrInfo would be over-engineering in this context.

The #34162 approach is highly inefficient for a database of 70k+ records, with on-demand calls through RPC with a max_addresses limit, and whitelisted nodes. It is insufficient for my use cases.

Though both performance and simplicity are in the eyes of the beholder ig.

Thanks for your feedback!

@fjahr
Copy link
Contributor

fjahr commented Jan 5, 2026

The #34162 approach is highly inefficient for a database of 70k+ records, with on-demand calls through RPC with a max_addresses limit, and whitelisted nodes. It is insufficient for my use cases.

If you could be more explicit with what your use-case is then maybe you could convince more people that your approach is really necessary, even when you refuse to use it more broadly. It comes down to if core wants to support your use case bad enough to take on the extra code to maintain, I guess. You have a lot of addresses, ok! But I don't understand how whitelists play into this. Can you elaborate on that? And why would you do frequent calls to getnodeaddresses with a max_addresses limit, couldn't you just increase the limit?

I did some rough high-level benchmarking using the functional test from my PR (code). Due to the bucket collisions I couldn't get beyond 64k addresses, but I guess this is close enough and I didn't try to get to an even higher number. I am using an arbitrary 1k banned addresses and 1k max limit.

This PR (#33663) has the best performance, not surprising:

2025-12-30T14:34:41.400316Z TestFramework (INFO): Test that getnodeaddresses returns max count even with banned addresses
2025-12-30T14:34:41.770049Z TestFramework (INFO): Adding addresses to addrman...
2025-12-30T14:36:31.483147Z TestFramework (INFO): Successfully added 62636 addresses
2025-12-30T14:36:31.483466Z TestFramework (INFO): Banning 1000 addresses...
2025-12-30T14:36:32.511395Z TestFramework (INFO): Requesting 1000 addresses...
2025-12-30T14:36:32.523990Z TestFramework (INFO): getnodeaddresses returned 1000 addresses in 0.013s

My alternative approach (#34162) is much slower, also not surprising, but at 0.5s it doesn't seem like this would break any workflows I can imagine:

2025-12-30T14:18:12.848562Z TestFramework (INFO): Test that getnodeaddresses returns max count even with banned addresses
2025-12-30T14:18:13.218102Z TestFramework (INFO): Adding addresses to addrman...
2025-12-30T14:19:58.335032Z TestFramework (INFO): Successfully added 63471 addresses
2025-12-30T14:19:58.335895Z TestFramework (INFO): Banning 1000 addresses...
2025-12-30T14:19:59.381759Z TestFramework (INFO): Requesting 1000 addresses...
2025-12-30T14:19:59.940406Z TestFramework (INFO): getnodeaddresses returned 1000 addresses in 0.559s

As I had said previously, my suggest PR could be tweaked to get a middle ground between performance and simplicity, like for example fetch not all but only 2x max limit addresses to get to comparable performance and this should really work in 99% percent of the cases I can imagine. But sure, it's not as clean as #34162 but performance is comparable to this PR ( demo code):

2025-12-30T14:52:23.292663Z TestFramework (INFO): Test that getnodeaddresses returns max count even with banned addresses
2025-12-30T14:52:23.663001Z TestFramework (INFO): Adding addresses to addrman...
2025-12-30T14:54:09.192166Z TestFramework (INFO): Successfully added 63395 addresses
2025-12-30T14:54:09.193381Z TestFramework (INFO): Banning 1000 addresses...
2025-12-30T14:54:10.218400Z TestFramework (INFO): Requesting 1000 addresses...
2025-12-30T14:54:10.238231Z TestFramework (INFO): getnodeaddresses returned 1000 addresses in 0.020s
2025-12-30T14:54:11.333769Z TestFramework (INFO): Stopping nodes

I also learned from playing around with the numbers that the total number of addresses only has minor impact on the performance. The big performance impact is coming from the number of banned addresses. So maybe that's where the discussion on performance (aside from the goals) should focus. I am not sure I have a good grasp on what % of addresses are realistically banned or discouraged. @0xB10C maybe you can weigh in?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants