Skip to content

test: add functional test for outbound connection management#33954

Draft
mzumsande wants to merge 3 commits intobitcoin:masterfrom
mzumsande:202511_test_outbound_mgmt
Draft

test: add functional test for outbound connection management#33954
mzumsande wants to merge 3 commits intobitcoin:masterfrom
mzumsande:202511_test_outbound_mgmt

Conversation

@mzumsande
Copy link
Contributor

@mzumsande mzumsande commented Nov 26, 2025

For #29415, vasild added the possibility to have outbound connections made "naturally" by the node using addrman entries, by redirecting them via a SOCKS5 proxy to python p2ps or other full nodes, so that the node under test behaves as if it is connected to normal non-local peers from arbitrary networks, while we have full control over the peers.

I think this feature is really cool and want to suggest to integrate it more prominently into the test framework ("auto_outbound_mode") in the first commit because it allows to test the code in which the node decides to make connections - in contrast to the existing add_outbound_p2p_connection where the functional test intiates the connections.

The second commit uses it to add functional test coverage for the following features of outbound connection management:

This should work well with the test from #29415 (which would keep its own destinations_factory but can call _create_auto_outbound_listener to avoid much of code duplication).

Other possible uses for auto_outbound_mode I could think of are more detailed testing of anchors (feature_anchors.py), feeler connections, or any other p2p logic that uses the network of peers.

@DrahtBot DrahtBot added the Tests label Nov 26, 2025
@DrahtBot
Copy link
Contributor

DrahtBot commented Nov 26, 2025

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage & Benchmarks

For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/33954.

Reviews

See the guideline for information on the review process.

Type Reviewers
ACK vasild

If your review is incorrectly listed, please copy-paste <!--meta-tag:bot-skip--> into the comment that the bot should ignore.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #34457 (wallet: add private broadcast support for wallet transactions by w0xlt)
  • #34410 (test: let connections happen in any order in p2p_private_broadcast.py by vasild)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@mzumsande mzumsande force-pushed the 202511_test_outbound_mgmt branch from 8005a48 to 1158564 Compare November 26, 2025 19:11
@mzumsande mzumsande marked this pull request as ready for review November 26, 2025 20:14
@vasild
Copy link
Contributor

vasild commented Nov 27, 2025

Concept ACK, thanks!

@mzumsande mzumsande marked this pull request as draft November 28, 2025 22:07
@mzumsande
Copy link
Contributor Author

there are still some intermittent timeouts, will put in draft until I've fixed them.

@mzumsande mzumsande force-pushed the 202511_test_outbound_mgmt branch from 1158564 to b961ad4 Compare December 11, 2025 21:00
@mzumsande mzumsande force-pushed the 202511_test_outbound_mgmt branch 3 times, most recently from 53457d8 to 0cb0857 Compare January 2, 2026 00:17
Copy link
Contributor

@vasild vasild left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 0cb0857

I ran test_runner.py and p2p_outbound_management.py a bunch of times locally and did not observe timeouts.

Comment on lines +451 to +456
if self.auto_outbound_factory:
actual_to_addr, actual_to_port, listener = self.auto_outbound_factory(i)
else:
# Create listener with default P2PInterface
listener = P2PInterface()
actual_to_addr, actual_to_port = self._create_auto_outbound_listener(listener)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe get auto_outbound_factory() and _create_auto_outbound_listener() to have the same interface? That is, change _create_auto_outbound_listener() to also return the listener? Then this will be more consistent:

                if self.auto_outbound_factory:
                    actual_to_addr, actual_to_port, listener = self.auto_outbound_factory(i)
                else:
                    actual_to_addr, actual_to_port, listener = self._create_auto_outbound_listener(listener)

and then we can have one function (pointer) that either points to the default one or to the user-provided one and here just:

actual_to_addr, actual_to_port, listener = self.auto_outbound_factory(i)

without an if condition. That would mean to also rename _create_auto_outbound_listener() to auto_outbound_factory() and remove the line self.auto_outbound_factory = None from BitcoinTestFramework::__init__().

Feel free to ignore if you do not see this as an improvement.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like that suggestion - took it!

socks5_config = self._setup_auto_outbound_mode()
for i in range(num_nodes):
extra_args[i] = extra_args[i] + [
f"-proxy={socks5_config.addr[0]}:{socks5_config.addr[1]}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this needs format_addr_port() as well, in case the address happens to be IPv6 and needs to be enclosed in []?

Copy link
Contributor Author

@mzumsande mzumsande Feb 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The config address is hardcoded in _setup_auto_outbound_mode to 127.0.0.1, so IPv6 shouldn't be a concern I think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, only if changed in the future. Or if somebody is playing with this locally and changes it to fir their environment. An address like ::1 would also upset this because it would result in -proxy=::1:12345 which I guess will result in "syntax" error.

Comment on lines +71 to +75
def disconnect_last_block_relay_peer(self):
"""Find and disconnect the most recently connected block-relay-only peer.
Done manually instead of waiting for the node to disconnect automatically to
avoid rare interactions with the scheduling of the next extra block-relay-only peer"""
node = self.nodes[0]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The other helper functions
has_regular_outbounds(),
has_extra_block_relay_peer() and
has_extra_full_outbound_peer()
take a node argument. Maybe add such argument also to disconnect_last_block_relay_peer() for consistency?

PS1 same for wait_for_network_specific_peer()
PS2 all of those end up using self.nodes[0], so another option is to drop the node argument from all.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I decided to drop the node arg everywhere.

Comment on lines +92 to +95
self.log.info("Advance time")
self.cur_time += self.MOCKTIME_BUMP
node.setmocktime(self.cur_time)
self.generate(node, 1) # Disable stale-tip extra-outbound connections that could interfer
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there an unlikely scenario that a stale-tip extra-outbound connection is initiated after advancing the mocktime and before self.generate(node, 1)? And eventually causing the interference that this is trying to disable?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think probably not in practice - CheckForStaleTipAndEvictPeers fires every 45 seconds (on a steady clock, not influenced by the mocktime bump). So I think this means it would only fire if 45 real seconds passed since the last mockscheduler call, which seems very unlikely.

Comment on lines +83 to +89
def wait_for_network_specific_peer(self):
"""Advance time, wait for a network-specific full outbound peer, and trigger eviction.

Extra block-relay-only connections are prioritized over network-specific ones, but defensively:
if an already connected tor or cjdns address is picked, there is no retry. This can cause either
type of peer to arrive first.
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible that the node already has at least one connection to all reachable networks when this function starts? Would that result in timeouts due to has_extra_full_outbound_peer(node) never returning True?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To avoid this, the test first adds 100 IPv4 addresses to addrman, then waits for the regular otubound connections (setup_peers), and only then the onion and cjdns addresses are added to addrman. So I think it shouldn't be possible.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about the tor and cjdns connections being established after adding to addrman and before wait_for_network_specific_peer() is called?

This new option allows tests to make use of automatic outbound connections
(made from addrman).
A SOCKS5 proxy is used to redirect the connection, by default to a P2PInterface

For advanced customization, tests can provide auto_outbound_factory()
to overwrite where each outbound connection made by the node should be
redirected to (for example to another node)

Co-authored-by: Vasil Dimov <vd@FreeBSD.org>
@mzumsande mzumsande force-pushed the 202511_test_outbound_mgmt branch from 1562496 to b5e9e38 Compare February 24, 2026 16:17
Using the test framework avoids some duplicated code.
The test covers
- Extra outbound connections when the tip is stale
- Regularly scheduled extra block-relay-only connections
- Management of full outbound connection slots with respect to network-specific peers
@mzumsande mzumsande force-pushed the 202511_test_outbound_mgmt branch from b5e9e38 to c733d11 Compare February 24, 2026 17:25
Copy link
Contributor Author

@mzumsande mzumsande left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review @vasild!
Rebased, addressed feedback, and also added a commit that uses the test-framework auto-outbound mode for p2p_private_broadcast.py, removing some duplicate code.

I just had a rare (?) local intermittent failure of p2p_private_broadcast.py that I need to look into, so I will keep it in draft a little bit longer.

Copy link
Contributor

@vasild vasild left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK c733d11

I just had a rare (?) local intermittent failure of p2p_private_broadcast.py

Might be related to #34410 or #34646

Comment on lines +452 to +457
def destinations_factory(requested_to_addr, requested_to_port):
with self.auto_outbound_destinations_lock:
i = len(self.auto_outbound_destinations)

actual_to_addr, actual_to_port, listener = self.auto_outbound_factory(i)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The order in which bitcoind makes connections is a bit undeterministic and also which connection is it going to make is not 100% deterministic either (e.g. is the next connection going to be full-outbound or feeler or net-specific?). #34410 addresses that by not relying only on the connection index but looking up the requested-to address and port in the debug log in order to figure out what the connection is.

That said, auto_outbound_factory() might benefit from getting more information about the connection, in addition to its index, like the requested address and port:

- actual_to_addr, actual_to_port, listener = self.auto_outbound_factory(i)
+ actual_to_addr, actual_to_port, listener = self.auto_outbound_factory(i, requested_to_addr, requested_to_port)

@DrahtBot
Copy link
Contributor

DrahtBot commented Mar 2, 2026

🐙 This pull request conflicts with the target branch and needs rebase.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants