Skip to content

test: let connections happen in any order in p2p_private_broadcast.py#34410

Merged
achow101 merged 3 commits intobitcoin:masterfrom
vasild:test_private_broadcast_net_specific_connection
Mar 2, 2026
Merged

test: let connections happen in any order in p2p_private_broadcast.py#34410
achow101 merged 3 commits intobitcoin:masterfrom
vasild:test_private_broadcast_net_specific_connection

Conversation

@vasild
Copy link
Contributor

@vasild vasild commented Jan 26, 2026

If the following two events happen:

  • (likely) the automatic 10 initial connections are not made to all
    networks
  • (unlikely) the network-specific logic kicks in almost immediately.
    It is using exponential distribution with a mean of 5 minutes
    (rng.rand_exp_duration(EXTRA_NETWORK_PEER_INTERVAL)).

So if both happen, then the 11th connection may not be the expected
private broadcast, but a network-specific connection.

Fix this by retrieving the connection type from
destinations_factory(). This is more flexible because it allows
connections to happen in any order and does not break if e.g. the 11th
connection is not the expected first private broadcast.

This also makes the test run faster:
before: 19-44 sec
now: 10-25 sec
because for example there is no need to wait for the initial 10
automatic outbound connections to be made in order to proceed.

Fixes: #34387

@DrahtBot DrahtBot added the Tests label Jan 26, 2026
@DrahtBot
Copy link
Contributor

DrahtBot commented Jan 26, 2026

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage & Benchmarks

For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/34410.

Reviews

See the guideline for information on the review process.

Type Reviewers
ACK achow101, andrewtoth, mzumsande
Stale ACK w0xlt, Bicaru20

If your review is incorrectly listed, please copy-paste <!--meta-tag:bot-skip--> into the comment that the bot should ignore.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #34457 (wallet: add private broadcast support for wallet transactions by w0xlt)
  • #33954 (test: add functional test for outbound connection management by mzumsande)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

LLM Linter (✨ experimental)

Possible typos and grammar issues:

  • "to to_addr:to_port" -> "to to_addr:to_port" -> "to connect to to_addr:to_port" [The phrase "attempt to to_addr" has a duplicated "to" and is unclear; "attempt to connect to to_addr:to_port" clarifies the meaning]

Possible places where named args for integral literals may be used (e.g. func(x, /*named_arg=*/0) in C++, and func(x, named_arg=0) in Python):

  • [check_broadcasts("Basic", txs[0], NUM_PRIVATE_BROADCAST_PER_TX - 1, 0)] in test/functional/p2p_private_broadcast.py

2026-02-23 13:00:15

@maflcko maflcko added this to the 31.0 milestone Jan 26, 2026
@fanquake
Copy link
Member

Looks like this causes the test to timeout in most CIs. I see the same failure locally.

@vasild vasild force-pushed the test_private_broadcast_net_specific_connection branch from 0175769 to 38c0c89 Compare January 27, 2026 13:41
@vasild
Copy link
Contributor Author

vasild commented Jan 27, 2026

01757699f3e7a03c32289de4aaee6d04cd2de921...38c0c8900af3a88c9a0c0e79df9fb20511f4bfa2: fixup, my counting was flawed - manual connections count toward network specific connections, but not for automatic connections. I wonder if this can be made better, by deciding on the fly where to redirect each connection...

@vasild vasild force-pushed the test_private_broadcast_net_specific_connection branch from 38c0c89 to 25ee8bd Compare January 30, 2026 13:22
@vasild vasild changed the title test: ensure p2p_private_broadcast.py has connections to all nets test: let connections happen in any order in p2p_private_broadcast.py Jan 30, 2026
@vasild
Copy link
Contributor Author

vasild commented Jan 30, 2026

... I wonder if this can be made better, by deciding on the fly where to redirect each connection...

Yes:

38c0c8900af3a88c9a0c0e79df9fb20511f4bfa2...25ee8bd7f3973ddadbd484a132ffe1f1b5925e57: change the approach to let connections happen in any order, is more robust and faster.

@vasild vasild force-pushed the test_private_broadcast_net_specific_connection branch from 25ee8bd to 4e12c91 Compare January 30, 2026 13:26
@vasild
Copy link
Contributor Author

vasild commented Jan 30, 2026

25ee8bd7f3973ddadbd484a132ffe1f1b5925e57...4e12c91a581d646b15c3c825cbc9cc0a9632399c: pet python linter

@DrahtBot
Copy link
Contributor

🚧 At least one of the CI tasks failed.
Task lint: https://github.com/bitcoin/bitcoin/actions/runs/21517256715/job/61998660210
LLM reason (✨ experimental): CI failed due to lint errors (ruff E711: comparisons to None) in Python code.

Hints

Try to run the tests locally, according to the documentation. However, a CI failure may still
happen due to a number of reasons, for example:

  • Possibly due to a silent merge conflict (the changes in this pull request being
    incompatible with the current code in the target branch). If so, make sure to rebase on the latest
    commit of the target branch.

  • A sanitizer issue, which can only be found by compiling with the sanitizer and running the
    affected test.

  • An intermittent issue.

Leave a comment here, if you need help tracking down a confusing failure.

@vasild vasild force-pushed the test_private_broadcast_net_specific_connection branch from 4e12c91 to e28c3bd Compare February 2, 2026 08:16
@vasild
Copy link
Contributor Author

vasild commented Feb 2, 2026

4e12c91a581d646b15c3c825cbc9cc0a9632399c...e28c3bd45f514974bdf411e48224fe7f1e079080: magic spell to tame the QA gods

@DrahtBot DrahtBot removed the CI failed label Feb 2, 2026
@vasild vasild mentioned this pull request Feb 2, 2026
13 tasks
Copy link

@Bicaru20 Bicaru20 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK. I left some comments to try to improve the code.

@vasild vasild force-pushed the test_private_broadcast_net_specific_connection branch from e28c3bd to fd7afbe Compare February 4, 2026 05:41
@vasild
Copy link
Contributor Author

vasild commented Feb 4, 2026

e28c3bd45f514974bdf411e48224fe7f1e079080...fd7afbe2abaf7fb8bf32450008fd4dfd18967ed8: address suggestions

Copy link
Contributor

@w0xlt w0xlt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK fd7afbe

@DrahtBot DrahtBot requested a review from Bicaru20 February 4, 2026 20:51
@Bicaru20
Copy link

Code review ACK fd7afbe. I tested and reviewed the code and everything looks good.

If the following two events happen:

* (likely) the automatic 10 initial connections are not made to all
  networks
* (unlikely) the network-specific logic kicks in almost immediately.
  It is using exponential distribution with a mean of 5 minutes
  (`rng.rand_exp_duration(EXTRA_NETWORK_PEER_INTERVAL)`).

So if both happen, then the 11th connection may not be the expected
private broadcast, but a network-specific connection.

Fix this by retrieving the connection type from
`destinations_factory()`. This is more flexible because it allows
connections to happen in any order and does not break if e.g. the 11th
connection is not the expected first private broadcast.

This also makes the test run faster:
before: 19-44 sec
now: 10-25 sec
because for example there is no need to wait for the initial 10
automatic outbound connections to be made in order to proceed.

Fixes: bitcoin#34387
I2P addresses must use port=0, otherwise `bitcoind` refuses to connect.

The test `p2p_private_broadcast.py` cannot simulate connections to I2P
peers, so the I2P proxy is set to a dummy `127.0.0.1:1`. Still it is
good to pick I2P addresses and attempt connections to increase coverage.

However the test uses port=8333 for I2P addresses and thus the
connection attempts fail for the "wrong" reason:

```
Error connecting to ...i2p:8333, connection refused due to arbitrary port 8333
```

Using the proper port=0 makes the failures:

```
Error connecting to ...i2p:0: Cannot connect to 127.0.0.1:1
```

which will make possible simulated I2P connections once we have a test
I2P proxy.
@vasild vasild force-pushed the test_private_broadcast_net_specific_connection branch from fd7afbe to da7f70a Compare February 23, 2026 12:59
@vasild
Copy link
Contributor Author

vasild commented Feb 23, 2026

fd7afbe2abaf7fb8bf32450008fd4dfd18967ed8...da7f70a5322843b70f29456a8bc2227209a0718b: make set_tx_returner_and_other() simpler and more robust addressing a comment by @willcl-ark (I can't find that comment here anymore, it vanished?).

@sedited sedited requested review from andrewtoth and removed request for Bicaru20 February 23, 2026 13:02
@vasild
Copy link
Contributor Author

vasild commented Feb 23, 2026

CI failure AssertionError: ... not({} == {'ping': 1, 'tx': 1}) fixed in #34646

@achow101
Copy link
Member

ACK da7f70a

@DrahtBot DrahtBot requested review from Bicaru20 and w0xlt February 23, 2026 21:39
Copy link
Contributor

@andrewtoth andrewtoth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK da7f70a

I must say the connection handling logic is quite complex. Not sure I have a concrete suggestion. Having the first private broadcast connection be "special" and be redirected to node1 and it having its own peer is confusing. Perhaps this case can be split into 2 tests, where all 3 private broadcast connections are "special" and have their own far observer peers and then the other test that just checks broadcast info.

@vasild
Copy link
Contributor Author

vasild commented Feb 24, 2026

... Perhaps this case can be split into 2 tests ...

p2p_private_broadcast.py has become 477 lines. It is good to have coverage and cover edge cases, but at the same time all that better be manageable and the test easy to debug. Maybe at some point it would make sense to split it in smaller independent tests. They are independent right now but all of them are executed one after another and when something in the middle fails one has to judge what happened with all connections and transactions beforehand.

@maflcko
Copy link
Member

maflcko commented Feb 24, 2026

If the test cases are meant to be independent, an alternative could also be to reset the state between each test case. For example, by restarting the nodes, clearing the mempool/wallet/etc

@maflcko
Copy link
Member

maflcko commented Feb 24, 2026

re-run ci

@maflcko maflcko closed this Feb 24, 2026
@maflcko maflcko reopened this Feb 24, 2026
@andrewtoth
Copy link
Contributor

all of them are executed one after another

Partially my fault for adding the rpc tests in there. That could have been split out independently I think.

Copy link
Contributor

@mzumsande mzumsande left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review ACK da7f70a

In general, I don't like for tests to rely on the debug log of the node to make decisions, but I don't see a better way in this case, since the outbound connection logic makes various types of connections on different timers.

Note that instead of a network-specific connection (as explained in the OP), a feeler connection is being made in some of the failed runs linked in the issue, which is also fixed by this.

@achow101 achow101 merged commit 6b0a980 into bitcoin:master Mar 2, 2026
51 of 52 checks passed
@vasild vasild deleted the test_private_broadcast_net_specific_connection branch March 3, 2026 04:59
@vasild
Copy link
Contributor Author

vasild commented Mar 3, 2026

Just an idea: an alternative to checking the debug log for early connection attempts is to change the getpeerinfo RPC to include those attempts. For this, some of the fields which are unknown that early would have to be omitted, e.g. services and added later when the connection is complete.

@fanquake
Copy link
Member

fanquake commented Mar 3, 2026

In general, I don't like for tests to rely on the debug log of the node to make decisions,

@mzumsande yea. Changing production code, to regex debug logs, doesn't seem robust.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

qa: failure in p2p_private_broadcast.py

9 participants