p2p: protect manual evictions by willcl-ark · Pull Request #34743 · bitcoin/bitcoin

willcl-ark · 2026-03-05T13:28:47Z

Closes: #5097

Manual peers (-addnode, -connect + RPC) currently get disconnected by IBD timeout logic for block stalling, block download timeout and headers sync timeout. This can conflict with explicit user intent to keep these peers connected (especially in privacy-focused -connect setups), and causes unnecessary disconnect/reconnect churn.

This PR changes that behavior: manual peers are not disconnected by those timeout paths and we try to re-assign their work to other peers to maintain progress.

Disconnecting manual peers automatically causes reconnect loops and other behavior that can violate user expectation for manually configured peers.

Simply “not disconnecting” (for block stalling) is not sufficient by itself: a manual peer can retain in-flight block assignments and delay overall IBD progress.

I considered taking an approach more like that from #32051, simply protecting peers from some disconnects. However without the reassignment the stalling effects are simply prolonged (which is worse, IMO).

This PR uses a different appraoch:

For manual peers in block-stalling/download-timeout paths:
- release all in-flight blocks from that peer,
- set a one-round recovery flag (m_stall_recovery) so scheduler ordering doesn’t immediately reassign those same blocks back to the same peer.

This attempts to both preserve manual connections and maintain download progress. It actively reassigns stalled work so IBD can continue, rather than just waiting longer. And, the one-round skip prevents immediate re-capture of freed blocks by the same manual peer due to per-peer send ordering.

Compared with simpler “don’t disconnect” approaches, this is more robust both in tests, and real-world slow-peer conditions.

I believe the changes do not increase remote attacker control because it only applies to manual peers (-addnode, -connect, addnode), which are explicitly chosen by the local operator. You were and still are free to add a malicious peer manually.

A malicious manually-added peer can now remain connected longer, but that peer was already granted privileged trust by operator configuration. And in fact, for block stalling/download timeout this PR reduces potential impact of a slow/malicious manual peer by releasing its in-flight blocks so other peers can fetch them.

Of course, if an operator relies only on bad manual peers, sync can still be delayed.

DrahtBot · 2026-03-05T13:29:32Z

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Reviews

See the guideline for information on the review process.

Type	Reviewers
Concept ACK	w0xlt, sedited, kevkevinpal, brunoerg

If your review is incorrectly listed, please copy-paste  into the comment that the bot should ignore.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#34588 (refactor: decompose Peer struct into focused sub-components by w0xlt)
#34565 (refactor: extract BlockDownloadManager from PeerManagerImpl by w0xlt)
#34389 (net/log: standardize peer+addr log formatting via LogPeer by l0rinc)
#33854 (fix assumevalid is ignored during reindex by Eunovo)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

LLM Linter (✨ experimental)

Possible typos and grammar issues:

nodes added using addnode line -> "are not required to be full nodes or support SegWit" [clarifies the intended meaning; "full nodes/support SegWit" is ambiguous/grammatically incorrect — should use "or" between the two requirements]

^{2026-03-11 19:30:05}

w0xlt · 2026-03-05T15:30:16Z

Concept ACK

test/functional/test_framework/test_node.py

jonatack · 2026-03-06T02:51:03Z

Thanks for picking this up, Will.

sedited · 2026-03-06T07:35:00Z

Concept ACK

kevkevinpal · 2026-03-06T15:45:15Z

Concept ACK

brunoerg · 2026-03-09T14:20:56Z

Concept ACK

Manual connections (-addnode, -connect) represent explicit user intent to maintain the connection, and are important for privacy in -connect-only setups. Yet they get disconnected during IBD when the block download window stalls. Instead of disconnecting, release all in-flight blocks from the manual peer so other peers can request them. Set m_stall_recovery to skip one round of block download for this peer. Without this, the message handler's random peer ordering can cause the manual peer's next SendMessages call to re-request the same blocks via FindNextBlocksToDownload before any other peer has a chance to claim them, permanently stalling IBD when mocktime is frozen (and adding unnecessary delay with real time). This flakiness was observed first in functional tests with few peers (intermittent failures), but would also occasionalyl happen with more peers in a non-test scenario.

Add a helper for creating manual (addnode onetry) p2p connections, mirroring the existing add_outbound_p2p_connection method. This will be used by tests that verify manual peer behavior during IBD.

test that a manual (addnode) peer which stalls on a block, is not disconnected when the stalling timeout fires.

Manual connections (-addnode, -connect) represent explicit user intent to maintain the connection, and are important for privacy in -connect-only setups. Yet they get disconnected during IBD when a block download times out. Apply the same release-and-skip strategy as the stalling path: release all in-flight blocks and set m_stall_recovery so other peers get a chance to pick them up.

Test the download timeout path in isolation by using only a manual peer initially (no outbound peers). Without outbound peers, no staller is identified, so only the download timeout fires (after 600s of mocktime). Verify non-disconnection, then add outbound peers and advance time past the stalling timeout to verify IBD completes.

During IBD, a slow headers sync peer gets disconnected if other preferred peers are available. The existing code already protects NoBan peers by resetting their sync state instead of disconnecting. Extend this protection to manual peers, which represent explicit user intent to maintain the connection. Like NoBan peers, reset their sync state so headers can be fetched from another peer without dropping the manual connection.

test that a manually added peer which is used as initial sync peer is not disconnected when the headers download timeout fires.

willcl-ark changed the title ~~Protect manual evictions~~ p2p: protect manual evictions Mar 5, 2026

DrahtBot added the P2P label Mar 5, 2026

kevkevinpal reviewed Mar 5, 2026

View reviewed changes

test/functional/test_framework/test_node.py Outdated Show resolved Hide resolved

willcl-ark force-pushed the protect-manual-evictions branch from 547f119 to 7beac82 Compare March 6, 2026 22:37

DrahtBot added the Needs rebase label Mar 11, 2026

willcl-ark added 8 commits March 11, 2026 19:28

test: add add_manual_p2p_connection helper to TestNode

15422c7

Add a helper for creating manual (addnode onetry) p2p connections, mirroring the existing add_outbound_p2p_connection method. This will be used by tests that verify manual peer behavior during IBD.

test: verify manual peers survive block stalling timeout

9270818

test that a manual (addnode) peer which stalls on a block, is not disconnected when the stalling timeout fires.

test: verify manual peers survive headers sync timeout

2928cad

test that a manually added peer which is used as initial sync peer is not disconnected when the headers download timeout fires.

doc: update addnode rpc help and add release note

accb5e4

willcl-ark force-pushed the protect-manual-evictions branch from 7beac82 to accb5e4 Compare March 11, 2026 19:29

DrahtBot removed the Needs rebase label Mar 11, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

p2p: protect manual evictions#34743

p2p: protect manual evictions#34743
willcl-ark wants to merge 8 commits intobitcoin:masterfrom
willcl-ark:protect-manual-evictions

willcl-ark commented Mar 5, 2026

Uh oh!

DrahtBot commented Mar 5, 2026 •

edited

Loading

Uh oh!

w0xlt commented Mar 5, 2026

Uh oh!

Uh oh!

jonatack commented Mar 6, 2026

Uh oh!

sedited commented Mar 6, 2026

Uh oh!

kevkevinpal commented Mar 6, 2026

Uh oh!

brunoerg commented Mar 9, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

7 participants

Conversation

willcl-ark commented Mar 5, 2026

Uh oh!

DrahtBot commented Mar 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews

Conflicts

LLM Linter (✨ experimental)

Uh oh!

w0xlt commented Mar 5, 2026

Uh oh!

Uh oh!

jonatack commented Mar 6, 2026

Uh oh!

sedited commented Mar 6, 2026

Uh oh!

kevkevinpal commented Mar 6, 2026

Uh oh!

brunoerg commented Mar 9, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

7 participants

DrahtBot commented Mar 5, 2026 •

edited

Loading