Skip to content

Fix race conditions on disconnects#250

Closed
Sjors wants to merge 6 commits intobitcoin-core:masterfrom
Sjors:2026/03/races
Closed

Fix race conditions on disconnects#250
Sjors wants to merge 6 commits intobitcoin-core:masterfrom
Sjors:2026/03/races

Conversation

@DrahtBot
Copy link

DrahtBot commented Mar 11, 2026

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Reviews

See the guideline for information on the review process.
A summary of reviews will appear here.

@Sjors
Copy link
Member Author

Sjors commented Mar 11, 2026

I made some adjustments to the tests in #249:

  • drop the NDEBUG guard
  • (hopefully) fix the sanitizer issue by saving testing_hook_after_cleanup to a local

Opened a fresh PR for a full CI run. @ryanofsky feel free to take the next turn and push to your PR, or I can address remaining issues here.

@Sjors
Copy link
Member Author

Sjors commented Mar 11, 2026

  • dropped # PR number from the (test) commit messages, since those link to this repo and the commit message already has the full link

The LLM came up with a third test, but beyond reproducing the original issue, it doesn't seem very useful. So I omitted it in this PR. See Sjors@7c59dac

@ryanofsky
Copy link
Collaborator

Thanks for updating this! I opened bitcoin/bitcoin#34804 to see how this interacts with bitcoin core CI.

(I wonder if there is a way we automatically run bitcoin core CI on every libmultiprocess PR, or run it more easily. Might be something to look into).

The LLM came up with a third test, but beyond reproducing the original issue, it doesn't seem very useful. So I omitted it in this PR. See Sjors@7c59dac

Didn't look very closely but the test seems clean and I'd be inclined to add it since I already know I want to follow up this PR by duplicating the disconnect tests and maybe reviving my earlier attempt. If the test turns out not to be meaningful or is a maintenance burden we can remove it.

Anyway I plan to review this soon

@maflcko
Copy link
Contributor

maflcko commented Mar 11, 2026

(I wonder if there is a way we automatically run bitcoin core CI on every libmultiprocess PR, or run it more easily. Might be something to look into).

I think you can just copy-paste this yaml:

https://github.com/bitcoin-core/qa-assets/blob/main/.github/workflows/ci.yml#L1

(The qa-assets checkout can be removed)

@Sjors
Copy link
Member Author

Sjors commented Mar 11, 2026

I pushed the third test.

(oh oops, let me check the CI error...)

Add testing_hook_makethread that fires on the worker thread between
the Lock and set_value operations in makeThread. A checker thread uses
try_lock to verify the waiter mutex is held at that point. With the
fix (Lock before set_value) the mutex is held, so try_lock fails.
Without the fix (set_value before Lock) the hook fires before Lock, so
try_lock succeeds -- indicating the race window exists.
@Sjors
Copy link
Member Author

Sjors commented Mar 11, 2026

Fixed missing include.

@Sjors
Copy link
Member Author

Sjors commented Mar 12, 2026

ACK 3f28bca for the fix commits by @ryanofsky.

@fanquake
Copy link
Member

Looks like the subtree update PR that includes this, in our CI, still has heap-use-after-free / data race issues: bitcoin/bitcoin#34804 (comment).

@ryanofsky
Copy link
Collaborator

ryanofsky commented Mar 12, 2026

Looks like the subtree update PR that includes this, in our CI, still has heap-use-after-free / data race issues: bitcoin/bitcoin#34804 (comment).

Commented in the other PR, but those are both caused by this line in newly added test code. I'm currently going through the PR and improving / rewriting the tests now. I will probably reopen #249 with the updates then refresh bitcoin/bitcoin#34804

@Sjors
Copy link
Member Author

Sjors commented Mar 12, 2026

Ok, I'll close this one.

but those are both caused by this line in newly added test code.

The first two commits also had issues in their test in earlier versions. I didn't thoroughly check the third commit because I initially didn't use it.

@Sjors Sjors closed this Mar 12, 2026
@ryanofsky
Copy link
Collaborator

Thanks! I wound up changing a bunch of things in the tests too, but they were extremely helpful to have as a starting point

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants