Skip to content

test: Add IPC wake-up test and reuse mining context#34727

Open
enirox001 wants to merge 4 commits intobitcoin:masterfrom
enirox001:03-26-ipctest-improv
Open

test: Add IPC wake-up test and reuse mining context#34727
enirox001 wants to merge 4 commits intobitcoin:masterfrom
enirox001:03-26-ipctest-improv

Conversation

@enirox001
Copy link
Contributor

@enirox001 enirox001 commented Mar 3, 2026

This is a follow-up to implement a couple of test improvements discussed in recent IPC PRs and issues.

@DrahtBot DrahtBot added the Tests label Mar 3, 2026
@DrahtBot
Copy link
Contributor

DrahtBot commented Mar 3, 2026

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Reviews

See the guideline for information on the review process.

Type Reviewers
ACK Sjors
Stale ACK ryanofsky, w0xlt

If your review is incorrectly listed, please copy-paste <!--meta-tag:bot-skip--> into the comment that the bot should ignore.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #34644 (mining: add submitBlock to IPC Mining interface by w0xlt)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

LLM Linter (✨ experimental)

Possible places where named args for integral literals may be used (e.g. func(x, /*named_arg=*/0) in C++, and func(x, named_arg=0) in Python):

  • [template.submitSolution(ctx, 0, 0, 0, b"")] in test/functional/interface_ipc_mining.py

2026-03-13 10:39:37

@enirox001 enirox001 marked this pull request as draft March 4, 2026 08:54
@enirox001 enirox001 force-pushed the 03-26-ipctest-improv branch 3 times, most recently from c57da9f to 1329a7d Compare March 4, 2026 10:23
@enirox001 enirox001 marked this pull request as ready for review March 4, 2026 11:14
@DrahtBot DrahtBot removed the CI failed label Mar 4, 2026
@sedited sedited requested a review from Sjors March 8, 2026 20:06
@enirox001 enirox001 force-pushed the 03-26-ipctest-improv branch from 6c4c62d to 79d3f77 Compare March 9, 2026 10:34
@enirox001 enirox001 force-pushed the 03-26-ipctest-improv branch 2 times, most recently from ca5fdb5 to 3477c43 Compare March 9, 2026 12:12
@enirox001
Copy link
Contributor Author

enirox001 commented Mar 9, 2026

Added a new test in 3477c43 to verify that submitSolution correctly throws an exception instead of crashing when given an invalid coinbase. This covers the edge case discussed in #33341. Thanks @ryanofsky for the test snippet.

Also updated the PR description to match changes

@DrahtBot DrahtBot removed the CI failed label Mar 9, 2026
Copy link
Contributor

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review ACK 3477c43. Nice changes, and thanks for following up on the previous PRs! I'm a little unsure about the cost/benefit of the new createNewBlock cooldown test and asked some questions, and would like @Sjors to weigh in on. But it seems reasonable and the other test and refactoring changes both look good

@Sjors
Copy link
Member

Sjors commented Mar 10, 2026

Concept ACK

@enirox001 enirox001 force-pushed the 03-26-ipctest-improv branch 2 times, most recently from b46dd5a to daf405d Compare March 11, 2026 04:59
@enirox001 enirox001 force-pushed the 03-26-ipctest-improv branch from daf405d to 730b93a Compare March 11, 2026 05:16
@enirox001
Copy link
Contributor Author

enirox001 commented Mar 11, 2026

Thanks for the reviews, @ryanofsky and @Sjors

I've pushed an update addressing the recent review feedback.

  • Dropped the unconditional sleep and used setmocktime to freeze the node's clock, ensuring createNewBlock wakes up strictly due to the tip advance. Added a comment clarifying that the 10s timeout is just CI padding.
  • Moved the macOS exception workaround into a reusable assert_capnp_failed helper in ipc_util.py, and refactored the existing blockReservedWeight test to use it as well.

results['success'] = res._has("result")
task = asyncio.create_task(create_block_with_timing())
block_hex = self.nodes[1].getblock(node1_block_hash, False)
self.nodes[0].submitblock(block_hex)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

251c9fe: omitting this line does not cause the test to fail.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In commit "test: verify createNewBlock wakes promptly when tip advances" (251c9fe)

re: #34727 (comment)

251c9fe: omitting this line does not cause the test to fail.

Not sure but maybe the test is forgetting to disconnect the nodes, so the block is getting synced even though it is not submitted?

The test also seems off in a few other ways. For one thing, it would seem nice if the test could use the wait_and_do helper. I would think the createNewBlock call go in the async wait_fn and submitblock and setmocktime could go in a do_fn.

The comment about a 10-second timeout as padding doesn't seem right to me, because I would expect that after the block is submitted, there is still an additional cooldown period like 3 seconds or whatever before createNewBlock returns, and the point of the mocktime call would be to advance by 3 seconds so the test does not need to wait.

It also doesn't seem like there is a point to the time.time() calls and the 'duration' value seems to be set and never used. The result dict could also be replaced by using python's nonlocal keyword.

Sorry for all the suggestions, it just seems like there a number of things that could be improved

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestions! I've completely refactored this:

  • Applied wait_and_do and nonlocal to clean up the structure.
  • Removed the confusing 10-second padding comment.
  • Regarding mocktime/duration: I tested it, and the internal 3-second wait relies on the real-time system clock, so setmocktime doesn't bypass it. Because the node naturally returns after 3 seconds anyway, I kept the time.time() check to assert duration < 3.0

Copy link
Contributor

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review 730b93a. Thanks for applying previous suggestions!

results['success'] = res._has("result")
task = asyncio.create_task(create_block_with_timing())
block_hex = self.nodes[1].getblock(node1_block_hash, False)
self.nodes[0].submitblock(block_hex)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In commit "test: verify createNewBlock wakes promptly when tip advances" (251c9fe)

re: #34727 (comment)

251c9fe: omitting this line does not cause the test to fail.

Not sure but maybe the test is forgetting to disconnect the nodes, so the block is getting synced even though it is not submitted?

The test also seems off in a few other ways. For one thing, it would seem nice if the test could use the wait_and_do helper. I would think the createNewBlock call go in the async wait_fn and submitblock and setmocktime could go in a do_fn.

The comment about a 10-second timeout as padding doesn't seem right to me, because I would expect that after the block is submitted, there is still an additional cooldown period like 3 seconds or whatever before createNewBlock returns, and the point of the mocktime call would be to advance by 3 seconds so the test does not need to wait.

It also doesn't seem like there is a point to the time.time() calls and the 'duration' value seems to be set and never used. The result dict could also be replaced by using python's nonlocal keyword.

Sorry for all the suggestions, it just seems like there a number of things that could be improved

@enirox001
Copy link
Contributor Author

enirox001 commented Mar 12, 2026

Thanks for the detailed review again

I pushed an update addressing the recent review feedback.

Regarding the changes i really wanted to use mocktime here as you both suggested, but I ran into a hurdle when I tested it

I tried advancing setmocktime by 3 seconds inside the do_fn to instantly bypass the wait. However, because the polling mechanism seems to use a real-time system wait. advancing the mocktime didn't seem to wake up the sleeping thread. The node still waits the 3 seconds before naturally returning.

Because of this, a standard timeout wouldn't catch a failure if the submitblock interrupt breaks (it would just pass after 3s). To make the test completely deterministic, I refactored it using wait_and_do and added an assertion to verify the real-world duration is < 3.0s.

If there is a way to making setmocktime trigger the cv wake-up from the test that I missed, please let me know and I will happily update it. Otherwise, the duration check seems to be a feasible way to verify the prompt wake-up.

@enirox001
Copy link
Contributor Author

enirox001 commented Mar 12, 2026

CI failure is unrelated

@DrahtBot DrahtBot requested a review from ryanofsky March 12, 2026 10:33
Copy link
Member

@Sjors Sjors left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 9b6985c

self.nodes[0].submitblock(block_hex)
await wait_and_do(wait_fn(), do_fn)
assert_equal(success, True)
assert duration < 3.0, f"createNewBlock took {duration:.2f}s, did not wake up promptly after tip advances"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3d81147: on my M4 MacBook Pro duration was 0.118s on average, of which 0.1 is from wait_and_do. So we have 3 / (0.118 - 0.1) = 160x margin for a very slow CI machine. So the odds a false positive failure seem acceptably low.

Copy link
Contributor

@w0xlt w0xlt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 9b6985c mod minor nits above

@Sjors
Copy link
Member

Sjors commented Mar 12, 2026

If you feel like adding another followup commit, it would be good to scale waits (waitNext, etc) with the test suite timeout scaling factor. For background see bitcoin-core/libmultiprocess#253 (comment)

@enirox001
Copy link
Contributor Author

If you feel like adding another followup commit, it would be good to scale waits (waitNext, etc) with the test suite timeout scaling factor. For background see bitcoin-core/libmultiprocess#253 (comment)

Sounds good, would work to add a followup commit with these changes

@enirox001 enirox001 force-pushed the 03-26-ipctest-improv branch from 9b6985c to 6865cc4 Compare March 12, 2026 22:36
This adds a complementary test to interface_ipc_mining.py to ensure
that createNewBlock() wakes up immediately once submitblock advances
the tip, rather than needlessly waiting for the cooldown timer to
expire on its own.
The async routines in both interface_ipc.py and interface_ipc_mining.py
contain redundant code to initialize the mining proxy object.

Move the make_mining_ctx helper into test_framework/ipc_util.py and
update both test files to use it. This removes the boilerplate and
prevents code duplication across the IPC test suite.
Add a test case to interface_ipc_mining.py to verify that the IPC
server correctly handles and reports serialization errors rather than
crashing the node.

This covers the scenario where submitSolution is called with data
that cannot be deserialized, as discussed in bitcoin#33341

Also introduces the assert_capnp_failed helper in ipc_util.py to
cleanly handle macOS-specific Cap'n Proto exception strings, and
refactors an existing block weight test to use it.
self.log.debug("interrupt() should abort waitTipChanged()")
async def wait_for_tip():
long_timeout = 60000.0 # 1 minute
long_timeout = 60000.0 * self.options.timeout_factor # 1 minute
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be a bit much, depending on the maximum timeout factor used in CI - it might exceed the test runner max run time. If so, we could e.g. use the max of 1000 * timeout, 60000.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, a massive timeout factor would definitely nuke the CI runner with a really long wait. I've updated this to use max(timeout, 60000.0) so it defaults to a safe 1 minute. I also applied the same max() fix to the timeout * 60 line later in the file to make sure it doesn't cause the same hang.

The IPC mining tests (interface_ipc_mining.py) currently use
hardcoded timeouts (e.g., 1000ms, 60000ms) for operations like
waitTipChanged and waiting for block templates. In heavily
loaded CI environments, such as those running sanitizers with
high parallelism, these hardcoded timeouts can be too short,
leading to spurious test failures and brittleness.

This commit multiplies these timeout variables by the test
suite's global `self.options.timeout_factor`. This ensures that
the IPC wait conditions scale appropriately when the test suite
is run with a higher timeout factor, making the tests robust
against slow execution environments.

Addresses CI brittleness observed in bitcoin-core/libmultiprocess#253.
@enirox001 enirox001 force-pushed the 03-26-ipctest-improv branch from 25e6fdc to 9a71d53 Compare March 13, 2026 10:39
@enirox001
Copy link
Contributor Author

enirox001 commented Mar 13, 2026

Thanks for the feedback and reviews. I've pushed an update addressing the recent reviews:

  • Fixed the indentation in make_mining_ctx and updated the error assertion to use startswith
  • Scaled IPC wait timeouts using self.options.timeout_factor to resolve the CI brittleness mentioned in ci: run some Bitcoin Core CI jobs  bitcoin-core/libmultiprocess#253 (comment)
  • Bounded the extended waits using max(timeout, 60000.0) in both places so large timeout factors don't cause the CI runner to hang and exceed its max time limit

Updated PR description as well

@Sjors
Copy link
Member

Sjors commented Mar 13, 2026

Thanks!

ACK 9a71d53

@DrahtBot DrahtBot requested a review from w0xlt March 13, 2026 11:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants