Skip to content

Fix race condition in vote_router::disconnect (...)#5010

Merged
pwojcikdev merged 1 commit intonanocurrency:developfrom
pwojcikdev:fix-vote-router-disconnect
Jan 25, 2026
Merged

Fix race condition in vote_router::disconnect (...)#5010
pwojcikdev merged 1 commit intonanocurrency:developfrom
pwojcikdev:fix-vote-router-disconnect

Conversation

@pwojcikdev
Copy link
Copy Markdown
Contributor

Fixes intermittent test error:

[ RUN      ] active_elections.fork_replacement_tally
Assertion `erased == 1` failed
/Users/runner/work/nano-node/nano-node/nano/node/vote_router.cpp:45 [void nano::vote_router::disconnect(const nano::block_hash &)]'
 0# nano::generate_stacktrace() in /Users/runner/work/nano-node/nano-node/build/core_test
 1# assert_internal(char const*, char const*, char const*, unsigned int, bool, std::__1::basic_string_view<char, std::__1::char_traits<char>>) in /Users/runner/work/nano-node/nano-node/build/core_test
 2# nano::vote_router::disconnect(nano::block_hash const&) in /Users/runner/work/nano-node/nano-node/build/core_test
 3# nano::election::replace_by_weight(std::__1::unique_lock<nano::mutex>&, nano::block_hash const&) in /Users/runner/work/nano-node/nano-node/build/core_test
 4# nano::election::publish(std::__1::shared_ptr<nano::block> const&) in /Users/runner/work/nano-node/nano-node/build/core_test
 5# nano::active_elections::publish(std::__1::shared_ptr<nano::block> const&) in /Users/runner/work/nano-node/nano-node/build/core_test
 6# auto nano::active_elections::active_elections(nano::node&, nano::ledger_notifications&, nano::cementing_set&)::$_1::operator()<std::__1::deque<std::__1::pair<nano::block_status, nano::block_context>, std::__1::allocator<std::__1::pair<nano::block_status, nano::block_context>>>>(std::__1::deque<std::__1::pair<nano::block_status, nano::block_context>, std::__1::allocator<std::__1::pair<nano::block_status, nano::block_context>>> const&) const in /Users/runner/work/nano-node/nano-node/build/core_test
 7# decltype(std::declval<nano::active_elections::active_elections(nano::node&, nano::ledger_notifications&, nano::cementing_set&)::$_1&>()(std::declval<std::__1::deque<std::__1::pair<nano::block_status, nano::block_context>, std::__1::allocator<std::__1::pair<nano::block_status, nano::block_context>>> const&>())) std::__1::__invoke[abi:ue170006]<nano::active_elections::active_elections(nano::node&, nano::ledger_notifications&, nano::cementing_set&)::$_1&, std::__1::deque<std::__1::pair<nano::block_status, nano::block_context>, std::__1::allocator<std::__1::pair<nano::block_status, nano::block_context>>> const&>(nano::active_elections::active_elections(nano::node&, nano::ledger_notifications&, nano::cementing_set&)::$_1&, std::__1::deque<std::__1::pair<nano::block_status, nano::block_context>, std::__1::allocator<std::__1::pair<nano::block_status, nano::block_context>>> const&) in /Users/runner/work/nano-node/nano-node/build/core_test
 8# void std::__1::__invoke_void_return_wrapper<void, true>::__call[abi:ue170006]<nano::active_elections::active_elections(nano::node&, nano::ledger_notifications&, nano::cementing_set&)::$_1&, std::__1::deque<std::__1::pair<nano::block_status, nano::block_context>, std::__1::allocator<std::__1::pair<nano::block_status, nano::block_context>>> const&>(nano::active_elections::active_elections(nano::node&, nano::ledger_notifications&, nano::cementing_set&)::$_1&, std::__1::deque<std::__1::pair<nano::block_status, nano::block_context>, std::__1::allocator<std::__1::pair<nano::block_status, nano::block_context>>> const&) in /Users/runner/work/nano-node/nano-node/build/core_test
 9# std::__1::__function::__alloc_func<nano::active_elections::active_elections(nano::node&, nano::ledger_notifications&, nano::cementing_set&)::$_1, std::__1::allocator<nano::active_elections::active_elections(nano::node&, nano::ledger_notifications&, nano::cementing_set&)::$_1>, void (std::__1::deque<std::__1::pair<nano::block_status, nano::block_context>, std::__1::allocator<std::__1::pair<nano::block_status, nano::block_context>>> const&)>::operator()[abi:ue170006](std::__1::deque<std::__1::pair<nano::block_status, nano::block_context>, std::__1::allocator<std::__1::pair<nano::block_status, nano::block_context>>> const&) in /Users/runner/work/nano-node/nano-node/build/core_test
10# std::__1::__function::__func<nano::active_elections::active_elections(nano::node&, nano::ledger_notifications&, nano::cementing_set&)::$_1, std::__1::allocator<nano::active_elections::active_elections(nano::node&, nano::ledger_notifications&, nano::cementing_set&)::$_1>, void (std::__1::deque<std::__1::pair<nano::block_status, nano::block_context>, std::__1::allocator<std::__1::pair<nano::block_status, nano::block_context>>> const&)>::operator()(std::__1::deque<std::__1::pair<nano::block_status, nano::block_context>, std::__1::allocator<std::__1::pair<nano::block_status, nano::block_context>>> const&) in /Users/runner/work/nano-node/nano-node/build/core_test
11# std::__1::__function::__value_func<void (std::__1::deque<std::__1::pair<nano::block_status, nano::block_context>, std::__1::allocator<std::__1::pair<nano::block_status, nano::block_context>>> const&)>::operator()[abi:ue170006](std::__1::deque<std::__1::pair<nano::block_status, nano::block_context>, std::__1::allocator<std::__1::pair<nano::block_status, nano::block_context>>> const&) const in /Users/runner/work/nano-node/nano-node/build/core_test
12# std::__1::function<void (std::__1::deque<std::__1::pair<nano::block_status, nano::block_context>, std::__1::allocator<std::__1::pair<nano::block_status, nano::block_context>>> const&)>::operator()(std::__1::deque<std::__1::pair<nano::block_status, nano::block_context>, std::__1::allocator<std::__1::pair<nano::block_status, nano::block_context>>> const&) const in /Users/runner/work/nano-node/nano-node/build/core_test
13# nano::observer_set<std::__1::deque<std::__1::pair<nano::block_status, nano::block_context>, std::__1::allocator<std::__1::pair<nano::block_status, nano::block_context>>>>::notify(std::__1::deque<std::__1::pair<nano::block_status, nano::block_context>, std::__1::allocator<std::__1::pair<nano::block_status, nano::block_context>>> const&) const in /Users/runner/work/nano-node/nano-node/build/core_test
14# nano::ledger_notifications::notify_processed(nano::secure::write_transaction&, std::__1::deque<std::__1::pair<nano::block_status, nano::block_context>, std::__1::allocator<std::__1::pair<nano::block_status, nano::block_context>>>, std::__1::function<void ()>)::$_2::operator()() in /Users/runner/work/nano-node/nano-node/build/core_test
15# nano::ledger_notifications::notify_processed(nano::secure::write_transaction&, std::__1::deque<std::__1::pair<nano::block_status, nano::block_context>, std::__1::allocator<std::__1::pair<nano::block_status, nano::block_context>>>, std::__1::function<void ()>)::$_2 auto nano::wrap_move_only<nano::ledger_notifications::notify_processed(nano::secure::write_transaction&, std::__1::deque<std::__1::pair<nano::block_status, nano::block_context>, std::__1::allocator<std::__1::pair<nano::block_status, nano::block_context>>>, std::__1::function<void ()>)::$_2>(nano::ledger_notifications::notify_processed(nano::secure::write_transaction&, std::__1::deque<std::__1::pair<nano::block_status, nano::block_context>, std::__1::allocator<std::__1::pair<nano::block_status, nano::block_context>>>, std::__1::function<void ()>)::$_2&&)::'lambda'(auto&&...)::operator()<>('lambda'(auto&&...)) const in /Users/runner/work/nano-node/nano-node/build/core_test
16# decltype(std::declval<nano::ledger_notifications::notify_processed(nano::secure::write_transaction&, std::__1::deque<std::__1::pair<nano::block_status, nano::block_context>, std::__1::allocator<std::__1::pair<nano::block_status, nano::block_context>>>, std::__1::function<void ()>)::$_2&&>()()) std::__1::__invoke[abi:ue170006]<auto nano::wrap_move_only<nano::ledger_notifications::notify_processed(nano::secure::write_transaction&, std::__1::deque<std::__1::pair<nano::block_status, nano::block_context>, std::__1::allocator<std::__1::pair<nano::block_status, nano::block_context>>>, std::__1::function<void ()>)::$_2>(nano::ledger_notifications::notify_processed(nano::secure::write_transaction&, std::__1::deque<std::__1::pair<nano::block_status, nano::block_context>, std::__1::allocator<std::__1::pair<nano::block_status, nano::block_context>>>, std::__1::function<void ()>)::$_2&&)::'lambda'(auto&&...)&>(auto, decltype(std::declval<nano::ledger_notifications::notify_processed(nano::secure::write_transaction&, std::__1::deque<std::__1::pair<nano::block_status, nano::block_context>, std::__1::allocator<std::__1::pair<nano::block_status, nano::block_context>>>, std::__1::function<void ()>)::$_2&&>()())&&...) in /Users/runner/work/nano-node/nano-node/build/core_test
17# void std::__1::__invoke_void_return_wrapper<void, true>::__call[abi:ue170006]<auto nano::wrap_move_only<nano::ledger_notifications::notify_processed(nano::secure::write_transaction&, std::__1::deque<std::__1::pair<nano::block_status, nano::block_context>, std::__1::allocator<std::__1::pair<nano::block_status, nano::block_context>>>, std::__1::function<void ()>)::$_2>(nano::ledger_notifications::notify_processed(nano::secure::write_transaction&, std::__1::deque<std::__1::pair<nano::block_status, nano::block_context>, std::__1::allocator<std::__1::pair<nano::block_status, nano::block_context>>>, std::__1::function<void ()>)::$_2&&)::'lambda'(auto&&...)&>(auto nano::wrap_move_only<nano::ledger_notifications::notify_processed(nano::secure::write_transaction&, std::__1::deque<std::__1::pair<nano::block_status, nano::block_context>, std::__1::allocator<std::__1::pair<nano::block_status, nano::block_context>>>, std::__1::function<void ()>)::$_2>(nano::ledger_notifications::notify_processed(nano::secure::write_transaction&, std::__1::deque<std::__1::pair<nano::block_status, nano::block_context>, std::__1::allocator<std::__1::pair<nano::block_status, nano::block_context>>>, std::__1::function<void ()>)::$_2&&)::'lambda'(auto&&...)) in /Users/runner/work/nano-node/nano-node/build/core_test
18# _ZNSt3__110__function12__alloc_funcIZN4nano14wrap_move_onlyIZNS2_20ledger_notifications16notify_processedERNS2_6secure17write_transactionENS_5dequeINS_4pairINS2_12block_statusENS2_13block_contextEEENS_9allocatorISC_EEEENS_8functionIFvvEEEE3$_2EEDaOT_EUlDpOT_E_NSD_ISQ_EESH_EclB8ue170006Ev in /Users/runner/work/nano-node/nano-node/build/core_test
19# _ZNSt3__110__function6__funcIZN4nano14wrap_move_onlyIZNS2_20ledger_notifications16notify_processedERNS2_6secure17write_transactionENS_5dequeINS_4pairINS2_12block_statusENS2_13block_contextEEENS_9allocatorISC_EEEENS_8functionIFvvEEEE3$_2EEDaOT_EUlDpOT_E_NSD_ISQ_EESH_EclEv in /Users/runner/work/nano-node/nano-node/build/core_test
20# std::__1::__function::__value_func<void ()>::operator()[abi:ue170006]() const in /Users/runner/work/nano-node/nano-node/build/core_test
21# std::__1::function<void ()>::operator()() const in /Users/runner/work/nano-node/nano-node/build/core_test
22# nano::ledger_notifications::run() in /Users/runner/work/nano-node/nano-node/build/core_test
23# nano::ledger_notifications::start()::$_0::operator()() const in /Users/runner/work/nano-node/nano-node/build/core_test
24# decltype(std::declval<nano::ledger_notifications::start()::$_0>()()) std::__1::__invoke[abi:ue170006]<nano::ledger_notifications::start()::$_0>(nano::ledger_notifications::start()::$_0&&) in /Users/runner/work/nano-node/nano-node/build/core_test
25# void std::__1::__thread_execute[abi:ue170006]<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, nano::ledger_notifications::start()::$_0>(std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, nano::ledger_notifications::start()::$_0>&, std::__1::__tuple_indices<>) in /Users/runner/work/nano-node/nano-node/build/core_test
26# void* std::__1::__thread_proxy[abi:ue170006]<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, nano::ledger_notifications::start()::$_0>>(void*) in /Users/runner/work/nano-node/nano-node/build/core_test
27# _pthread_start in /usr/lib/system/libsystem_pthread.dylib

The race occurs in election::replace_by_weight():

Thread A (replace_by_weight):                Thread B (erase_election):
├─ Holds election mutex                      │
├─ Copies last_tally to sorted vector        │
├─ UNLOCKS election mutex (line 772) ◄───────┤─ Can now proceed
├─ Computing which block to replace          │
│                                            ├─ Gets election->blocks() snapshot
│                                            ├─ Calls disconnect(election)
│                                            │  └─ Erases ALL blocks from vote_router
├─ Calls disconnect(replaced_block) ◄────────┤─ Block already erased!


 - Fix assertion failure in vote_router::disconnect(hash) caused by race condition between replace_by_weight() and erase_election()                                                                                                                                                                                                                                                                                                      
  - Return bool from disconnect(hash) indicating if route existed                                                                                                                                                                                                                                                                                                                                                                         
  - Return std::size_t from disconnect(election) with number of routes removed                                                                                                                                                                                                                                                                                                                                                            
  - Reorganize vote_router files to follow project conventions

Summary:

  1. Thread A enters replace_by_weight(), unlocks the election mutex at line 772
  2. Thread B calls erase_election(), takes a snapshot of election->blocks()
  3. Thread B calls vote_router.disconnect(election) which erases all blocks including the one Thread A will try to disconnect
  4. Thread A calls vote_router.disconnect(replaced_block) - assertion fails because erased == 0

@pwojcikdev pwojcikdev force-pushed the fix-vote-router-disconnect branch from a72d565 to 73c793d Compare January 24, 2026 22:42
@gr0vity-dev-bot
Copy link
Copy Markdown

gr0vity-dev-bot commented Jan 24, 2026

Test Results for Commit 7ab66ec

Pull Request 5010: Results
Overall Status:

Test Case Results

  • 5n4pr_conf_10k_bintree: PASS (Duration: 107s)
  • 5n4pr_conf_10k_change: PASS (Duration: 134s)
  • 5n4pr_conf_change_dependant: PASS (Duration: 120s)
  • 5n4pr_conf_change_independant: PASS (Duration: 139s)
  • 5n4pr_conf_send_dependant: PASS (Duration: 124s)
  • 5n4pr_conf_send_independant: PASS (Duration: 121s)
  • 5n4pr_rocks_10k_bintree: PASS (Duration: 113s)
  • 5n4pr_rocks_10k_change: PASS (Duration: 201s)

Last updated: 2026-01-25 13:58:10 UTC

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes an intermittent assertion failure caused by a race between election::replace_by_weight() and active_elections::erase_election() by making vote_router::disconnect(...) tolerant of already-erased routes.

Changes:

  • Change vote_router::disconnect(block_hash) to return bool instead of asserting the route existed.
  • Change vote_router::disconnect(election) to return std::size_t with number of routes removed.
  • Minor header/cpp reorganization and comment/doc updates for vote_router.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
nano/node/vote_router.hpp Updates API signatures/docs; switches condition variable type to nano::condition_variable.
nano/node/vote_router.cpp Implements new return values for disconnect methods and reorders start/stop definitions.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 98 to 100
mutable std::shared_mutex mutex;
nano::condition_variable condition;
std::thread thread;
Copy link

Copilot AI Jan 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

condition was switched to nano::condition_variable while mutex remains a std::shared_mutex. When NANO_TIMED_LOCKS>0, nano::condition_variable only supports waiting on nano::unique_lock<nano::mutex>, so condition.wait_for(std::unique_lock<std::shared_mutex>&, ...) in run() will not compile. To keep timed-lock builds working, either keep std::condition_variable_any here, or refactor to use a separate nano::mutex/nano::condition_variable pair for the thread wait (and only take the std::shared_mutex when touching elections).

Copilot uses AI. Check for mistakes.
@pwojcikdev pwojcikdev force-pushed the fix-vote-router-disconnect branch from 73c793d to 7ab66ec Compare January 25, 2026 12:37
@pwojcikdev pwojcikdev merged commit ea3fb2a into nanocurrency:develop Jan 25, 2026
25 of 26 checks passed
@pwojcikdev pwojcikdev deleted the fix-vote-router-disconnect branch January 26, 2026 13:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants