Add --runtime_info_file flag for ephemeral port discovery#5013
Add --runtime_info_file flag for ephemeral port discovery#5013pwojcikdev merged 3 commits intonanocurrency:developfrom
--runtime_info_file flag for ephemeral port discovery#5013Conversation
Test Results for Commit 1fe6305Pull Request 5013: Results Test Case Results
Last updated: 2026-01-27 04:32:06 UTC |
There was a problem hiding this comment.
Pull request overview
This pull request adds support for discovering ephemeral ports through a new --runtime_info_file flag and improves the existing --pid_file flag by implementing automatic cleanup on process exit.
Changes:
- Introduces a runtime file management system that registers files for automatic cleanup on process exit using
std::atexit() - Adds
--runtime_info_fileCLI flag that writes a JSON file containing the node's actual bound ports (peering_port, rpc_port) and node_id, particularly useful when using ephemeral ports (port=0) - Updates system tests to use ephemeral ports and the new runtime_info_file for discovering actual bound ports, eliminating hardcoded ports that caused test flakiness
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
| nano/lib/runtime_files.hpp | New header defining runtime file management API for automatic cleanup of PID and runtime info files |
| nano/lib/runtime_files.cpp | Implementation of runtime file management with thread-safe registration and atexit-based cleanup |
| nano/lib/CMakeLists.txt | Added new runtime_files source files to build |
| nano/node/nodeconfig.hpp | Added runtime_info_file optional field to node_flags |
| nano/node/cli.cpp | Added --runtime_info_file CLI option with implicit value support |
| nano/node/network.cpp | Minor style change to use brace initialization for endpoint |
| nano/nano_node/daemon.hpp | Updated parameter name for clarity |
| nano/nano_node/daemon.cpp | Integrated runtime_info_file creation after node and RPC startup |
| nano/nano_node/entry.cpp | Refactored PID file handling to use new runtime_files infrastructure |
| systest/runtime_files.sh | New comprehensive test for PID file and runtime info file creation and cleanup |
| systest/rpc_stop.sh | Updated to use ephemeral ports and runtime_info_file for port discovery |
| systest/daemon_interrupt.sh | Updated to use ephemeral ports and runtime_info_file for readiness detection |
| systest/bind_failure.sh | Updated to use ephemeral ports and runtime_info_file instead of hardcoded ports |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| nano::runtime_files::create_pid_file (pid_filepath); | ||
| std::cerr << "PID file created: " << pid_filepath << std::endl; |
There was a problem hiding this comment.
Similar to the runtime_info_file handling in daemon.cpp, if create_pid_file() throws an exception, there's no specific error handling. The exception will propagate and may not provide a clear error message to the user about what went wrong with PID file creation. Consider adding a try-catch block to provide a more informative error message.
| nano::runtime_files::create_pid_file (pid_filepath); | |
| std::cerr << "PID file created: " << pid_filepath << std::endl; | |
| try | |
| { | |
| nano::runtime_files::create_pid_file (pid_filepath); | |
| std::cerr << "PID file created: " << pid_filepath << std::endl; | |
| } | |
| catch (std::exception const & e) | |
| { | |
| std::cerr << "Failed to create PID file '" << pid_filepath << "': " << e.what () << std::endl; | |
| return 1; | |
| } |
nano/lib/runtime_files.cpp
Outdated
| { | ||
| std::lock_guard<std::mutex> lock (mutex); | ||
| registered_files.erase (path); | ||
| } | ||
|
|
||
| std::error_code ec; | ||
| std::filesystem::remove (path, ec); | ||
| // Ignore errors - file may not exist |
There was a problem hiding this comment.
The lock is released before the file is removed from the filesystem at line 66. This means another thread could re-insert the same path into registered_files between the erase and the actual file removal. While this is unlikely to cause issues in practice (since the file removal is best-effort anyway), it creates a small window where the state is inconsistent. Consider moving the filesystem removal inside the locked section, or document this behavior if it's intentional for performance reasons.
| { | |
| std::lock_guard<std::mutex> lock (mutex); | |
| registered_files.erase (path); | |
| } | |
| std::error_code ec; | |
| std::filesystem::remove (path, ec); | |
| // Ignore errors - file may not exist | |
| std::error_code ec; | |
| { | |
| std::lock_guard<std::mutex> lock (mutex); | |
| registered_files.erase (path); | |
| std::filesystem::remove (path, ec); | |
| // Ignore errors - file may not exist | |
| } |
| std::mutex mutex; | ||
| std::set<std::filesystem::path> registered_files; | ||
| bool atexit_registered = false; |
There was a problem hiding this comment.
The atexit_registered flag should be std::atomic<bool> to ensure thread safety. Currently, it's a plain bool that's accessed under a mutex lock, but the mutex is only held during the registration check and insert operation. If multiple threads call create() concurrently, there's a potential race condition where the mutex is released after checking atexit_registered but before std::atexit() completes, allowing another thread to register the handler multiple times. Using std::atomic<bool> would provide proper memory ordering guarantees.
| out << contents; | ||
| out.close (); |
There was a problem hiding this comment.
The file write operation should verify that the write succeeded before closing. Currently, out << contents may fail silently (e.g., disk full, permissions issue after opening). Consider checking the stream state after writing and before closing to ensure data was actually written successfully.
| out << contents; | |
| out.close (); | |
| out << contents; | |
| if (!out) | |
| { | |
| throw std::runtime_error ("Unable to write to runtime file: " + path.string ()); | |
| } | |
| out.close (); | |
| if (!out) | |
| { | |
| throw std::runtime_error ("Unable to close runtime file: " + path.string ()); | |
| } |
| void nano::runtime_files::create_pid_file (std::filesystem::path const & path) | ||
| { | ||
| auto pid = boost::this_process::get_id (); | ||
| nano::runtime_files::create (path, std::to_string (pid)); |
There was a problem hiding this comment.
The PID file content is written without a trailing newline, but when it's read in the test at systest/runtime_files.sh:38, the newline is explicitly stripped with tr -d '\n'. However, this is inconsistent with typical PID file conventions which usually include a trailing newline. Consider adding a newline after the PID for consistency with standard PID file format (i.e., change to std::to_string (pid) + "\n").
| nano::runtime_files::create (path, std::to_string (pid)); | |
| nano::runtime_files::create (path, std::to_string (pid) + "\n"); |
| nano::runtime_files::create_runtime_info (*flags.runtime_info_file, info); | ||
| std::cerr << "Runtime info file created: " << *flags.runtime_info_file << std::endl; |
There was a problem hiding this comment.
If create_runtime_info() throws an exception (e.g., due to disk full, permission issues, or invalid path), the node will terminate via the catch block at line 225. However, the node will have already started and bound to ports. Consider wrapping this in a try-catch block to provide a more specific error message, or documenting that runtime_info_file creation failures are fatal and will terminate the node during startup.
| nano::runtime_files::create_runtime_info (*flags.runtime_info_file, info); | |
| std::cerr << "Runtime info file created: " << *flags.runtime_info_file << std::endl; | |
| try | |
| { | |
| nano::runtime_files::create_runtime_info (*flags.runtime_info_file, info); | |
| std::cerr << "Runtime info file created: " << *flags.runtime_info_file << std::endl; | |
| } | |
| catch (std::exception const & ex) | |
| { | |
| logger.critical ( | |
| nano::log::type::daemon, | |
| "Failed to create runtime info file '{}': {}", | |
| *flags.runtime_info_file, | |
| ex.what ()); | |
| // Rethrow so that startup still fails as before | |
| throw; | |
| } |
9b83e98 to
b708b54
Compare
b708b54 to
1fe6305
Compare
New CLI Options
--runtime_info_file [path]Writes a JSON file containing runtime information after the node starts. This is particularly useful when running with ephemeral ports (peering_port=0, rpc.port=0) where the actual bound ports are assigned by the OS.
Usage:
JSON output:
Notes:
--pid_file [path] (improved)Now uses the same automatic cleanup infrastructure. File is automatically deleted on process exit.
This should fix system test flakiness.