fix(node/sysgo-tests): attempt to fix race condition in p2p disconnect rpc call#2761
fix(node/sysgo-tests): attempt to fix race condition in p2p disconnect rpc call#2761
Conversation
|
Claude finished @theochap's task —— View job PR Review Complete ✅Todo List:
Code Review Feedback✅ Overall AssessmentThis is a solid fix for a legitimate race condition issue. The implementation follows existing patterns in the codebase and addresses the core problem effectively. 🔧 Code Quality & Best PracticesStrengths:
Minor Issues:
🐛 Potential Bugs/IssuesLow Risk:
⚡ Performance ConsiderationsGood:
Considerations:
🔒 Security ConcernsNo Major Issues Identified:
🧪 Test Coverage AssessmentCurrent State:
Recommendations:
📋 Suggested Improvements
✅ RecommendationAPPROVE with minor suggestions. This PR successfully addresses the race condition issue with a well-implemented solution that follows existing patterns. The suggested improvements are non-blocking and can be addressed in follow-up PRs if desired. |
There was a problem hiding this comment.
Pull Request Overview
This PR addresses a race condition in the opp2p_disconnect RPC call by ensuring the disconnection is fully completed before returning. The fix implements retry logic with exponential backoff to wait for peer disconnection confirmation.
Key Changes
- Added retry mechanism with exponential backoff to verify peer disconnection
- Implemented peer connection status checking after disconnect request
- Added proper error handling for timeout scenarios
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
| if !is_not_connected | ||
| .retry(ExponentialBuilder::default().with_total_delay(Some(Duration::from_secs(10)))) | ||
| .await? |
There was a problem hiding this comment.
The retry logic imports are missing. This code uses ExponentialBuilder and Duration which require imports that aren't visible in this diff. Ensure the necessary imports are added at the top of the file.
| let is_not_connected = async || { | ||
| let (tx, rx) = tokio::sync::oneshot::channel(); | ||
|
|
||
| self.sender | ||
| .send(P2pRpcRequest::Peers { out: tx, connected: true }) | ||
| .await | ||
| .map_err(|_| ErrorObject::from(ErrorCode::InternalError))?; | ||
|
|
||
| let peers = rx.await.map_err(|_| { | ||
| ErrorObject::borrowed(ErrorCode::InternalError.code(), "Failed to get peers", None) | ||
| })?; | ||
|
|
||
| Ok::<bool, ErrorObject<'_>>(!peers.peers.contains_key(&peer_id.to_string())) | ||
| }; | ||
|
|
||
| if !is_not_connected |
There was a problem hiding this comment.
The closure creates a new oneshot channel on every retry attempt. Consider moving the peer checking logic to a separate method to avoid repeated channel creation overhead during retries.
| let is_not_connected = async || { | |
| let (tx, rx) = tokio::sync::oneshot::channel(); | |
| self.sender | |
| .send(P2pRpcRequest::Peers { out: tx, connected: true }) | |
| .await | |
| .map_err(|_| ErrorObject::from(ErrorCode::InternalError))?; | |
| let peers = rx.await.map_err(|_| { | |
| ErrorObject::borrowed(ErrorCode::InternalError.code(), "Failed to get peers", None) | |
| })?; | |
| Ok::<bool, ErrorObject<'_>>(!peers.peers.contains_key(&peer_id.to_string())) | |
| }; | |
| if !is_not_connected | |
| let peer_id_str = peer_id.to_string(); | |
| let is_not_connected = || async { | |
| self.is_peer_disconnected(&peer_id_str).await | |
| }; | |
| if !is_not_connected() |
| return Err(ErrorObject::borrowed( | ||
| ErrorCode::InvalidParams.code(), |
There was a problem hiding this comment.
The error code InvalidParams is misleading for a timeout scenario. This should use InternalError or a more appropriate error code since the parameters were valid but the operation timed out.
| return Err(ErrorObject::borrowed( | |
| ErrorCode::InvalidParams.code(), | |
| ErrorCode::InternalError.code(), |
Codecov Report❌ Patch coverage is
☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
…t rpc call (#2761) ## Description It seems the `opp2p_disconnect` rpc call returns too early. This method ensures that we're checking the peers are effectively not connected to each other in the `opp2p_peers` call.
…t rpc call (#2761) ## Description It seems the `opp2p_disconnect` rpc call returns too early. This method ensures that we're checking the peers are effectively not connected to each other in the `opp2p_peers` call.
41529ba to
ab3456c
Compare
…t rpc call (op-rs/kona#2761) ## Description It seems the `opp2p_disconnect` rpc call returns too early. This method ensures that we're checking the peers are effectively not connected to each other in the `opp2p_peers` call.
…t rpc call (op-rs/kona#2761) ## Description It seems the `opp2p_disconnect` rpc call returns too early. This method ensures that we're checking the peers are effectively not connected to each other in the `opp2p_peers` call.
Description
It seems the
opp2p_disconnectrpc call returns too early. This method ensures that we're checking the peers are effectively not connected to each other in theopp2p_peerscall.