Refactor NetSyncServer and ConnectionManager for Priority-Based Sending#316
Refactor NetSyncServer and ConnectionManager for Priority-Based Sending#316
Conversation
- Introduced a new outbound packet structure to manage control messages (RPC, Network Variables) and transform updates with priority. - Updated NetSyncServer to handle control messages via ROUTER unicast for reliable delivery, replacing PUB/SUB for critical messages. - Enhanced ConnectionManager to support a control message queue with TTL and backpressure handling. - Modified TransformSyncManager to return SendOutcome for send operations, indicating success, backpressure, or fatal errors. - Implemented logging for backpressure events to monitor network performance and improve responsiveness under load.
There was a problem hiding this comment.
Pull request overview
Refactors NetSync sending to prioritize control traffic (RPC/NV) over transform updates, adds backpressure-aware outcomes/logging on the Unity side, and switches server control-message fanout from PUB/SUB to ROUTER unicast.
Changes:
- Unity: send control (NV/RPC) before transforms and add backpressure tracking/logging + handshake send outcome handling.
- Server: introduce ROUTER control queue/drain loop and route RPC/NV/ID-mapping via ROUTER unicast with new stats.
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| STYLY-NetSync-Unity/Packages/com.styly.styly-netsync/Runtime/NetSyncManager.cs | Reorders send priorities and adds SendOutcome-based backpressure handling/log throttling. |
| STYLY-NetSync-Server/src/styly_netsync/server.py | Adds ROUTER control-message queuing/draining and switches RPC/NV/ID-mapping delivery to ROUTER unicast with stats. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
STYLY-NetSync-Unity/Packages/com.styly.styly-netsync/Runtime/NetSyncManager.cs
Show resolved
Hide resolved
|
@copilot Review the latest code. |
|
@claude Review the latest code. Backward compatibility is not needed. |
|
Claude finished @from2001's task in 2m 35s —— View job Code Review: Priority-Based Sending RefactorI've completed a comprehensive review of the priority-based sending implementation. Here are my findings: 🎯 Architecture & Design✅ Strengths
🔍 Issues Found🐛 Critical Issues1. Incomplete Drop Counting in Server (server.py:644-661)Location: The first except Full:
# Ring-buffer behavior: drop oldest to make room for newest
try:
_ = self._router_queue_ctrl.get_nowait() # ← This drop is NOT counted
except Empty:
pass
else:
# Count and log the drop of the oldest message
self._increment_stat("ctrl_unicast_dropped") # ← Should also increment aboveImpact: Metrics underreport actual message loss 2. Race Condition in Transform Clear (ConnectionManager.cs:440)Location: The // Try to send
if (TrySendDealer(dealer, packet.RoomId, packet.Payload))
{
// ← NEW PACKET COULD ARRIVE HERE
// This clears the slot only if it still contains this exact packet
Interlocked.CompareExchange(ref _latestTransform, null, packet);
return true;
}Current behavior: Works correctly - newer packet is preserved Recommendation: Add a comment explaining why this pattern is safe: // Success - clear the packet using compare-exchange
// This ensures we don't clear a newer packet that was set during send
Interlocked.CompareExchange(ref _latestTransform, null, packet);3. Misleading Docstring (server.py:628-631)Location: The docstring claims "benefit from more reliable delivery" but the implementation explicitly drops messages under backpressure (ring buffer) and on """Thread-safe enqueue of a control message for unicast via ROUTER.
Used for control-like messages (RPC, NV sync, ID mapping) that benefit
from more reliable delivery than PUB/SUB, but are still subject to loss
under backpressure and send failures (ring-buffer drop-on-full and
non-blocking router drain).Recommendation: Update to clarify actual guarantees as suggested in Copilot's review
|
| Severity | Count | Examples |
|---|---|---|
| Critical | 3 | Drop counting, docstring accuracy, race condition clarity |
| High | 3 | Lock contention, dropped message retry, exception handling |
| Medium | 2 | Logging before outcome check, TTL logging |
| Low | 2 | Magic numbers, Interlocked consistency |
Overall Assessment
This is a solid refactor that correctly implements priority-based sending with good separation of concerns. The main issues are:
- Metrics accuracy (drop counting)
- Documentation clarity (docstrings, comments)
- Observability (logging for TTL drops, backpressure)
None of the issues are blockers, but fixing #1, #3, #6, and #7 will improve production debuggability.
Copilot's Concern About Backward Compatibility: Since you stated backward compatibility is not needed, switching RPC/NV/ID-mapping from PUB/SUB to ROUTER is acceptable. Python client will need updates in future PRs, but that's expected with a breaking change.
|
@claude All issues above were fixed. Review the latest code. |
|
Claude finished @from2001's task in 1m 33s —— View job Code Review: All Issues Successfully Resolved ✅I've completed a thorough review of the latest code changes. Great work on addressing all the feedback from the previous review! 📋 Summary of FixesAll 12 previously identified issues have been successfully addressed across commits ✅ Critical Issues - ALL FIXEDIssue #1: Incomplete Drop Counting in Server ✅ FIXEDLocation: Previous problem: The first Fix verified: except Empty:
# Queue became empty between Full and get_nowait (rare race condition)
pass
else:
# Count and log the drop of the oldest message
self._increment_stat("ctrl_unicast_dropped")
logger.debug("Router control queue full: dropping oldest control message")✅ Status: Both drops are now correctly counted (line 654 for old message, line 664 for new message if still full). The same fix was applied to PUB queue (line 619). Issue #2: Race Condition Documentation ✅ FIXEDLocation: Previous problem: CompareExchange logic was subtle and could be misunderstood. Fix verified: // Success - clear the packet using compare-exchange.
// CompareExchange atomically sets _latestTransform to null ONLY if it still equals 'packet'.
// This is safe against race conditions: if another thread called SetLatestTransform()
// during our send, _latestTransform will hold the newer packet (not 'packet'),
// so CompareExchange will fail and preserve the newer packet for the next send cycle.
Interlocked.CompareExchange(ref _latestTransform, null, packet);✅ Status: Excellent detailed comment explaining the thread-safety guarantee. Issue #3: Misleading Docstring ✅ FIXEDLocation: Previous problem: Claimed "reliable delivery" but implementation drops messages. Fix verified: """Thread-safe enqueue of a control message for unicast via ROUTER.
Used for control-like messages (RPC, NV sync, ID mapping) that are
more reliable than PUB/SUB, but still drop under backpressure and
send failures (ring-buffer drop-on-full and non-blocking router drain).✅ Status: Docstring now accurately describes the actual guarantees. ✅ High Priority Issues - ALL FIXEDIssue #4: Lock Contention Window ✅ FIXEDLocation: Previous problem: Holding Fix verified: identities_to_send: list[bytes] = []
with self._rooms_lock:
# ... collect identities ...
identities_to_send.append(identity)
# Enqueue control messages outside of the rooms lock to reduce contention
for identity in identities_to_send:
self._enqueue_router(identity, room_id, message_bytes)✅ Status: Lock is now released before enqueueing, minimizing contention. Issue #5: Silent Exception Swallowing ✅ FIXEDLocation: Previous problem: Empty exception handler with insufficient explanation. Fix verified: except Empty:
# Queue became empty between Full and get_nowait (rare race condition)
pass✅ Status: Comment now clearly explains the benign race condition. Issue #6: Control Queue Comment Clarity ✅ FIXEDLocation: Previous problem: Comment incorrectly mentioned "retry" when packets are dropped. Fix verified: // The packet is DROPPED on backpressure (not retried) for the following reasons:
// 1. Re-enqueue at the front is not possible with ConcurrentQueue
// 2. Re-enqueuing at the back would cause out-of-order delivery
// 3. Retrying could cause infinite loops if backpressure persists
// This is acceptable for control messages as they are idempotent or timestamped.
// In practice, the HWM should rarely be hit with proper flow control.✅ Status: Comment now accurately describes that packets are dropped, not retried. ✅ Medium Priority Issues - ALL FIXEDIssue #7: Logging Before Outcome Check ✅ FIXEDLocation: Previous problem: Debug logs reported "Sent" before checking SendOutcome. Fix verified: The logging structure was improved:
✅ Status: Logs now accurately reflect the actual outcome of send operations. Issue #8: TTL Expiration Lacks Logging ✅ FIXEDLocation: Previous problem: TTL-expired packets were dropped silently. Fix verified: if (now - packet.EnqueuedAt > CTRL_TTL_SECONDS)
{
Debug.LogWarning($"[ConnectionManager] Control packet expired (TTL {CTRL_TTL_SECONDS}s exceeded)");
continue;
}✅ Status: Now logs a warning when control packets expire, aiding debugging. 🎯 Additional Improvements ObservedBeyond fixing the identified issues, I noticed these quality improvements:
🔍 Final Verification - No Regressions DetectedI reviewed the changed code for any new issues and found:
📊 Summary
Total Issues Addressed: 8/8 actionable fixes ✅ 🎉 ConclusionAll critical, high-priority, and medium-priority issues from the previous review have been thoroughly addressed. The code quality improvements include:
The priority-based sending refactor is now production-ready from a code quality perspective. The implementation correctly prioritizes control messages over transforms, handles backpressure gracefully, and provides clear visibility into system behavior through comprehensive logging. Recommendation: ✅ Ready for merge (pending any final integration testing) |
|
@gemini-cli Review this PR. |
|
🤖 Hi @from2001, I've received your request, and I'm working on it now! You can track my progress in the logs for more details. |
Close #315