Rework simulation pipeline for adaptive concurrency and connection resiliency. #648
Rework simulation pipeline for adaptive concurrency and connection resiliency. #648
Conversation
5fa23c2 to
26d735e
Compare
rajatarya
left a comment
There was a problem hiding this comment.
Code Review: Rework simulation pipeline for adaptive concurrency
Clean architectural improvement — replacing Docker-based simulation with an in-process Rust framework significantly improves developer ergonomics and CI portability. The separation of concerns (bandwidth proxy, server-side latency simulation, scenario orchestration) is well-designed. The CancellationToken parent/child pattern and RAII guards for connection tracking are done correctly. Comprehensive test coverage across unit and integration tests.
The deleted Docker scripts had hardcoded paths (/Users/hoytak/workspace/...), confirming they were developer-local tooling — nothing of value is lost.
Issues
unsafe impl Send/Sync for ServerState: unnecessary — should be derived automaticallyunsafe { std::env::set_var(...) }: unsound in multi-threaded context (Rust 1.83+)to_bytes(body, usize::MAX): no body size limit onset_confighandlerguard.take().expect(...)inrun_accept_loop: panics on double-call instead of returningResultNetworkSimulationProxyfields allpub(crate): internal state unnecessarily exposedpub fn http_client(): exposes internal HTTP client with middleware stack publiclyDirectAccessClient: 'static: breaking trait change, should be documentedbuf[..n].to_vec()in hot path: per-chunk allocation in bandwidth-limited copies- Test semantics changed:
test_session_resumeswitched from in-memory to ephemeral disk server
cas_client/src/simulation/network_simulation/bandwidth_limit_router.rs
Outdated
Show resolved
Hide resolved
| const LATENCY_PIPE_DEPTH: usize = 16; | ||
|
|
||
| /// Single struct for the network simulation proxy. Methods take `Arc<Self>` for use in async tasks. | ||
| pub struct NetworkSimulationProxy { |
There was a problem hiding this comment.
All fields of NetworkSimulationProxy are pub(crate), exposing internal state (semaphores, atomics, shutdown flags, listener) to the entire crate. These should be private — the struct already has proper public methods for all needed operations.
cas_client/src/simulation/network_simulation/bandwidth_limit_router.rs
Outdated
Show resolved
Hide resolved
|
Addressed the PR feedback -- should be better now. |
Code Review SummarySolid PR — replacing Docker-based simulation with pure Rust tooling is a clear win for developer experience and accessibility. The architecture is clean with good use of builder patterns, cancellation tokens, and separation of concerns. Given this is internal simulation tooling, most findings are minor style/hygiene items that can be deferred. Three items are worth addressing since they affect the tool's correctness or usability:
See inline comments for exact locations. |
rajatarya
left a comment
There was a problem hiding this comment.
Inline comments on items worth addressing.
cas_client/src/simulation/network_simulation/bandwidth_limit_router.rs
Outdated
Show resolved
Hide resolved
Merged main branch changes including DeletionControlableClient, SimulationControlClient, and simulation handler routes with the branch's LatencySimulation and LocalTestServerBuilder changes. Made-with: Cursor
|
All issues should be fixed. |
rajatarya
left a comment
There was a problem hiding this comment.
Sorry didn't see everything got addressed, thanks!
…-885) PR #648 refactored LocalTestServer to use a builder pattern. Update the shard upload timeout integration tests to use LocalTestServerBuilder instead of LocalTestServer::start(). Also call set_api_delay_range directly on the server instead of going through .client(). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This PR replaces the previous collection of scripts around setting up docker containers with a much more nimble and lightweight set of rust scripts and a simple, reusable proxy that can limit bandwidth and congestion simulations. The previous scripts are rewritten to be more nimble and use more reusable components.
New tools:
To run the new simulation scripts for the adaptive concurrency on upload, compile in release mode and run one of the scripts in
simulation/src/adaptive_concurrency/scripts/. Docker is no longer needed to run any of the simulations.The old
cas_client/tests/adaptive_concurrency/paths were removed.