refactor: stop EndpointDriver once endpoint is closed and all connections are drained (#426)
Conversation
Documentation for this PR has been generated and is available at: https://n0-computer.github.io/quinn/pr/426/docs/iroh_quinn/ Last updated: 2026-02-13T11:32:39Z
Codecov Report

✅ All modified and coverable lines are covered by tests.

Coverage Diff:

| | main | #426 | +/- |
|---|---|---|---|
| Coverage | 77.35% | 77.30% | -0.05% |
| Files | 81 | 81 | |
| Lines | 23351 | 23356 | +5 |
| Hits | 18062 | 18056 | -6 |
| Misses | 5289 | 5300 | +11 |

☔ View full report in Codecov by Sentry.
Performance Comparison Report
| Scenario | iroh-quinn | upstream | Delta | CPU (avg/max) |
|---|---|---|---|---|
| large-single | 5340.6 Mbps | 7885.1 Mbps | -32.3% | 95.4% / 105.0% |
| medium-concurrent | 5355.6 Mbps | 7945.1 Mbps | -32.6% | 93.8% / 104.0% |
| medium-single | 4126.0 Mbps | 4749.8 Mbps | -13.1% | 97.5% / 158.0% |
| small-concurrent | 3866.4 Mbps | 4951.2 Mbps | -21.9% | 99.3% / 161.0% |
| small-single | 3558.9 Mbps | 4839.4 Mbps | -26.5% | 95.2% / 108.0% |
Netsim Benchmarks (network simulation)
| Condition | iroh-quinn | upstream | Delta |
|---|---|---|---|
| ideal | 3049.1 Mbps | 3656.1 Mbps | -16.6% |
| lan | 782.5 Mbps | 796.4 Mbps | -1.7% |
| lossy | 69.8 Mbps | 69.8 Mbps | ~0% |
| wan | 83.8 Mbps | 83.8 Mbps | ~0% |
Summary
iroh-quinn is 25.0% slower on average
272f7e258c2627f9582eb5937de0b9d2a7db1e8c - artifacts
Raw Benchmarks (localhost)
| Scenario | iroh-quinn | upstream | Delta | CPU (avg/max) |
|---|---|---|---|---|
| large-single | 5599.6 Mbps | 7933.6 Mbps | -29.4% | 95.8% / 106.0% |
| medium-concurrent | 5415.6 Mbps | 7921.6 Mbps | -31.6% | 95.6% / 105.0% |
| medium-single | 3921.9 Mbps | 4749.1 Mbps | -17.4% | 90.0% / 97.6% |
| small-concurrent | 3859.7 Mbps | 5203.4 Mbps | -25.8% | 95.1% / 106.0% |
| small-single | 3461.4 Mbps | 4731.8 Mbps | -26.8% | 98.2% / 161.0% |
Summary
iroh-quinn is 27.1% slower on average
824f1b609253069fcb377d2befb4ce7134948566 - artifacts
Raw Benchmarks (localhost)
| Scenario | iroh-quinn | upstream | Delta | CPU (avg/max) |
|---|---|---|---|---|
| large-single | 5386.9 Mbps | N/A | N/A | 88.1% / 95.3% |
| medium-concurrent | 5402.9 Mbps | N/A | N/A | 92.0% / 97.2% |
| medium-single | 3807.8 Mbps | N/A | N/A | 94.3% / 120.0% |
| small-concurrent | 3816.9 Mbps | N/A | N/A | 86.1% / 97.0% |
| small-single | 3343.5 Mbps | N/A | N/A | 89.7% / 97.6% |
Netsim Benchmarks (network simulation)
| Condition | iroh-quinn | upstream | Delta |
|---|---|---|---|
| ideal | 2963.2 Mbps | N/A | N/A |
| lan | 782.5 Mbps | N/A | N/A |
| wan | 83.8 Mbps | N/A | N/A |
b693e9b0cc1cca4bd6b558993410d3d33eda6aaa - artifacts
Raw Benchmarks (localhost)
| Scenario | iroh-quinn | upstream | Delta | CPU (avg/max) |
|---|---|---|---|---|
| large-single | 5367.2 Mbps | 7971.4 Mbps | -32.7% | 90.5% / 97.0% |
| medium-concurrent | 5165.2 Mbps | 7802.9 Mbps | -33.8% | 90.0% / 96.2% |
| medium-single | 4044.4 Mbps | 4749.0 Mbps | -14.8% | 94.5% / 123.0% |
| small-concurrent | 3773.1 Mbps | 5274.9 Mbps | -28.5% | 87.8% / 97.4% |
| small-single | 3438.6 Mbps | 4824.3 Mbps | -28.7% | 94.0% / 126.0% |
Netsim Benchmarks (network simulation)
| Condition | iroh-quinn | upstream | Delta |
|---|---|---|---|
| ideal | 2947.0 Mbps | N/A | N/A |
| lan | 782.4 Mbps | N/A | N/A |
| lossy | 69.8 Mbps | N/A | N/A |
| wan | 83.8 Mbps | N/A | N/A |
Summary
iroh-quinn is 28.8% slower on average
flub left a comment:

Seems reasonable to me. IIUC, new incoming datagrams now remain queued in the kernel's buffers for the socket instead of being read into our memory while the driver is still alive. This is probably desirable.
divagant-martian left a comment:

LGTM. Excited to finally be able to modify this behaviour.
## Description

Based on #3879. Depends on n0-computer/noq#426.

With n0-computer/noq#426 the `quinn::EndpointDriver` stops once the `quinn::Endpoint` is closed and all connections are drained. Thus we don't need the `RwLock` around the `quinn::Endpoint` anymore. See [this thread](#3879 (comment)) for details.

* Add `Endpoint::closed`, which returns a `CancellationToken` that is cancelled once the endpoint is closing
* Document watcher behavior with regards to closing the endpoint
* Fix test

## Notes & open questions

* Instead of returning a `CancellationToken` we could make this an `async fn`; however, to me it looks quite useful to get a `CancellationToken` and be able to use `tokio::spawn(endpoint_closed.run_until_cancelled_owned(async move { .. }))`
* The cancellation token is cancelled when `Endpoint::close` *starts*, not when it *ends*. I wasn't sure what the expected behavior is here. I opted for start because this guarantees that the endpoint is fully usable until then.
## Description

Currently, the `quinn::EndpointDriver` is only stopped once all `Endpoint` structs are dropped. This changes it such that if `Endpoint::close` has been called and all connections are drained, the endpoint driver stops.

I think the change is fine; upon some investigation I couldn't find anything that would change in behavior, apart from the driver potentially polling less often for new `Incoming`s. However, the notification for new `Incoming`s in the `Accept` future is already only registered if the endpoint is not closed. So I don't think this actually changes anything.

This change will make our life in iroh much easier, see n0-computer/iroh#3879: we no longer have to ensure that the `quinn::Endpoint` is dropped to wait for the driver task to stop.

## Breaking Changes

## Notes & open questions