Fix connection leaking on cancellation#26178
Conversation
| } | ||
|
|
||
| #if DEBUG | ||
| private string _creationCallStack; |
There was a problem hiding this comment.
Why only debug? still seems useful in release.
| if (!Environment.HasShutdownStarted) | ||
| { | ||
| Contract.Requires(false, $@"Should have been disposed!"); | ||
| Contract.Requires(false, $"Should have been disposed!\r\n {_creationCallStack}"); |
…be leaked not disposed. fixed the issue and added test
|
@jinujoseph can we bring this to 15.7? @dotnet/roslyn-ide @dotnet/roslyn-analysis can you take a look? |
| return null; | ||
| } | ||
|
|
||
| var scope = await GetPinnedScopeAsync(solution, cancellationToken).ConfigureAwait(false); |
There was a problem hiding this comment.
issue is this throwing cancellation and connection not get disposed.
|
retest windows_debug_vs-integration_prtest please |
|
LB issue is known one. I will take a look why sometimes LB doesn't show up. |
sharwell
left a comment
There was a problem hiding this comment.
Need to take a closer look tomorrow.
| var connection = new MyConnection(source); | ||
| try | ||
| { | ||
| await SessionWithSolution.CreateAsync(connection, solution, source.Token); |
There was a problem hiding this comment.
❕ Remove the try/catch here. Use Assert.ThrowsAsync if the code throws.
| // make sure things that should have been cleaned up are cleaned up | ||
| var service = (RemotableDataServiceFactory.Service)solution.Workspace.Services.GetService<IRemotableDataService>(); | ||
| Assert.Null(service.GetRemotableData_TestOnly(solutionChecksum, CancellationToken.None)); | ||
| Assert.True(connection.Disposed); |
There was a problem hiding this comment.
❓ why would this be true here? The ownership transfer isn't clear and the test method doesn't dispose of the object, but I'm also reviewing this on my phone so maybe I just need a better view.
There was a problem hiding this comment.
whole point of this test is if it throws connection get disposed correctly.
| if (!Environment.HasShutdownStarted) | ||
| { | ||
| Contract.Requires(false, $@"Should have been disposed!"); | ||
| #if DEBUG |
There was a problem hiding this comment.
📝 If this had wrapped the entire finalizer from the start, this bug would be much less serious.
There was a problem hiding this comment.
not sure what you meant.
| #if DEBUG | ||
| Contract.Fail($"Should have been disposed!\r\n {_creationCallStack}"); | ||
| #else | ||
| Contract.Fail($"Should have been disposed!"); |
There was a problem hiding this comment.
is there a reason these are not NFWs?
There was a problem hiding this comment.
Fail will crash VS.
| protected Connection() | ||
| { | ||
| #if DEBUG | ||
| _creationCallStack = Environment.StackTrace; |
There was a problem hiding this comment.
why only do this in debug? we create connections rarely, and then pool them, right? so this should be cheap enough to do even in release.
There was a problem hiding this comment.
no connection is created many times. and we see this issue again we can make this release later.
|
ping? can I get code review? |
|
@dotnet/roslyn-infrastructure https://ci.dot.net/job/dotnet_roslyn/job/dev15.7.x/job/windows_debug_spanish_unit32_prtest/158/ |
|
retest windows_debug_spanish_unit32_prtest please |
|
retest windows_release_vs-integration_prtest please |
sharwell
left a comment
There was a problem hiding this comment.
The direction here is fine, but there are a few more clean-up items to polish the test.
| _source.Cancel(); | ||
| _source.Token.ThrowIfCancellationRequested(); | ||
|
|
||
| return Task.CompletedTask; |
There was a problem hiding this comment.
💡
throw ExceptionUtilities.Unreachable;
| public override Workspace Workspace => _workspace; | ||
| } | ||
|
|
||
| private class MyConnection : RemoteHostClient.Connection |
There was a problem hiding this comment.
❕ Rename type to unambiguously indicate its test behavior (e.g. InvokeThrowsCancellationConnection).
| { | ||
| sessionWithSolution.Dispose(); | ||
| // make sure disposable objects are disposed when | ||
| // exceptions are thrown |
There was a problem hiding this comment.
💡 This comment is unnecessary and can be removed
There was a problem hiding this comment.
this tells what this test is trying to do.
|
|
||
| var source = new CancellationTokenSource(); | ||
| var connection = new MyConnection(source); | ||
| await Assert.ThrowsAnyAsync<OperationCanceledException>(() => SessionWithSolution.CreateAsync(connection, solution, source.Token)); |
There was a problem hiding this comment.
❕ This should invoke the entry point for the error RemoteHostClientExtensions.TryCreateSessionAsync rather than using the inner helper methods from SessionWithSolution.
There was a problem hiding this comment.
that sure why. the bug is not there. but in CreateAsync.
There was a problem hiding this comment.
Testing of TryCreateSessionAsync would ensure that the complete sequence leading to the bug is corrected:
- The connection is created
- Something happens
- The connection fails to be disposed
As-written, the test does not cover the first step of the scenario. I'll unblock because it may be substantially more work to write the more complete test.
There was a problem hiding this comment.
I believe I moved (2) down to CreateAsync. and nothing left in TryCreateSessionAsync. but if there is a bug, we still have Contract.Fail.
| } | ||
| } | ||
|
|
||
| [Fact, Trait(Traits.Feature, Traits.Features.RemoteHost)] |
There was a problem hiding this comment.
❗️ Missing [WorkItem(...)]
|
ping? need some code review! @jinujoseph |
|
retest windows_debug_vs-integration_prtest please |
|
|
||
| var source = new CancellationTokenSource(); | ||
| var connection = new InvokeThrowsCancellationConnection(source); | ||
| await Assert.ThrowsAnyAsync<OperationCanceledException>(() => SessionWithSolution.CreateAsync(connection, solution, source.Token)); |
There was a problem hiding this comment.
Nit: assert that the resulting exception (which is returned so you can assert with it) has OperationCanceledException.CancellationToken equal to source.CancellationToken. Since there's some catching/rethrowing going on we don't want to misassociate the cancellation source.
There was a problem hiding this comment.
is there Assert.ThrowXXX that let me get to the exception? let me take a look.
There was a problem hiding this comment.
is there Assert.ThrowXXX that let me get to the exception
Yes, they all return the exception instance.
|
@heejaechang Did you see these? |
|
@jinujoseph are we going to take this for 15.7? |
How is this occurring? I would expect the GC to reclaim the connections when they are no longer referenced. |
|
@heejaechang , lets target this for 15.8 |
probably. I didn't look through servicehub's stream implementation to see whether it is safe to let go without dispose and let finalizer to take care of disposing it. regardless, it is resource that is supposed to be disposed and I am not disposing it. so it is certainly leak. |
|
@jinujoseph do we have 15.8 branch? or master? |
|
15.8 - > master |
|
Just waiting on #26178 (comment) |
|
💡 @heejaechang It would be good to revise the Scenario and Performance Impact sections of the original template to more accurately explain the change. Specifically, this is not a performance problem caused by an increasing number of connections, but rather a reliability problem caused by an assertion that the connection is deterministically cleaned up when it is no longer in use. The simple fix of removing the [unnecessary] finalizer would address the reliability problem, but this goes a step further to address the underlying cause of non-deterministic release. |
|
since I am using external lib, I am not sure how I can guarantee I am not leaking anything when I dont dispose disposable resource. also, It will definitively not decrease ref count on shared connection. so it is not just about debug assert. |
|
@heejaechang @jinujoseph I'm OK with taking the code fix in 15.8, but if we're going to do that we should get this assert disabled in 15.7 pronto, as this is one of the top flaky issues right now. |
|
@jinujoseph should I create new PR that simply remove the debug assert for 15.7? |
|
yes , lets remove the assert from 15.7 |
Customer scenario
While users are using VS. Connections to servicehub can be leaked and number of connections get increased overtime.
Bugs this fixes
#26183
Workarounds, if any
there is no workaround
Risk
I don't see any risk
Performance impact
It should remove connection leaks.
Is this a regression from a previous update?
No
Root cause analysis
when we create certain type of connection, if cancellation happens at right time, we didn't dispose connection before exception escapes current call.
How was the bug found?
dogfooding and unit testing.