Socket: Linux: unblock synchronous socket operations on Dispose#37486
Socket: Linux: unblock synchronous socket operations on Dispose#37486davidsh merged 47 commits intodotnet:masterfrom
Conversation
src/System.Net.Sockets/src/System/Net/Sockets/SafeSocketHandle.Unix.cs
Outdated
Show resolved
Hide resolved
src/System.Net.Sockets/src/System/Net/Sockets/SafeSocketHandle.Unix.cs
Outdated
Show resolved
Hide resolved
src/System.Net.Sockets/src/System/Net/Sockets/SafeSocketHandle.Unix.cs
Outdated
Show resolved
Hide resolved
src/System.Net.Sockets/src/System/Net/Sockets/SafeSocketHandle.Windows.cs
Outdated
Show resolved
Hide resolved
|
Previously when @geoffkizer and I had talked about solving this general issue, we'd outlined a general plan as follows.
The goal here is to ensure that all pending and all future operations are canceled, and doing things in the "right" order such that we avoid race conditions where as we're executing this logic additional requests are being made. It looks like this PR is doing some of this, along with some specialized tweaks for Linux for 2a. Would it make sense to try to do more of this in this PR, or in soon-to-follow PRs? With all of that in place, I then want to see us do two more things:
|
src/System.Net.Sockets/src/System/Net/Sockets/SafeSocketHandle.cs
Outdated
Show resolved
Hide resolved
@stephentoub Thanks for the info, it makes sense. I'm focusing on figuring out what 2a should be for Linux and get some confidence it works well. We can then do the other parts in this or subsequent PRs. |
|
From CI results, on Windows: Dispose/Close with on-going operations causes an abortive close ('Connection reset by peer'). I was assuming this would be a FIN close, which is why the tests are failing on Windows. The abortive close can be generated on Linux by doing the |
This is a regression and it will be fixed by this patch: https://lists.openwall.net/netdev/2019/05/08/102 |
|
On Linux this is working well. |
|
/azp run |
|
Azure Pipelines successfully started running 4 pipeline(s). |
|
/azp run |
|
Azure Pipelines successfully started running 4 pipeline(s). |
|
@tmds Can you please fix up the merge conflicts? |
|
/azp run |
|
Azure Pipelines successfully started running 4 pipeline(s). |
|
CI failures:
|
|
@wtgodbe do you have cycles to take a look at: Most likely issue is that one of the streams (or something else using sockets) gets disposed sooner than test is prepared for. |
|
I can probably take a look either today or Monday. Assert failure is at |
|
thanks! possible it's unrelated 😄 |
|
@wfurt has done an analysis and it seems that this PR has unintentionally affected async Socket I/O. There are cases where it causes RST instead of FIN to be sent. A full write-up will be forthcoming. For now, we are going to revert this PR for now. |
|
We reverted this PR. See #38765 for analysis. See issue #38750 for additional feedback when redoing this PR. |
…et#37486) * Socket: Linux: unblock synchronous socket operations on Dispose * Fix windows build * TryUnblockSocket: always do an abortive close for TCP * TryUnblockSocket: experiment, always call Shutdown * SyncTcpReceiveSendGetsCancelledByDisposeOrClose: Don't check for abortive close * Fix build * Revert always shutdown to compare CI results * PR feedback * TcpReceiveSendGetsCanceledByDisposeOrClose: fix timing race * TcpReceiveSendGetsCanceledByDisposeOrClose: fix timing race II * Fix test * TcpReceiveSendGetsCanceledByDisposeOrClose: fix timing race III * TcpReceiveSendGetsCanceledByDisposeOrClose: only run on Sync * Also perform abortive close when canceling async send/receive * Fix aborted set to true * Check what fails on OSX with Shutdown * Revert "Check what fails on OSX with Shutdown" This reverts commit 5ab7fce. * Don't perform abortive close for operations that were already canceled * interrup Accept/Listen on OSX * Find out what Disconnectx does for different tests on CI machines * Move Linux connect AF_UNSPEC and OSX disconnectx into pal Disconnect function, update tests for OSX * Trigger CI * TcpReceiveSendGetsCanceledByDispose: check SocketErrorCode * TcpReceiveSendGetsCanceledByDispose: OSX is also performing an RST close for disconnectx * Revert "TcpReceiveSendGetsCanceledByDispose: OSX is also performing an RST close for disconnectx" This reverts commit a8d5b02. * TcpReceiveSendGetsCanceledByDispose: fix test for OSX * call CancelIoEx to kick out pending operations * Socket interop: replace IntPtr with SafeHandle * Windows: RST close when operations were canceled * Extend tests to check local SocketError * test: update expected SocketErrors * update tests: Apm API throws ObjectDisposedException instead of SocketException * Extend asserts * Update expected result for ConnectTask * Update tests based on Windows behavior (dotnet#37706) * Map SocketErrors in Socket.cs * Rewrite nullable asserts * Fix test for Mac * Handle SendFile SocketError * SyncSendFileGetsCanceledByDispose: change how we generate the large file * SyncSendFileGetsCanceledByDispose: ensure unique filename * Fix NETFX build * SyncSendFileGetsCanceledByDispose: skip on NetFramework * PR feedback * Fix broken build * Perform DangerousAddRef for Socket used with UpdateAcceptContext socket option
…se (dotnet#37486)" (dotnet#38764) This reverts commit 3fac6ad.
…et/corefx#37486) * Socket: Linux: unblock synchronous socket operations on Dispose * Fix windows build * TryUnblockSocket: always do an abortive close for TCP * TryUnblockSocket: experiment, always call Shutdown * SyncTcpReceiveSendGetsCancelledByDisposeOrClose: Don't check for abortive close * Fix build * Revert always shutdown to compare CI results * PR feedback * TcpReceiveSendGetsCanceledByDisposeOrClose: fix timing race * TcpReceiveSendGetsCanceledByDisposeOrClose: fix timing race II * Fix test * TcpReceiveSendGetsCanceledByDisposeOrClose: fix timing race III * TcpReceiveSendGetsCanceledByDisposeOrClose: only run on Sync * Also perform abortive close when canceling async send/receive * Fix aborted set to true * Check what fails on OSX with Shutdown * Revert "Check what fails on OSX with Shutdown" This reverts commit dotnet/corefx@5ab7fce. * Don't perform abortive close for operations that were already canceled * interrup Accept/Listen on OSX * Find out what Disconnectx does for different tests on CI machines * Move Linux connect AF_UNSPEC and OSX disconnectx into pal Disconnect function, update tests for OSX * Trigger CI * TcpReceiveSendGetsCanceledByDispose: check SocketErrorCode * TcpReceiveSendGetsCanceledByDispose: OSX is also performing an RST close for disconnectx * Revert "TcpReceiveSendGetsCanceledByDispose: OSX is also performing an RST close for disconnectx" This reverts commit dotnet/corefx@a8d5b02. * TcpReceiveSendGetsCanceledByDispose: fix test for OSX * call CancelIoEx to kick out pending operations * Socket interop: replace IntPtr with SafeHandle * Windows: RST close when operations were canceled * Extend tests to check local SocketError * test: update expected SocketErrors * update tests: Apm API throws ObjectDisposedException instead of SocketException * Extend asserts * Update expected result for ConnectTask * Update tests based on Windows behavior (dotnet/corefx#37706) * Map SocketErrors in Socket.cs * Rewrite nullable asserts * Fix test for Mac * Handle SendFile SocketError * SyncSendFileGetsCanceledByDispose: change how we generate the large file * SyncSendFileGetsCanceledByDispose: ensure unique filename * Fix NETFX build * SyncSendFileGetsCanceledByDispose: skip on NetFramework * PR feedback * Fix broken build * Perform DangerousAddRef for Socket used with UpdateAcceptContext socket option Commit migrated from dotnet/corefx@3fac6ad
…se (dotnet/corefx#37486)" (dotnet/corefx#38764) This reverts commit dotnet/corefx@3fac6ad. Commit migrated from dotnet/corefx@42ecf71
This makes Socket.Dispose on Linux more similar as Dispose on Windows: on-going synchronous operations are stopped. The implementation works on Linux, but isn't guaranteed to work on other Unixes.
based on #32087
Fixes #22564
Fixes #26034
CC @wfurt @stephentoub @geoffkizer @davidsh