Fix a shutdown race condition in ControlCore#18632
Conversation
5302525 to
45d2519
Compare
| _outputRevoker = wrappedConnection.TerminalOutput(winrt::auto_revoke, { this, &DebugTapConnection::_OutputHandler }); | ||
| _stateChangedRevoker = wrappedConnection.StateChanged(winrt::auto_revoke, [this](auto&& /*s*/, auto&& /*e*/) { | ||
| StateChanged.raise(*this, nullptr); | ||
| _outputRevoker = wrappedConnection.TerminalOutput(winrt::auto_revoke, { get_weak(), &DebugTapConnection::_OutputHandler }); |
There was a problem hiding this comment.
get_weak() to ensure pending calls during revocation can complete without this being deallocated.
There was a problem hiding this comment.
does this by chance fix the issue where closing a connection while the debug tap is on it crashes terminal
There was a problem hiding this comment.
I wasn't aware we had such an issue. It's quite likely that this fixes it.
| _connectionStateChangedRevoker = newConnection.StateChanged(winrt::auto_revoke, [this](auto&& /*s*/, auto&& /*v*/) { | ||
| ConnectionStateChanged.raise(*this, nullptr); | ||
| }); | ||
| _connectionStateChangedRevoker = newConnection.StateChanged(winrt::auto_revoke, { get_weak(), &ControlCore::_connectionStateChangedHandler }); |
There was a problem hiding this comment.
get_weak() to ensure pending calls during revocation can complete without this being deallocated. (Same below.)
There was a problem hiding this comment.
Fuck. I just realized this can't work because it allows this to be destructed on a background thread which breaks WinUI. Got to use a lock or something instead.
| // TODO: This manual event revoking doesn't make much sense. | ||
| // We could just drop the old connection. Why have all that Close() stuff? | ||
| // It also shouldn't need to be exposed to the outside. I suspect we can only | ||
| // improve this though, once drag/drop of tabs doesn't use "startup actions" anymore. |
There was a problem hiding this comment.
(For future code archealogists.)
| // First create the render thread. | ||
| // Then stash a local pointer to the render thread so we can initialize it and enable it | ||
| // to paint itself *after* we hand off its ownership to the renderer. | ||
| // We split up construction and initialization of the render thread object this way | ||
| // because the renderer and render thread have circular references to each other. | ||
| auto renderThread = std::make_unique<::Microsoft::Console::Render::RenderThread>(); | ||
| auto* const localPointerToThread = renderThread.get(); |
There was a problem hiding this comment.
This has always been a thorn in my eye, so I fixed it in this PR by moving the RenderThread into the Renderer. No more ownership issues.
|
|
||
| _renderer.reset(); | ||
| _renderEngine.reset(); |
There was a problem hiding this comment.
I removed this, because I fixed the member order of the class.
src/renderer/base/renderer.cpp
Outdated
| { | ||
| AddRenderEngine(rgpEngines[i]); | ||
| } | ||
| THROW_IF_FAILED(_pThread->Initialize(this)); |
There was a problem hiding this comment.
I always forget... is this guaranteed to be usable in the constructor? is Initialize safe for this if this is partially constructed?
There was a problem hiding this comment.
this is a valid pointer in the constructor, it's just that the members aren't necessarily fully initialized yet.
|
I know the tests probably interact with RenderThread in weird ways. |
dc45cbb to
8fe4350
Compare
DHowett
left a comment
There was a problem hiding this comment.
Okay. I think I need you to explain how moving from the current architecture where we have "is the thread allowed to spin" and "does the thread need to actually produce a frame" is OK, and what implications all of this simplification should have on the application. I can see that there's real issues fixed here, but it is a big change in a touchy area.
| void ControlCore::_sendInputToConnection(std::wstring_view wstr) | ||
| { | ||
| _connection.WriteInput(winrt_wstring_to_array_view(wstr)); | ||
| if (_connection) |
There was a problem hiding this comment.
does [[likely]] make a difference here? ergo, is input slower because of the branch? is that too much of a micro-optimization?
There was a problem hiding this comment.
No, it won't. likely/unlikely rarely make a difference, but if they do it's primarily in tight code.
|
|
||
| // ↑ This one is tricky - all of these are raw pointers: | ||
| // 1. _terminal depends on _renderer (for invalidations) | ||
| // 2. _renderer depends on _terminal (for IRenderData) |
There was a problem hiding this comment.
the comments below say 3 3 and 1. there is no 2.
| if (--tries == 0) | ||
| { | ||
| // Stop trying. | ||
| _pThread->DisablePainting(); |
There was a problem hiding this comment.
It's way safer if this function simply signals the end of painting to the renderer via a return value. This reminds me that I didn't test this, shame on me.
|
I have good news: this PR fixes another bug where restarting a connection would fail to terminate the previous one. (It would leak the connection by not calling Close on it, since it is unfortunately self-referential.) Regarding why all this is necessary: I realized during development that the entire logic around |
I found multiple issues while investigating this: * Render thread shutdown is racy, because it doesn't actually stop the render thread. * Lifetime management in `ControlCore` failed to account for the circular dependency of render thread --> renderer --> render data --> terminal --> renderer --> render thread. Fixed by reordering the `ControlCore` members to ensure their correct destruction. * Ensured that the connection setter calls close on the previous connection. (Hopefully) Closes #18598 ## Validation Steps Performed * Can't repro the original failure ❌ * Opening and closing tabs as fast as possible doesn't crash anymore ✅ * Detaching and reattaching a tab producing continuous output ✅ (cherry picked from commit 70f85a4) Service-Card-Id: PVTI_lADOAF3p4s4AmhmQzgXhFbo Service-Version: 1.22
I found multiple issues while investigating this: * Render thread shutdown is racy, because it doesn't actually stop the render thread. * Lifetime management in `ControlCore` failed to account for the circular dependency of render thread --> renderer --> render data --> terminal --> renderer --> render thread. Fixed by reordering the `ControlCore` members to ensure their correct destruction. * Ensured that the connection setter calls close on the previous connection. (Hopefully) Closes #18598 ## Validation Steps Performed * Can't repro the original failure ❌ * Opening and closing tabs as fast as possible doesn't crash anymore ✅ * Detaching and reattaching a tab producing continuous output ✅ (cherry picked from commit 70f85a4) Service-Card-Id: PVTI_lADOAF3p4s4AxadtzgYdERw Service-Version: 1.23
I found multiple issues while investigating this:
ControlCorefailed to account for the circular dependency of render thread --> renderer --> render data --> terminal --> renderer --> render thread. Fixed by reordering theControlCoremembers to ensure their correct destruction.(Hopefully) Closes #18598
Validation Steps Performed