Fix incorrect task chaining in InlineRenameSession#34254
Fix incorrect task chaining in InlineRenameSession#34254sharwell merged 9 commits intodotnet:masterfrom
Conversation
There was a problem hiding this comment.
📝 QueueApplyReplacements is chained on _allRenameLocationsTask, but depends on data produced by RaiseSessionSpansUpdated. If the tasks are reordered (e.g. if a Task.Delay is inserted above before RaiseSessionSpansUpdated), the data will be processed incorrectly and lead to a failure.
There was a problem hiding this comment.
i'm not understanding. if QueueApplyReplacements is chained on _allRenameLocationsTask, then shouldn't it only run when _allRenameLocationsTask completes? And wouldn't htat only complete once that lambda completes? Which only finishes after RaiseSessionSpansUpdated runs and finishes? Even if something caused a delay before/after RaiseSessionSpansUpdated what would change?
Can you clarify your explanation?
There was a problem hiding this comment.
And wouldn't htat only complete once that lambda completes?
You are confusing _allRenameLocationsTask (the field) with allRenameLocationsTask (the local variable). 😉
There was a problem hiding this comment.
oh boy. i'll reread, but that certainly is a confusing way to name things.
jasonmalinowski
left a comment
There was a problem hiding this comment.
There is a deadlock introduced by this approach.
Pull request changed since original problem.
QueueApplyReplacements is chained on _allRenameLocationsTask, but depends on data produced by RaiseSessionSpansUpdated. If the tasks are reordered (e.g. if a Task.Delay is inserted above before RaiseSessionSpansUpdated), the data will be processed incorrectly and lead to a failure. Fixes dotnet#34221
|
@dotnet/roslyn-ide this could help with test reliability. Could we get this reviewed? |
JoeRobich
left a comment
There was a problem hiding this comment.
Looks good to me, but I can't speak for the JTF usage.
jasonmalinowski
left a comment
There was a problem hiding this comment.
So I agree this does fix the task chaining problem, but I regret it doesn't actually simplify all the crazy state in the first place. If nothing else, can we add a comment or two at the points where we're updating the chain to describe why it's important to have the ordering? I'm happy to sign off if this is done.
I'm also worried about the perf impact of this: previously, it was OK for you to start an inline rename, type something, and hit enter really fast, the task chain is directly computing the original locations, then the conflict resolution, and then will go and apply that. This is now sticking in the middle of that the UI update for something the user won't ever actually see. I don't know how often that happens of course.
This comment has been minimized.
This comment has been minimized.
a1a9139 to
578e20e
Compare
@jasonmalinowski This is now done. I also corrected an incorrect cancellation token use, with documentation to clarify why |
|
@dotnet/roslyn-ide for review. |
jasonmalinowski
left a comment
There was a problem hiding this comment.
Not understanding why we're trying to maintain the LazyCancellation behavior -- it seems like it's making the conversion more complicated than necessary...
| UpdateReferenceLocationsTask(ThreadingContext.JoinableTaskFactory.RunAsync(async () => | ||
| { | ||
| // Join prior work before proceeding, since it performs a required state update. | ||
| // https://github.com/dotnet/roslyn/pull/34254#discussion_r267024593 |
There was a problem hiding this comment.
Not sure if it's worth inlining the explanation rather than relying on GitHub not breaking a URL like that....
| // | ||
| // The cancellation token is passed to the prior work when it starts, not when it's joined. This is | ||
| // the equivalent of TaskContinuationOptions.LazyCancellation. | ||
| await _allRenameLocationsTask.JoinAsync(CancellationToken.None).ConfigureAwait(false); |
There was a problem hiding this comment.
Why wouldn't this just pass the cancellationToken? I think you're trying to preserve behavior, but do we need to?
(I never liked that LazyCancellation was forced on everybody using SafeContinueWith who didn't perhaps need it because it means we would be potentially cancelling things a bit later.)
There was a problem hiding this comment.
A behavior change here requires proof that the newly-allowed concurrency would not result in state corruption.
There was a problem hiding this comment.
FindRenameLocationsAsync already can be running multiple times in parallel if for example options were changed in rapid succession multiple times. Either it's already safe for concurrency (and it's not a concern) or it's not, and then this is still bugged.
There was a problem hiding this comment.
I was not able to prove that this one would behave correctly with lazy cancellation. With that said, this cancellation token only gets cancelled when the rename session is cancelled or committed, so it won't be impacting typing performance. If/when it is cancelled, the remaining tasks will quickly cancel themselves after the currently executing task finishes (by cancelling).
There was a problem hiding this comment.
With that said, this cancellation token only gets cancelled when the rename session is cancelled or committed, so it won't be impacting typing performance.
Ah, maybe this is why I was confused earlier: this code has more problems than what you're trying to fix. 😄 If the options are changed, we should have entirely cancelled off the old work because there's no reason for that to continue running. What this should be doing (loosely) is cancelling _cancellationTokenSource and creating a new one before it calls UpdateReferenceLocationsTask. It appears this predates the open sourcing of this so I don't know why this ended up this way.
Want to file a bug? It's not that hard to fix but requires some care and I don't blame you for not wanting to fix it in this PR. (Although you're welcome to!)
|
@jasonmalinowski to re-review the final updates |
jasonmalinowski
left a comment
There was a problem hiding this comment.
Looks good but I think I understand some of our confusion earlier: there's an (existing) perf issue where an option change should be cancelling off the old reference location search but isn't. Please file a bug for that while closing this, as that's a very not good bug.
| await ThreadingContext.JoinableTaskFactory.SwitchToMainThreadAsync(alwaysYield: true, _cancellationTokenSource.Token); | ||
| _cancellationTokenSource.Token.ThrowIfCancellationRequested(); | ||
|
|
||
| RaiseSessionSpansUpdated(inlineRenameLocations.Locations.ToImmutableArray()); |
There was a problem hiding this comment.
Can we get a comment here saying it's unfortunate that _allRenameLocationsTask now requires the UI thread before it can let continuations run? Don't get me wrong: this is correctly expressing how it has to work right now, but it's still not how it was supposed to work.
| // | ||
| // The cancellation token is passed to the prior work when it starts, not when it's joined. This is | ||
| // the equivalent of TaskContinuationOptions.LazyCancellation. | ||
| await _allRenameLocationsTask.JoinAsync(CancellationToken.None).ConfigureAwait(false); |
There was a problem hiding this comment.
With that said, this cancellation token only gets cancelled when the rename session is cancelled or committed, so it won't be impacting typing performance.
Ah, maybe this is why I was confused earlier: this code has more problems than what you're trying to fix. 😄 If the options are changed, we should have entirely cancelled off the old work because there's no reason for that to continue running. What this should be doing (loosely) is cancelling _cancellationTokenSource and creating a new one before it calls UpdateReferenceLocationsTask. It appears this predates the open sourcing of this so I don't know why this ended up this way.
Want to file a bug? It's not that hard to fix but requires some care and I don't blame you for not wanting to fix it in this PR. (Although you're welcome to!)
Fixes #29483