Batch up and flatten diagnostic updates. #23377
CyrusNajmabadi wants to merge 20 commits into dotnet:master
Conversation
tagging @heejaechang @sharwell @jasonmalinowski
// Because we're batching, we can optimize things such that we only store two pieces
// of data per provider: specifically, whether they were removing all diagnostics, and
// the last diagnostics they wanted to create.
Note: it may be possible to optimize things even further, i.e. just store the last DiagnosticsUpdateArgs for a provider. However, I wasn't sure if that would end up breaking anything; it's not clear to me if there are circumstances where we depend on processing the 'remove' before the 'create'. So I wanted to make the safest change possible.
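The two-slot scheme described above can be sketched as follows. This is a hypothetical illustration (names and types are invented, not the Roslyn code), written in Python for brevity:

```python
# Hypothetical sketch (illustrative names, not the Roslyn implementation) of
# the two-slot batching described above: per provider, keep only the latest
# "remove all" event and the latest "create" event per document.

from dataclasses import dataclass
from typing import Optional

@dataclass
class Event:
    kind: str                        # "remove" or "create"
    document_id: Optional[int] = None

class BatchedUpdates:
    def __init__(self):
        self.remove_args: Optional[Event] = None  # last "remove all" event
        self.create_args: dict[int, Event] = {}   # doc id -> last create event

    def on_removed(self, event: Event) -> None:
        # A removal supersedes everything batched so far for this provider.
        self.remove_args = event
        self.create_args.clear()

    def on_created(self, event: Event) -> None:
        # A later create for the same document supersedes the earlier one.
        self.create_args[event.document_id] = event

    def flush(self) -> list[Event]:
        # Process the remove first, then the surviving creates, preserving
        # the "remove before create" ordering mentioned above.
        events: list[Event] = []
        if self.remove_args is not None:
            events.append(self.remove_args)
        events.extend(self.create_args.values())
        self.remove_args = None
        self.create_args.clear()
        return events
```

However many notifications arrive between flushes, at most one remove plus one create per document survive, which is the memory win the PR is after.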
e02bb7f to 9f3185a
sharwell left a comment
From the information I was able to determine through heap dumps (with notable limitations due to available tooling in that area), I believe batching represents a more direct solution to the original problem identified by MemoryWatson. In other words, I have more confidence in this approach than the approach in #22920 for the specific purpose of ensuring the UI thread processing queue does not grow larger than necessary. However, the current approach in this PR has a few downsides so I'd like to look into understanding their impact and considering alternatives that address the original and new concerns.
What I like:
- This pull request removes old (superseded) events from the processing queue if the UI thread has not already started handling them.
What I don't like:
- This pull request substitutes the use of IForegroundNotificationService for task scheduling. The service provides an advantage due to time-boxed incremental processing. In terms of the current pull request, this is roughly equivalent to having a Yield() in the inner loop of OnDiagnosticsUpdatedOnForeground.
What I think we can improve on in the future:
- I'd like to find a way to remove the Task.Delay(50). We want batching to occur when items are building in the queue, but I don't see a particular advantage to inserting a processing delay to "help" this.
batchedUpdates.removeArgs = e;

// Clear all creates for this provider.
batchedUpdates.createArgs.Clear();
❗️ This logic isn't correct¹: DiagnosticsRemoved can be specific to a particular project or document. We need to do two things here:
- We need to modify data structures as necessary to implement the equivalent of this: batchedUpdates.removeArgs.UnionWith(e);
- We need an efficient way to only clear items from createArgs that overlap with e (and I mean e as opposed to removeArgs).

¹ It's possible the logic is correct based on information which was not obvious to me at the time of review.
You're right. While we do remove all diagnostics when we get a DiagnosticsRemoved request, we should only do this for a removal request that matches our document id. I need to ensure the latter.
It's possible the logic is correct based on information which was not obvious to me at the time of review.
Nope. You're correct. I missed something. Removal notifications for other documents would cause me to remove the diagnostics for our document.
// We're creating diagnostics. This supersedes all existing creations for
// this id. However, any existing removal for this id will still happen.
var document = e.Solution.GetDocument(e.DocumentId);
batchedUpdates.createArgs[document] = e;
❓ Why not use DocumentId as the key here? The use of Document means entire solutions could end up rooted in the work queue.
It's rooted no matter what; e holds onto a Solution. So Document was more natural here, without changing what is rooted.
This is true. However, due to the work I've done here, that time-boxed incremental processing is almost certainly unnecessary. Due to the collapsing of work that I've done, when we wake up we will have at most two actual pieces of work to do: the removal of diagnostics and the creation of diagnostics. This is because we are attached to a single document, and so the majority of notifications we hear about will simply be thrown away immediately (in a matter of microseconds) when we compare our workspace+docId against the workspace+docId in the event. Because we only have two pieces of work to do, doing work incrementally isn't really necessary, as no single piece of work should be substantial enough to warrant yielding.
Note: the FGNotificationService has a 50ms delay between processing batches of items. I was preserving that here. Note that I'd like at least some delay that balances not being noticeable while still likely catching a huge burst of activity from the diagnostics service. 50ms seems to fit the bill, and was not really a problem in the old system.
@sharwell ready for another pass.
Ah no. That's not quite accurate. We have two pieces of work to do per provider that reported events for this document. @heejaechang How many different providers are there? Is there a provider per analyzer (so unbounded in length)? Or is it more like just a small handful of providers (i.e. one for the compiler, one for all analyzers, etc.)? Thanks!
At sharwell, I can add the "time-boxed incremental processing" back in if necessary. Though I'm skeptical it will be necessary. Note that very little happens when we actually do processing. All we basically do is go through the list of work (throwing out things unrelated to our document). For work items that apply, all that happens is that the underlying AbstractDiagnosticsTaggerProvider.TaggerProvider simply stores the data and then notifies the TagSource that it changed. The TagSource then in turn does nothing but register a callback to process those tags at some delay in the future (the delay is determined by the type of the diagnostics tag). So, basically, we do very little work on the UI thread.
Lol. @sharwell not "at sharwell" :D
{
    var token = _owner._listener.BeginAsyncOperation(GetType().Name + "OnDiagnosticsUpdatedOnBackground");

    _updateTask = Task.Delay(50).ContinueWith(
What cancels the previous update task if the same document changed twice?
I see, so this is actually batching by updating every 50ms.
What cancels the previous update task if the same document changed twice?
No need to cancel. When the task actually executes, it will handle all the notifications that came in between now and then.
Yep. I thought you were cancelling the previous task and enqueuing a new one so that it gets delayed again, but this is just putting a 50ms buffer on each update.
This still doesn't cancel the update on tagger dispose, though.
That's true. But worst case, that means we just process one more request within 50 ms after tagger dispose. This request will see that _disposed is set and will immediately bail. Basically, we have to ask ourselves how important it is to cancel a single request.
{
    this.AssertIsBackground();
    RegisterNotification(() => OnDiagnosticsUpdatedOnForeground(e));
    if (_disposed)
Shouldn't this be under a lock?
No need for a lock here. At worst, we miss some write of 'true' to _disposed. But then the UI request that executes will immediately bail.
Note: even under a lock we could 'still' miss things, as this field is always racy if accessed from different threads. The only important thing is that UI calls respect it and don't do work if it is set to true. The BG just uses it when it can to avoid unnecessary work, but it's OK to miss.
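The benign-race pattern described above can be sketched as follows. This is a hypothetical illustration (class and method names are invented, not the Roslyn tagger):

```python
# Hypothetical sketch of the benign race described above: the BG side reads
# _disposed opportunistically (no lock), and the UI-side callback re-checks
# it, so a missed write only costs one harmless extra callback that bails.

class TagSourceSketch:
    def __init__(self, schedule_on_ui):
        self._disposed = False
        self._schedule_on_ui = schedule_on_ui  # e.g. posts to a UI queue

    def dispose(self):
        self._disposed = True  # unlocked write; stale reads are tolerated

    def on_diagnostics_updated_on_background(self, event, handler):
        if self._disposed:
            return  # best-effort early out; missing this is fine
        self._schedule_on_ui(lambda: self._on_foreground(event, handler))

    def _on_foreground(self, event, handler):
        if self._disposed:  # the authoritative check on the UI side
            return
        handler(event)
```

The correctness argument is exactly the one in the comment: only the UI-side check matters; the BG-side check is a pure optimization, so neither read needs a lock.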
// care about.
var ourDocument = _subjectBuffer.AsTextContainer().GetOpenDocumentInCurrentContext();
var ourDocumentId = ourDocument?.Id;
if (ourDocumentId != _currentDocumentId)
I would prefer doing this in the BG by listening to the proper workspace events for buffer/workspace change notifications. If those events are broken, we should fix those events.
We could do that as a later change. It will make things more complex, as we'll have to serialize out our BG requests and filter them against the requests to update _currentDocId. I'm OK doing that as part of a follow-up PR.
Note: can you tell me precisely which Workspace/buffer changes I would need to listen to?
The ActiveContextChange and WorkspaceChanged events.
@heejaechang Can you address this part: #23377 (comment)
Each analyzer can have 3 different events (1 syntax, 1 semantic, 1 semantic on other files), so in the current design, 1 analyzer can have 3 taggers. In reality, except for the compiler analyzer, which does all 3, most user analyzers produce only 1 kind of diagnostic (syntax, semantic, or semantic on other files, e.g. a compilation-end analyzer), so we can assume 1 tagger per analyzer most of the time.
Hrmm, that doesn't really address what I'm looking for. Let me rephrase. DiagnosticUpdateArgs has an 'object Id' property in it. Normally, how many unique different Ids will there be? A fixed handful? Or is it unbounded and could be hundreds or more? I.e. are there unique Ids per user analyzer in the system, some for Add-Using, some for Generate-Method, etc.? Or is it more like just a couple of Ids for compiler diagnostics, just a couple for all analyzers combined, etc.? I need to know this to determine whether we actually need some sort of yielding approach in the OnDiagnosticsUpdatedOnForeground loop, or if we can leave it as is. If the number of DiagnosticUpdateArgs.Ids is low/bounded, we can keep it as is. Otherwise, we may need to have this yield, and move to processing the remainder of the work in a later timeslice. Or I can move this to the BG. I'll try that in another PR. Thanks!
That's not answering my question :) My question is: around how many unique 'Ids' are there in a normal project? Is it a handful? Or is it unbounded?
No, that's not true. I have filtering on the UI thread in this PR without the need for excessive lambda allocations. So there is no reason that UI-thread filtering means excess allocations.
The object id is unique per analyzer, per kind, per document/project, and per vsix/nuget the analyzer came from. There will be at least as many object ids as the number of analyzers, documents, and projects in the solution.
OK. So it's effectively unbounded. You're telling me that in something like Roslyn, there would be thousands of unique Ids, yes?
I don't know what that means. Are there a lot of analyzers? Or are there a fixed number? For example, do you have one analyzer that handles things like Add-Using/Generate-Method, or is it an analyzer per each of these?
You can remove the excessive lambda allocations by filtering on the BG, since it is the tagger that creates and enqueues that many lambdas when it doesn't need to.
And I can remove it on the FG.
Fixed, as per this PR.
We don't know how many analyzers there will be; it is the diagnostic analyzer framework we have been building. Users can install as many analyzers as they wish; we do not put a limit on how many analyzers they can install into VS or a solution. OK, so at most, the number of object Ids you can get is: number of diagnostic analyzers installed (vsix or nuget) * 3 kinds (syntax, semantic, semantic on other files) * number of (documents + projects in the solution) * number of (analyzer vsix * nugets installed).
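To make the scale of that bound concrete, here is a quick worked example with made-up numbers, using just the first three factors of the formula above:

```python
# Illustrative arithmetic for the bound described above, with invented
# numbers and only the first three factors of the formula: even a modest
# solution yields far more unique Ids than a "fixed handful".

analyzers = 50       # installed diagnostic analyzers (vsix or nuget)
kinds = 3            # syntax, semantic, semantic-on-other-files
documents = 1000
projects = 100

max_unique_ids = analyzers * kinds * (documents + projects)
print(max_unique_ids)  # 165000
```

This is why the Id count is "effectively unbounded" for the purposes of the yielding discussion in this thread.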
Sure, you can remove it on the FG, but once you do filtering on the BG, you don't need to do it on the FG, since there won't be as many FG requests as before... so you don't need dictionaries to hold onto batched requests, or the delay, etc.
We wanted to have 1 tagger per analyzer, but the current editor tagger doesn't allow dynamic tagger insertion/deletion once the view is created. So we made this aggregating-tagger concept and created a nested tagger per analyzer, I believe, which we can dynamically add and remove, while we give this aggregated tagger to the editor. I am not 100% sure whether that 1 tagger per analyzer deals with all the different events from that analyzer, or whether we have 1 tagger per each different type of analyzer event (aka object id).
I have a follow-up PR here that moves filtering to the BG as well: #23409
Superseded by #23409.
We have one per Id. So it's whatever granularity the diagnostics service creates Ids for. We can't do any better on our end, as that's all the information we're given from DiagnosticUpdateArgs.
@CyrusNajmabadi I'm more confident with #23409, and would prefer to focus the validation on that change.
@sharwell That's fine. I just wanted to give you guys options depending on your risk aversion.
cf3d1d1 to 84d6ed9
Closing in favor of: #23448 |
In response to #22920 (comment)
This is a tweak to the diagnostic taggers to improve how they respond to a flood of notifications from the underlying diagnostic services. There are a few improvements as part of this PR.