
Batch up and flatten diagnostic updates.#23377

Closed
CyrusNajmabadi wants to merge 20 commits into dotnet:master from CyrusNajmabadi:bulkProcessDiagnostics

Conversation

@CyrusNajmabadi
Contributor

In response to #22920 (comment)

This is a tweak to the diagnostic taggers to improve how they respond to a flood of notifications from the underlying diagnostic services. There are a few improvements as part of this PR.

  1. We batch up diagnostic notifications over a 50ms window and process them all at once, instead of enqueuing and processing each notification on the UI thread individually.
  2. Because we're batching, we can do a lot of preprocessing of the notifications to throw out information that has been superseded. For example, if we get a notification that we're actually removing all the diagnostics for a provider, we can just ignore all previous diagnostic-creates that we haven't processed yet. Also, any further diagnostic creations we hear about will supersede previous diagnostic creations for that provider.
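The flattening in point 2 can be sketched roughly as follows. This is a minimal, hypothetical Python model (the PR itself is C#, and all names here are illustrative, not taken from the Roslyn sources): per provider, we keep at most one pending removal plus the latest create per key.

```python
from dataclasses import dataclass, field

# Hypothetical model of the batching described above; names are
# illustrative, not Roslyn's.

@dataclass
class Event:
    kind: str            # "created" or "removed"
    key: str             # e.g. the document the diagnostics apply to
    payload: object = None

@dataclass
class BatchedUpdates:
    remove_args: Event = None
    create_args: dict = field(default_factory=dict)

    def enqueue(self, e: Event) -> None:
        if e.kind == "removed":
            # A removal supersedes every create we haven't processed yet.
            self.remove_args = e
            self.create_args.clear()
        else:
            # A newer create supersedes older creates for the same key.
            self.create_args[e.key] = e
```

When the batch window elapses, the UI thread drains at most one removal plus one create per key, regardless of how many notifications arrived in the meantime.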

@CyrusNajmabadi
Contributor Author

tagging @heejaechang @sharwell @jasonmalinowski

//
// Because we're batching, we can optimize things such that we only store two pieces
// of data per provider. Specifically, whether they were removing all diagnostics, and
// then the last diagnostics they wanted to create.
Contributor Author


Note: it may be possible to optimize things even further, i.e. just store the last DiagnosticsUpdateArgs for a provider. However, I wasn't sure if that would end up breaking anything, i.e. it's not clear to me if there are circumstances where we depend on processing the 'remove' before the 'create'. So I wanted to make the safest change possible.

Contributor

@sharwell sharwell left a comment


From the information I was able to determine through heap dumps (with notable limitations due to available tooling in that area), I believe batching represents a more direct solution to the original problem identified by MemoryWatson. In other words, I have more confidence in this approach than the approach in #22920 for the specific purpose of ensuring the UI thread processing queue does not grow larger than necessary. However, the current approach in this PR has a few downsides so I'd like to look into understanding their impact and considering alternatives that address the original and new concerns.

What I like:

  • This pull request removes old (superseded) events from the processing queue if the UI thread has not already started handling them.

What I don't like:

  • This pull request replaces the use of IForegroundNotificationService with plain task scheduling. The service provides an advantage due to time-boxed incremental processing. In terms of the current pull request, this is roughly equivalent to having a Yield() in the inner loop of OnDiagnosticsUpdatedOnForeground.

What I think we can improve on in the future:

  • I'd like to find a way to remove the Task.Delay(50). We want batching to occur when items are building in the queue, but I don't see a particular advantage to inserting a processing delay to "help" this.

batchedUpdates.removeArgs = e;

// Clear all creates for this provider.
batchedUpdates.createArgs.Clear();
Contributor

@sharwell sharwell Nov 26, 2017


❗️ This logic isn't correct¹: DiagnosticsRemoved can be specific to a particular project or document. We need to do two things here:

  1. We need to modify data structures as necessary to implement the equivalent of this:

    batchedUpdates.removeArgs.UnionWith(e);
  2. We need an efficient way to only clear items from createArgs that overlap with e (and I mean e as opposed to removeArgs).

¹ It's possible the logic is correct based on information which was not obvious to me at the time of review.
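A rough sketch of the fix being suggested here, in hypothetical Python (the PR is C#; this models only the data flow, and the names are illustrative): accumulate removal scopes with a set-union, and drop only the pending creates whose scope overlaps the incoming removal.

```python
from types import SimpleNamespace

def enqueue_scoped(remove_scopes: set, create_args: dict, e) -> None:
    """remove_scopes: union of scopes removed so far;
    create_args: scope -> latest pending create."""
    if e.kind == "removed":
        # Equivalent of batchedUpdates.removeArgs.UnionWith(e).
        remove_scopes.update(e.scopes)
        # Clear only the creates that overlap with e, not all of them.
        for scope in e.scopes:
            create_args.pop(scope, None)
    else:
        create_args[e.scope] = e
```

Under this scheme, a removal scoped to one document no longer discards pending creates for unrelated documents.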

Contributor Author


You're right. While we do remove all diagnostics when we get a DiagnosticsRemoved request:

var id = e.Id;
if (!_idToProviderAndTagger.TryGetValue(id, out var providerAndTagger))
{
// Wasn't a diagnostic source we care about.
return;
}
_idToProviderAndTagger.Remove(id);
DisconnectFromTagger(providerAndTagger.tagger);
OnUnderlyingTaggerTagsChanged(this, new SnapshotSpanEventArgs(_subjectBuffer.CurrentSnapshot.GetFullSpan()));

We should only do this for a removed request that matches our document id. I need to ensure the latter.

Contributor Author


#fixed

Contributor Author


It's possible the logic is correct based on information which was not obvious to me at the time of review.

Nope. You're correct. I missed something. Removal notifications for other documents would cause me to remove the diagnostics for our document.

// We're creating diagnostics. This supersedes all existing creations for
// this id. However, any existing removal for this id will still happen.
var document = e.Solution.GetDocument(e.DocumentId);
batchedUpdates.createArgs[document] = e;
Contributor


❓ Why not use DocumentId as the key here? The use of Document means entire solutions could end up rooted in the work queue.

Contributor Author


It's rooted no matter what: e holds onto a solution. So Document was more natural here, without changing what is rooted.

@CyrusNajmabadi
Contributor Author

This pull request replaces the use of IForegroundNotificationService with plain task scheduling. The service provides an advantage due to time-boxed incremental processing. In terms of the current pull request, this is roughly equivalent to having a Yield() in the inner loop of OnDiagnosticsUpdatedOnForeground.

This is true. However, due to the work I've done here, that time-boxed incremental processing is almost certainly unnecessary. Due to the collapsing of work that I've done, when we wake up we will have at most two actual pieces of work to do: the removal of diagnostics and the creation of diagnostics. This is because we are attached to a single document, and so the majority of notifications we hear about will simply be thrown away immediately (in a matter of microseconds) when we compare our workspace+docId against the workspace+docId in the event.

Because we only have two pieces of work to do, doing work incrementally isn't really necessary as no single piece of work should be substantial enough to warrant yielding.

@CyrusNajmabadi
Contributor Author

I'd like to find a way to remove the Task.Delay(50). We want batching to occur when items are building in the queue, but I don't see a particular advantage to inserting a processing delay to "help" this.

Note: the FGNotificationService has a 50ms delay between processing batches of items. I was preserving that here.

Note that I'd like at least some delay that balances not being noticeable, while still likely catching a huge burst of activity from the diagnostics service. 50ms seems to fit the bill, and was not really a problem in the old system.

@CyrusNajmabadi
Contributor Author

@sharwell ready for another pass.

@CyrusNajmabadi
Contributor Author

Because we only have two pieces of work to do,

Ah no. That's not quite accurate. We have two pieces of work to do per provider that reported events for this document. @heejaechang How many different providers are there? Is there like a provider per analyzer? (so unbounded in length) Or is it more like just a small handful of providers (i.e. one for the compiler, one for all analyzers, etc.).

Thanks!

@CyrusNajmabadi
Contributor Author

At sharwell, I can add the "time-boxed incremental processing" back in if necessary. Though I'm skeptical it will be necessary. Note that very little happens when we actually do processing. All we basically do is go through the list of work (throwing out things unrelated to our document). For work items that apply, all that happens is that the underlying AbstractDiagnosticsTaggerProvider.TaggerProvider simply stores the data and then notifies the TagSource that it changed. The TagSource then in turn does nothing but register a callback to process those tags at some delay in the future (the delay is determined by the type of the diagnostics tag).

So, basically, we do very little work on the UI thread.

@CyrusNajmabadi
Contributor Author

Lol. @sharwell not "at sharwell" :D

{
var token = _owner._listener.BeginAsyncOperation(GetType().Name + "OnDiagnosticsUpdatedOnBackground");

_updateTask = Task.Delay(50).ContinueWith(
Contributor


What cancels the previous update task if the same document changed twice?

Contributor

@heejaechang heejaechang Nov 27, 2017


I see, so this is actually batching by updating every 50ms.

Contributor Author


What cancels the previous update task if the same document changed twice?

No need to cancel. When the task actually executes, it will handle all the notifications that came in between now and then.
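A hypothetical Python sketch of the pattern just described (the PR uses Task.Delay(50).ContinueWith in C#; these names are illustrative): each notification schedules its own delayed drain and nothing is ever cancelled. Whichever drain runs first processes everything queued so far, and later drains simply find an empty queue.

```python
import threading

class Batcher:
    """Never cancels a scheduled drain; a drain handles everything
    that arrived between scheduling and execution."""

    def __init__(self, process, delay=0.05):
        self._lock = threading.Lock()
        self._pending = []
        self._process = process
        self._delay = delay

    def notify(self, item):
        with self._lock:
            self._pending.append(item)
        # One delayed drain per notification, like Task.Delay(50).ContinueWith.
        threading.Timer(self._delay, self._drain).start()

    def _drain(self):
        with self._lock:
            items, self._pending = self._pending, []
        if items:                      # later drains may find nothing to do
            self._process(items)
```

Two notifications a few milliseconds apart produce two timers, but the first drain picks up both items and the second is a no-op.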

Contributor


Yep. I thought you were cancelling the previous task and enqueuing a new one so that it gets delayed again. But this is just putting a 50ms buffer on each update.

Contributor


This still doesn't cancel the update on tagger dispose, though.

Contributor Author


That's true. But worst case that means we just process one more request within 50ms after tagger dispose. That request will see that _disposed is set and will immediately bail. Basically, we have to ask ourselves how important it is to cancel a single request.

{
this.AssertIsBackground();
RegisterNotification(() => OnDiagnosticsUpdatedOnForeground(e));
if (_disposed)
Contributor


Shouldn't this be under a lock?

Contributor Author


No need for a lock here. At worst, we miss some write of 'true' to _disposed. But then the UI request that executes will immediately bail.

Contributor Author


Note: even under a lock we could 'still' miss things, as this field is always racy if accessed from different threads. The only important thing is that UI calls respect it and don't do work if it is set to true. The BG just uses it when it can, to avoid unnecessary work, but it's ok to miss.
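The benign race can be sketched like so, in hypothetical Python (illustrative names, not Roslyn's): the BG check is a best-effort shortcut, while the FG handler's re-check is what actually guarantees no work happens after dispose.

```python
class Tagger:
    def __init__(self, register_foreground):
        self._disposed = False
        self._register = register_foreground   # queues work onto the UI thread

    def dispose(self):
        self._disposed = True

    def on_updated_background(self, e):
        if self._disposed:
            return                 # opportunistic: harmless if we miss the write
        self._register(lambda: self.on_updated_foreground(e))

    def on_updated_foreground(self, e):
        if self._disposed:
            return                 # authoritative: the UI thread always re-checks
        self.process(e)

    def process(self, e):
        pass                       # stores data / notifies the TagSource, etc.
```

Even if the BG thread misses a write to the flag and queues one extra callback, that callback bails on the FG check and does no work.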

// care about.
var ourDocument = _subjectBuffer.AsTextContainer().GetOpenDocumentInCurrentContext();
var ourDocumentId = ourDocument?.Id;
if (ourDocumentId != _currentDocumentId)
Contributor


I would prefer doing this on the BG by listening to the proper workspace events for buffer/workspace change notifications. If those events are broken, we should fix those events.

Contributor Author


We could do that as a later change. It will make things more complex, as we'll have to serialize out our BG requests and filter them against the requests to update _currentDocId. I'm ok doing that as part of a followup PR.

Contributor Author


Note: can you tell me precisely which Workspace/buffer changes I would need to listen to?

Contributor

@heejaechang heejaechang Nov 27, 2017


The ActiveContextChange and WorkspaceChanged events.

@CyrusNajmabadi
Contributor Author

@heejaechang Can you address this part: #23377 (comment)

@heejaechang
Contributor

Each analyzer can have 3 different events (1 syntax, 1 semantic, 1 semantic on other files), so in the current design, 1 analyzer can have 3 taggers. In reality, except for the compiler analyzer, which does all 3, most user analyzers produce only 1 kind of diagnostic (syntax, semantic, or semantic on other files, e.g. a compilation-end analyzer), so we can assume 1 tagger per analyzer most of the time.

@CyrusNajmabadi
Contributor Author

CyrusNajmabadi commented Nov 27, 2017

Hrmm.. that doesn't really address what I'm looking for. Let me rephrase. DiagnosticUpdateArgs has an 'object Id' property in it. Normally, how many unique different Ids will there be? A fixed handful? Or is it unbounded and could be hundreds or more? i.e. Are there unique Ids per user analyzer in the system, i.e. some for Add-Using, some for Generate-Method, etc.? Or is it more like just a couple of Ids for compiler diagnostics, just a couple for all analyzers combined, etc.?

I need to know this to determine if we actually need some sort of yielding approach in the OnDiagnosticsUpdatedOnForeground loop, or if we can leave it as is. If the number of DiagnosticUpdateArgs.Ids is low/bounded, we can keep it as is. Otherwise, we may need to have this yield, and move to processing the remainder of the work in a later timeslice. Or, I can move this to the BG. I'll try that in another PR.

Thanks!

@CyrusNajmabadi
Contributor Author

DiagnosticUpdateArgs has an 'object Id' property in it. 1 analyzer can have at most 3 different object Ids. That's why I said, in the current design, 1 analyzer can have 3 taggers per file.

That's not answering my question :) My question is: around how many unique 'Ids' are there in a normal project? Is it a handful? Or is it unbounded?

@CyrusNajmabadi
Contributor Author

but the cause of that is due to filtering on UI thread.

No, that's not true. I have filtering on the UI thread in this PR without the need for excessive lambda allocations. So there is no reason that UI thread filtering means excess allocations.

@heejaechang
Contributor

The object id is unique per analyzer, per kind, per document/project, and per vsix/nuget the analyzer came from.

there will be at least as many object ids as the number of analyzers, documents, and projects in the solution.

@CyrusNajmabadi
Contributor Author

there will be at least as many object ids as the number of analyzers, documents, and projects in the solution.

Ok. So it's effectively unbounded. You're telling me that in something like Roslyn, there would be thousands of unique Ids, yes?

@CyrusNajmabadi
Contributor Author

object id is unique per analyzer

I don't know what that means. Are there a lot of analyzers? Or are there a fixed number? For example, do you have one analyzer that handles things like Add-Using/Generate-Method, or is it an analyzer per each of these?

@heejaechang
Contributor

You can remove the excessive lambda allocations by filtering on the BG, since it is the tagger that creates and enqueues that many lambdas when it doesn't need to.

@CyrusNajmabadi
Contributor Author

CyrusNajmabadi commented Nov 27, 2017

You can remove the excessive lambda allocations by filtering on the BG

And I can remove it on the FG.

since it is the tagger that creates and enqueues that many lambdas when it doesn't need to

Fixed, as per this PR.

@heejaechang
Contributor

We don't know how many analyzers there will be; it is the diagnostic analyzer framework we have been building. Users can install as many analyzers as they wish; we do not put a limit on how many analyzers they can install into VS or a solution.

Okay, so at most, the number of object Ids you can get is: number of diagnostic analyzers installed (vsix or nuget) * 3 kinds (syntax, semantic, semantic on other files) * number of (documents + projects in the solution) * number of (analyzer vsix * nugets installed)
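To make the scale concrete, here is an illustrative back-of-envelope calculation. The counts are made up, not measured from any real solution, and the trailing installed-analyzers factor is dropped since it duplicates the first term:

```python
# Hypothetical counts for a large solution; none of these are measured.
analyzers = 50        # vsix + nuget analyzers installed
kinds = 3             # syntax, semantic, semantic on other files
documents = 2000
projects = 100

max_object_ids = analyzers * kinds * (documents + projects)
print(max_object_ids)  # 315000
```

Even with modest counts, the bound runs into the hundreds of thousands, which is why the Id space is effectively unbounded.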

@heejaechang
Contributor

Sure, you can remove it on the FG, but once you do filtering on the BG, you don't need to do it on the FG, since there won't be as many FG requests as before...

So you don't need to have the dictionaries to hold onto batched requests, or the delay, etc.

@heejaechang
Contributor

We wanted to have 1 tagger per analyzer, but the current editor tagger doesn't allow dynamic tagger insertion/deletion once the view is created. So we made this aggregating tagger concept and created a nested tagger per analyzer, I believe, which we can dynamically add and remove, while we give the aggregated tagger to the editor.

I am not 100% sure whether that 1 tagger per analyzer deals with all the different events from that analyzer, or we have 1 tagger per each different type of analyzer event (aka object id).

@CyrusNajmabadi
Contributor Author

I have a followup PR here that moves filtering to the BG as well: #23409

@CyrusNajmabadi
Contributor Author

Superseded with #23409

@CyrusNajmabadi
Contributor Author

I am not 100% sure whether that 1 tagger per analyzer deals with all the different events from that analyzer, or we have 1 tagger per each different type of analyzer event (aka object id).

We have one per Id. So it's whatever granularity the diagnostics service creates Ids for. We can't do any better on our end, as that's all the information we're given from DiagnosticUpdateArgs.

@CyrusNajmabadi
Contributor Author

@Pilchie @sharwell This is a much less risky approach to deal with the diagnostic tagging perf issue. It would probably still be good to smoke test. It would be safer to take initially, and then could be followed in a later PR by #23409

@sharwell
Contributor

@CyrusNajmabadi I'm more confident with #23409, and would prefer to focus the validation on that change.

@CyrusNajmabadi
Contributor Author

@sharwell That's fine. I just wanted to give you guys options depending on your risk aversion.

@CyrusNajmabadi
Contributor Author

Closing in favor of: #23448

@CyrusNajmabadi CyrusNajmabadi deleted the bulkProcessDiagnostics branch January 16, 2018 20:54