Skip to content

Tweak delays we have before we start incrementally analyzing in OOP.#46426

Merged
6 commits merged intodotnet:masterfrom
CyrusNajmabadi:formPerf2
Jul 30, 2020
Merged

Tweak delays we have before we start incrementally analyzing in OOP.#46426
6 commits merged intodotnet:masterfrom
CyrusNajmabadi:formPerf2

Conversation

@CyrusNajmabadi
Copy link
Contributor

Followup to #46391. That should be reviewed and merged first.'

Our existing OOP system works in the following manner:

  1. change happens in VS.
  2. VS waits 4-5 seconds to sync that change to OOP, with the presumption that the information will not be needed tehre soon, but that its' still good to generally be keeping OOP near up to date, so that when an OOP request does go in, there isn't that much to sync (effectively just the last 5 seconds of changes).
  3. OOP will hear about the change, update it's solution snapshot, then kick off incremental analyzing
  4. incremental analyzing will process high-pri stuff, and then back off another 1.5 seconds to start doing normal analysis.
  5. results found by an incremental analyzer are sent to VS, where they are batched to be processed in the future.
  6. 1 second later, the batch is worked on, real results are determined and VS is notified about this.

All in all, this leads to a situation where there can be a minimum of around 7.5 seconds between a text change, and a corresponding VS UI update. This is very noticeable and makes the product feel broken.

The above worked well because we didn't use to do things proactively in OOP. i.e. OOP was a place where requests were remoted in response to an explicit user request (i.e. user explicitly invokign FAR). At that point, we'd just need to sync a little data (fast), and then perform our op and return the results.

Now that we have proactive OOP subsystems running automatically in response to changes, the above isn't viable as it just takes far too long before the work even kicks off. I've changed the above to now be:

  1. We start syncing to oop 500ms after a change. This ensures that a large flurry of changes still backs off, but that one there's a slight pause, that OOP knows about it and can go.
  2. OOP syncs things over and kicks off incremental analyzer.
  3. incremental analyzer still waits 1.5 seconds here. This is a standard wait time that we've found strikes a good balance between not doing too much work, or being distracting, while also getting to things in a timely manner when teh user makes a change tehy do care about.
  4. results are then sent back to VS which just waits 250ms. Again, this is to at least be able to process a large flurry of messages in a single go, rather than processing each one sequentially with high overhead (esp. if a message has to go back to the UI thread).

Things brings our time delay for things like updating the winforms designer icon from 7+ seconds down to a much more reasonable 2.25 seconds. This does not feel immediate, but feels much more reasonable. Instead of taking so long that it feels like something is actually not working, things happen fast enough that you can move to your next task.

@CyrusNajmabadi CyrusNajmabadi requested a review from a team as a code owner July 29, 2020 20:05
// for primary workspace in OOP to be synced to latest.
public static readonly Option<int> SolutionChecksumMonitorBackOffTimeSpanInMS = new Option<int>(
nameof(InternalFeatureOnOffOptions), nameof(SolutionChecksumMonitorBackOffTimeSpanInMS), defaultValue: 4000,
nameof(InternalFeatureOnOffOptions), nameof(SolutionChecksumMonitorBackOffTimeSpanInMS), defaultValue: 500,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible that this regresses other experiences? For example, now that we much more aggressively push new snapshots to OOP, any other OOP requests, say active file diagnostics request from a subsequent edit, get delayed, taxing the whole system? As in, instead of making OOP computations and sync happen more often, it is better to work towards a pull model where any analysis/compute that needs to be responsive should have an inproc component (either in Roslyn or above the stack) that pulls on this data?

Copy link
Contributor Author

@CyrusNajmabadi CyrusNajmabadi Jul 29, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i used this for a couple of hours today and didn't feel like anything was regressed.

For example, now that we much more aggressively push new snapshots to OOP, any other OOP requests, say active file diagnostics request from a subsequent edit, get delayed, taxing the whole system?

I technically don't think this is an issue. here's why:

First, syncing this stuff is not expensive. i.e. we're literally saying: if anything changed, just push that over in the next half second. It's extremely difficult for that to be a lot of work as effectively we only have one changes were possible in that time frame.

Second, when requests are actually explicitly made for some snapshot, that syncing has to happen anyways. And syncing is complimentary. i.e. when the 'active file diagnostics' request goes in and asks to sync the snapshot, it benefits from the preexisting pushed information that we already started on. even if the particular file it is asking for has changed, it still gets all the other changes pushed over, and only really needs to potentially just sync over it's very latest text change.

it is better to work towards a pull model where any analysis/compute that needs to be responsive should have an inproc component (either in Roslyn or above the stack) that pulls on this data?

Yes, i believe that's a better long term approach. This is intended to put us in a better short term position though until partner teams can work with us on an appropriate 'pull API' here.

new IncrementalAnalyzerProviderMetadata(
nameof(RemoteDesignerAttributeIncrementalAnalyzerProvider),
highPriorityForActiveFile: false,
highPriorityForActiveFile: true,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is no longer relevant right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it still is for me, eve if it's not respected. i.e. if our OOP crawler can support this concept at some point, we would want this prioritization for these files. It will mean that changes to them are realized in the UI at least a second faster, which would be preferred over the current approach. i.e. we'd likely get down to about 1-1.5 seconds from teh current 2-2.5 seconds.

@CyrusNajmabadi
Copy link
Contributor Author

Note: i've been running with this all day long, and things feel fine for me. I can't really perceive any issues here with anything feeling worse/better after my change.

writer.WriteString(data.Category);
projectVersion.WriteTo(writer);
}
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missed removing this from previosu PR.

@ghost
Copy link

ghost commented Jul 30, 2020

Hello @CyrusNajmabadi!

Because this pull request has the auto-merge label, I will be glad to assist with helping to merge this pull request once all check-in policies pass.

p.s. you can customize the way I help with merging this pull request, such as holding this pull request until a specific person approves. Simply @mention me (@msftbot) and give me an instruction to get started! Learn more here.

Copy link

@ghost ghost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Auto-approval

@jmarolf
Copy link
Contributor

jmarolf commented Jul 30, 2020

@CyrusNajmabadi I am in favor of this change, but I would like a follow up test plan for how we are going to track the user experience here. Right now checking it in and dogfooding seems to be the only way to verify this. I don't think we should block this PR, but I would feel more comfortable if we had some detailed thoughts about how to track the experience here for all users.

@ghost ghost merged commit 0cff1ac into dotnet:master Jul 30, 2020
@ghost ghost added this to the Next milestone Jul 30, 2020
@CyrusNajmabadi
Copy link
Contributor Author

@jmarolf did you have something in mind?

@CyrusNajmabadi CyrusNajmabadi deleted the formPerf2 branch July 30, 2020 18:17
@jmarolf
Copy link
Contributor

jmarolf commented Jul 30, 2020

well the question we want to answer is "How long does it take for an update to round-trip through OOP for users". So I think telemetry + a dashboard would be a reasonable first approach.

It may be that we need to develop a more complete acceptance test for the round-trp experience that can run on CI but that could probably happen later.

Is the question I've identified a correct way to think about the experience though?

Copy link
Member

@tmat tmat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@sharwell
Copy link
Contributor

If we start monitoring round-trip times, it would be really interesting to put in a progressive backoff here. If load is low, latency is near-instant. But if load is high then we increase the latency a bit.

JoeRobich added a commit that referenced this pull request Aug 4, 2020
This reverts commit 0cff1ac, reversing
changes made to faba4b4.
RikkiGibson added a commit that referenced this pull request Aug 4, 2020
This reverts commit 0cff1ac, reversing
changes made to faba4b4.
@RikkiGibson RikkiGibson modified the milestones: Next, 16.8.P2 Aug 11, 2020
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants