Support reporting more accurate 'changed spans' when doing syntactic classification by CyrusNajmabadi · Pull Request #52472 · dotnet/roslyn

CyrusNajmabadi · 2021-04-07T20:45:56Z

Needs a benchmark.

CyrusNajmabadi · 2021-04-07T20:49:36Z

+        /// incrementally identical, all children of each node will be incrementally identical as well.
+        /// </summary>
+        public bool IsIncrementallyIdenticalTo(SyntaxNode other)
+            => this.Green == other.Green;


paired methods in compiler layer with the existing IsEquivalentTo.

CyrusNajmabadi · 2021-04-07T20:51:01Z

+                }
+
+                // Couldn't compute a narrower range.  Just mark the entire file as changed.
+                return currentSnapshot.GetFullSpan();


we fall back to the old behavior if we can't get a narrower region. this will also be the behavior for F#. (TS doesn't matter, as they don't use our classification system; they use TextMate.)

CyrusNajmabadi · 2021-04-07T20:52:13Z

+                // We want to compute a minimal change, but we don't want this to run for too long.  So do the
+                // computation work in the threadpool, but also gate how much time we can spend here so that we can let
+                // the editor know about the size of the change asap.
+                using var linkedToken = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken);


i wasn't comfortable with the idea of us just spending arbitrary time computing the diff. though it may not be an issue in practice as i imagine our diffing algorithm (later in this pr) should be blisteringly fast.

CyrusNajmabadi · 2021-04-07T20:52:34Z

+                linkedToken.Cancel();
+
+                // ensure that if we completed because of cancellation, we throw that up.
+                cancellationToken.ThrowIfCancellationRequested();


not sure about this cancellation handshake. i always find linked tokens a bit clunky.
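
A minimal sketch of the time-boxed handshake discussed above, assuming the work can be expressed as a delegate. `TimeBoxedWork` and `TryComputeWithinAsync` are hypothetical names for illustration, not the code in this PR; only BCL types are used.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TimeBoxedWork
{
    // Hypothetical helper: runs `compute` on the thread pool and waits at most `budget` for it.
    // Returns null when the budget ran out before the computation finished.
    public static async Task<int?> TryComputeWithinAsync(
        Func<CancellationToken, int> compute, TimeSpan budget, CancellationToken cancellationToken)
    {
        using var linked = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken);

        var computeTask = Task.Run(() => compute(linked.Token), linked.Token);
        var delayTask = Task.Delay(budget, linked.Token);

        var completed = await Task.WhenAny(computeTask, delayTask).ConfigureAwait(false);

        // Whichever task finished first, ask the other one to stop.
        linked.Cancel();

        // If we got here because the caller cancelled, surface that instead of a result.
        cancellationToken.ThrowIfCancellationRequested();

        if (completed == computeTask)
            return await computeTask.ConfigureAwait(false);

        // Budget exceeded: the caller is expected to fall back to a coarser answer.
        return null;
    }
}
```

The linked token lets a single `Cancel()` call stop both the timer and the computation without touching the caller's token, which is why the caller's token is checked separately afterwards.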

CyrusNajmabadi · 2021-04-07T20:53:56Z

        }
+
+        public async Task<TextChangeRange?> ComputeSyntacticChangeRangeAsync(Document oldDocument, Document newDocument, CancellationToken cancellationToken)
+            => await SyntacticChangeRangeComputer.ComputeSyntacticChangeRangeAsync(oldDocument, newDocument, cancellationToken).ConfigureAwait(false);


async/await needed here as the helper returns the non-nullable version.
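
As a sketch of why `async`/`await` is needed in a forwarding method like this: a task of a non-nullable value type cannot be returned directly where a task of the nullable type is expected, so the result has to be awaited and re-wrapped. The example uses `int` in place of `TextChangeRange`, and `NullableBridge` is a hypothetical name.

```csharp
using System.Threading.Tasks;

public static class NullableBridge
{
    // Stand-in for a helper that returns the non-nullable version.
    private static Task<int> ComputeAsync() => Task.FromResult(7);

    // Would not compile: Task<int> is not convertible to Task<int?>.
    // public static Task<int?> WrapDirect() => ComputeAsync();

    // Awaiting unwraps the int, and the async method re-wraps it as int?.
    public static async Task<int?> WrapAsync()
        => await ComputeAsync().ConfigureAwait(false);
}
```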

CyrusNajmabadi · 2021-04-07T22:36:26Z

 Microsoft.CodeAnalysis.SyntaxContextReceiverCreator
+Microsoft.CodeAnalysis.SyntaxNode.IsIncrementallyIdenticalTo(Microsoft.CodeAnalysis.SyntaxNode! other) -> bool
+Microsoft.CodeAnalysis.SyntaxNodeOrToken.IsIncrementallyIdenticalTo(Microsoft.CodeAnalysis.SyntaxNodeOrToken other) -> bool
+Microsoft.CodeAnalysis.SyntaxToken.IsIncrementallyIdenticalTo(Microsoft.CodeAnalysis.SyntaxToken token) -> bool


@dotnet/roslyn-compiler these are new APIs i'm adding to effectively ask "do these nodes have identical green nodes". It's a way to determine extremely quickly what parts of a tree were definitely reused across an incremental edit.

This is used in this PR to take two trees, and conservatively (not minimally) determine the portion of the tree that changed. For any green nodes that are identical between the before/after we know that that didn't change. So we only need to examine nodes that did change. See the logic in SyntacticChangeRangeComputer below for how we utilize this.

If these were added to the public API, I would prefer them be added as static methods in a helper class which are not extension methods so they aren't showing up for users working with these types in common scenarios.

It's a fair point. But I'm not sure which static class these should go in. There is also precedent here, as these three types already have IsEquivalentTo. So this really is a sibling method to those. But I would be fine moving these if people have a good suggestion on where they should go.

Honestly, this seems fine to me as a location for the API. As Cyrus mentioned, we already have IsEquivalentTo APIs, and those aren't common-case APIs either.
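
To illustrate the green-node identity idea from this thread, here is a toy sketch that does not use Roslyn. `Green` and `ToyChangeRange` are hypothetical stand-ins, and the walk only handles the left edge; it is not the `SyntacticChangeRangeComputer` implementation. Reference equality plays the role of `IsIncrementallyIdenticalTo`: a shared subtree is known to be unchanged without looking inside it.

```csharp
using System;
using System.Collections.Generic;

// Toy stand-in for a green node: immutable, position-independent, shareable between trees.
public sealed class Green
{
    public string Text { get; }                    // only meaningful for leaves (tokens)
    public IReadOnlyList<Green> Children { get; }
    public int Width { get; }

    public Green(string text)
    {
        Text = text;
        Children = Array.Empty<Green>();
        Width = text.Length;
    }

    public Green(params Green[] children)
    {
        Text = "";
        Children = children;
        var width = 0;
        foreach (var child in children)
            width += child.Width;
        Width = width;
    }
}

public static class ToyChangeRange
{
    // Returns how many leading characters are known to be unchanged between the two trees.
    // Conservative: text that happens to be equal but lives in non-shared nodes counts as changed.
    public static int FindCommonLeftWidth(Green oldNode, Green newNode)
    {
        // Same object: the whole subtree was reused, so all of its text is unchanged.
        if (ReferenceEquals(oldNode, newNode))
            return oldNode.Width;

        // A differing leaf on either side: nothing under this pair is known to be unchanged.
        if (oldNode.Children.Count == 0 || newNode.Children.Count == 0)
            return 0;

        var common = 0;
        var count = Math.Min(oldNode.Children.Count, newNode.Children.Count);
        for (var i = 0; i < count; i++)
        {
            var oldChild = oldNode.Children[i];
            var newChild = newNode.Children[i];
            if (ReferenceEquals(oldChild, newChild))
            {
                common += oldChild.Width;
                continue;
            }

            // First non-identical pair: descend into it, then stop.
            return common + FindCommonLeftWidth(oldChild, newChild);
        }

        return common;
    }
}
```

Running the same walk from the right edge would give the unchanged suffix, and the two together bound the changed range.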

sharwell · 2021-04-08T01:04:28Z

@CyrusNajmabadi any chance we can add a benchmark for this?

CyrusNajmabadi · 2021-04-08T01:51:09Z

any chance we can add a benchmark for this?

Definitely!

jcouv

Compiler changes LGTM, thanks (iteration 36), with a nit (missing annotation).

ryzngard · 2021-04-14T00:33:18Z

+                }
+
+                // Couldn't compute a narrower range.  Just mark the entire file as changed.
+                return currentSnapshot.GetFullSpan();


For future people, what is the advice on using GetChangedSpanAsync over currentSnapshot.GetFullSpan in this class?

Effectively (and i will doc this) no one should use GetFullSpan except for this code.

ryzngard · 2021-04-14T00:43:58Z

+        {
+            // If they're the same doc, there is no change.
+            if (oldDocument == newDocument)
+                return new TextChangeRange();


Suggested change:

-                return new TextChangeRange();
+                return new();

Tiny nit.

my preference is to use new() in places where the type is immediately apparent. So with X x = new() or X Foo() => new(). Outside of that, i'm currently using it sparingly :)

ryzngard · 2021-04-14T00:48:45Z

+    {
+        private static readonly ObjectPool<Stack<SyntaxNodeOrToken>> s_pool = new(() => new());
+
+        public static async ValueTask<TextChangeRange?> ComputeSyntacticChangeRangeAsync(


It's unclear why we allow returning null here. It looks like cases where we timeout waiting on the syntax roots. We still had to compute them unless the cancellation token cancelled, can we use that to return a non-null value here?

Not exactly. We may not have canceled, but we may have otherwise exceeded our time budget. "Cancellation" in these codepaths is for saying: no, stop everything, all results are useless as the client has moved on.

The diffTimeout is different. It's: we still need the results, but we need them fast, so don't do superfluous work.

Does that make sense?

Yea, I meant that exceeding the time budget can return null if the root wasn't retrieved in time. It seems odd that we have null in that case, since we're saying "Assume both of these are completely different", based on the comment. An actual result with the whole tree being different would be easier to understand and consume than guessing what null meant. Is it no difference? Is it all different?

An actual result with the whole tree being different would be easier to understand and consume than guessing what null meant. Is it no difference? Is it all different?

I see what you mean. To do this, we'd still need the sizes of both the old and new tree. And that means we'd have to then get both roots. I found it easier to just return null trivially and have the caller know that means that the narrow range could not be computed :)
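
A small sketch of the contract described in this reply, using a tuple in place of the real span types: `null` means "no narrower range could be computed", and the caller falls back to the full span. `ChangedSpanFallback` and `GetChangedSpan` are hypothetical names for illustration.

```csharp
using System;

public static class ChangedSpanFallback
{
    // `narrowed` is null when the narrow range could not be computed (for example, out of time).
    // In that case the whole text of length `fullLength` is reported as changed.
    public static (int Start, int Length) GetChangedSpan((int Start, int Length)? narrowed, int fullLength)
        => narrowed ?? (0, fullLength);
}
```

The callee never needs the sizes of both trees to signal failure; only the caller, which already holds the current snapshot, has to know the full length.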

ryzngard

IDE parts LGTM

jasonmalinowski

Love it!

jasonmalinowski · 2021-04-14T20:28:12Z

+        /// </para>
+        /// </summary>
+        ValueTask<TextChangeRange?> ComputeSyntacticChangeRangeAsync(
+            Document oldDocument, Document newDocument, TimeSpan timeout, CancellationToken cancellationToken);


Do we have any telemetry for how long this is taking in practice? I support the idea of the timeout, but if in practice it turns out entirely unnecessary I'm wondering if it's just better to delete.

The concern i have is that this is a case where i:

- expect the normal case to be fast
- do not want the uncommon case to negatively impact the user experience

This is hard to really get telemetry on. We'd need to collect info and then know that we truly were never really running into the slow case.

jasonmalinowski · 2021-04-14T20:45:29Z

+                // doc we grab, just that we grab some prior version.  This is only used to narrow down the changed range we 
+                // specify, so it's ok if it's slightly larger because we read in a change from a couple of edits ago.
+                var previousDocument = _lastProcessedDocument;


I admit I'm not quite understanding this comment. Is there any concern that an edit which is undone might somehow cause problems here? Put another way, if we go from version A to B, and B to C (where C is basically A again), could the compiler's caching and reuse of tokens mean A and C could be the same even though we reported tags for B?

yes. this has a bad race condition. not sure how i convinced myself this was safe. will have followup pr.

jasonmalinowski · 2021-04-14T20:56:26Z

+            var oldRoot = await oldDocument.GetRequiredSyntaxRootAsync(cancellationToken).ConfigureAwait(false);
+
+            // If we ran out of time, we have to assume both are completely different.
+            if (stopwatch.Elapsed > timeout)


It seems a bit odd to me that we're including the fetching of roots in the timeout when we already did that at the caching layer higher. I get the desire that the language service returns a generic "cached object" but it's a bit strange to have timeout code here running which we expect to be instantaneous. Also a bit strange that the "this is designed to be fast" algorithm is an async method in the first place, since the only way it's ever fast is if the caller already knows the roots are available.

jasonmalinowski · 2021-04-14T21:09:10Z

+            using var leftOldStack = s_pool.GetPooledObject();
+            using var leftNewStack = s_pool.GetPooledObject();
+            using var rightOldStack = s_pool.GetPooledObject();
+            using var rightNewStack = s_pool.GetPooledObject();
+
+            leftOldStack.Object.Push(oldRoot);
+            leftNewStack.Object.Push(newRoot);
+            rightOldStack.Object.Push(oldRoot);
+            rightNewStack.Object.Push(newRoot);


The stacks are only used in the implementations of the local methods? Just move them in? That way we're not allocating ones which may not be used (the right stacks), and it probably also means the local functions can become static?

jasonmalinowski · 2021-04-14T21:17:57Z

+                    // Similarly, if we've run out of time, just return what we've computed so far.  It's not as accurate as
+                    // we could be.  But the caller wants the results asap.
+                    if (stopwatch.Elapsed > timeout)
+                        return currentOld.FullSpan.Start;


I love that we're able to give a "good enough" answer like this.
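
The difference between cancellation and the time budget can be sketched as follows, assuming a simple prefix comparison over two lists rather than syntax trees. `BudgetedScan` and `CountCommonPrefix` are hypothetical names. Cancellation throws because the result is no longer wanted; running out of budget returns the conservative answer computed so far.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;

public static class BudgetedScan
{
    // Counts how many leading items match, stopping early if the time budget is exceeded.
    // An early stop under-reports the common prefix, which is safe: more is treated as changed.
    public static int CountCommonPrefix(
        IReadOnlyList<string> oldItems, IReadOnlyList<string> newItems,
        TimeSpan timeout, CancellationToken cancellationToken)
    {
        var stopwatch = Stopwatch.StartNew();
        var limit = Math.Min(oldItems.Count, newItems.Count);
        var common = 0;

        while (common < limit)
        {
            // The caller has moved on: any result is useless, so stop by throwing.
            cancellationToken.ThrowIfCancellationRequested();

            // Out of budget: the caller still wants an answer, so return what we have.
            if (stopwatch.Elapsed > timeout)
                return common;

            if (oldItems[common] != newItems[common])
                break;

            common++;
        }

        return common;
    }
}
```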

CyrusNajmabadi added 4 commits April 6, 2021 15:57

Explicitly hold onto syntax tree while doing syntactic classification…

d17ca65

… to make intent clear.

REstore.

3bfe228

Add helpers

f5e6506

Support reporting more accurate 'changed spans' when doing syntactic …

c11f1c0

…classification

CyrusNajmabadi requested a review from jasonmalinowski April 7, 2021 20:45

ghost added the Area-IDE label Apr 7, 2021

CyrusNajmabadi commented Apr 7, 2021

Add tests

b04456c

CyrusNajmabadi marked this pull request as ready for review April 7, 2021 22:33

CyrusNajmabadi requested review from a team as code owners April 7, 2021 22:33

Add to public api

0c06cd0

CyrusNajmabadi commented Apr 7, 2021

CyrusNajmabadi added 3 commits April 7, 2021 15:38

Tweaks

dc78417

Add docs

6611ac7

Add tests

fdb694e

Add actual test of the tagger.

e4c630a

sharwell marked this pull request as draft April 8, 2021 15:08

CyrusNajmabadi marked this pull request as ready for review April 8, 2021 15:34

jcouv reviewed Apr 13, 2021

Comment thread src/Compilers/Core/Portable/Syntax/SyntaxNode.cs Outdated

jcouv approved these changes Apr 13, 2021

ryzngard reviewed Apr 14, 2021

Comment thread ...ures/Core/Implementation/Classification/SyntacticClassificationTaggerProvider.TagComputer.cs Outdated

ryzngard reviewed Apr 14, 2021

Comment thread ...ures/Core/Implementation/Classification/SyntacticClassificationTaggerProvider.TagComputer.cs Outdated

ryzngard reviewed Apr 14, 2021

Comment thread ...ures/Core/Implementation/Classification/SyntacticClassificationTaggerProvider.TagComputer.cs Outdated

ryzngard reviewed Apr 14, 2021

Comment thread ...ures/Core/Implementation/Classification/SyntacticClassificationTaggerProvider.TagComputer.cs Outdated

ryzngard reviewed Apr 14, 2021

CyrusNajmabadi and others added 6 commits April 13, 2021 17:33

Merge remote-tracking branch 'upstream/main' into classificationDiff

7b751cb

Mark as nullable

935b654

Update src/EditorFeatures/Core/Implementation/Classification/Syntacti…

86b558f

…cClassificationTaggerProvider.TagComputer.cs Co-authored-by: Andrew Hall <ryzngard@live.com>

Simplify

00a09e2

FInish comment

3f88d99

Merge branch 'classificationDiff' of https://github.com/CyrusNajmabad…

134e99c

…i/roslyn into classificationDiff

ryzngard reviewed Apr 14, 2021

Comment thread src/Tools/ExternalAccess/FSharp/Internal/Classification/FSharpClassificationService.cs Outdated

Simplify code

9d3baec

ryzngard reviewed Apr 14, 2021

Simplify

8de0e62

ryzngard reviewed Apr 14, 2021

CyrusNajmabadi added 2 commits April 13, 2021 18:05

Fix api

8d18526

Remove item

2680ed6

CyrusNajmabadi requested a review from ryzngard April 14, 2021 02:58

ryzngard approved these changes Apr 14, 2021

davidwengier approved these changes Apr 14, 2021

CyrusNajmabadi merged commit 8ce6cf4 into dotnet:main Apr 14, 2021

ghost added this to the Next milestone Apr 14, 2021

CyrusNajmabadi deleted the classificationDiff branch April 14, 2021 05:27

jasonmalinowski reviewed Apr 14, 2021

CyrusNajmabadi mentioned this pull request Apr 15, 2021

Simplify syntactic classification #52662

Merged

dibarbet modified the milestones: Next, 16.10.P3 Apr 26, 2021
