Avoid OOP syncing when doing a nav-to search during the initial load of a solution. by CyrusNajmabadi · Pull Request #52351 · dotnet/roslyn

CyrusNajmabadi · 2021-04-01T23:19:54Z

Followup to #52315. This can be reviewed once that goes in.

This PR changes our behavior around nav-to during the period of time when the solution is loading. In the previous PR we switched to not generating SG files during solution load; this PR takes things further. We now don't even sync the host with OOP before running the nav-to search. Instead, we just take whatever data we have stored in the OOP cache and query it directly for matches, reporting whatever we find.

Because of this, we save the entirety of the OOP sync cost, which on my machine (with release, ngen'ed, etc. dlls) is already around 35 seconds. Without the sync, the entire operation goes down to around 2 seconds to search the entire nav-to DB.

However, this performance comes with a caveat. Because we are not syncing our host data to OOP, it's possible that we will find results for files that no longer exist, or for locations in files that are incorrect. For the first case, we filter out any such results on the host side when we get the answers back from OOP. For the second case, we accept that the locations may be temporarily incorrect until the solution is fully loaded, at which point we query again for accurate results.
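As a rough illustration of the host-side filtering described above, here is a minimal sketch. It is not the code in this PR: `CachedResult`, `StaleResultFilter`, and `FilterToKnownDocuments` are hypothetical names, and comparing by file path (case-insensitively) is an assumption made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a cached nav-to hit returned from OOP.
public sealed record CachedResult(string FilePath, string Name, int SpanStart, int SpanLength);

public static class StaleResultFilter
{
    // Keep only the results whose file is still part of the host's current solution.
    // Results for files that no longer exist are dropped.
    public static IReadOnlyList<CachedResult> FilterToKnownDocuments(
        IEnumerable<CachedResult> cachedResults,
        IEnumerable<string> currentDocumentPaths)
    {
        // Assumption for this sketch: paths are compared case-insensitively.
        var known = new HashSet<string>(currentDocumentPaths, StringComparer.OrdinalIgnoreCase);
        return cachedResults.Where(r => known.Contains(r.FilePath)).ToList();
    }
}
```

Results that survive this filter can still carry stale spans, which is the second case above.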

During this time, we also show this information to the user:

This lets them know that data may not be complete during this time, helping to prime an expectation that this is a 'best effort' result, and they may need to rerun the search later to get full results.

CyrusNajmabadi · 2021-04-02T08:32:37Z

src/EditorFeatures/Core.Wpf/Interactive/InteractiveDocumentNavigationService.cs

            => false;

-        public bool TryNavigateToSpan(Workspace workspace, DocumentId documentId, TextSpan textSpan, OptionSet options, CancellationToken cancellationToken)
+        public bool TryNavigateToSpan(Workspace workspace, DocumentId documentId, TextSpan textSpan, OptionSet options, bool allowInvalidSpan, CancellationToken cancellationToken)


A general concept here is that nav-to can return old data from a previous session which may no longer be valid. Currently TryNavigateTo can throw if someone passes in a span that isn't within the bounds of the current doc (despite being called 'Try...'). This now allows that behavior to be controlled. By default, we still throw (so that we can catch actual logic bugs in our code). However, when we know the span may legitimately no longer be within the bounds of the doc, we don't want to throw, as that could reasonably happen.

CyrusNajmabadi · 2021-04-02T08:36:06Z

src/Features/Core/Portable/NavigateTo/NavigateToSearcher.cs

                // make sure we only process this project if we didn't already process it above.
                if (processedProjects.Add(currentProject))
-                    tasks.Add(Task.Run(() => SearchAsync(currentProject, priorityDocs.ToImmutableArray(), seenItems, isFullyLoaded), _cancellationToken));
+                    tasks.Add(Task.Run(() => SearchAsync(currentProject, priorityDocs.Where(d => d.Project == currentProject).ToImmutableArray(), seenItems, isFullyLoaded), _cancellationToken));


Previously we could search a project, but then say: hey... these docs from this other project should be prioritized. That's pretty nonsensical and led to more complex reasoning downstream. Now, I've strictly documented in the interface that the priority docs are always from the project being passed in, which means cleaner code and easier reasoning downstream.

CyrusNajmabadi · 2021-04-02T08:37:53Z

src/VisualStudio/Core/Def/Implementation/Workspace/VisualStudioDocumentNavigationService.cs

            {
                var boundedTextSpan = GetSpanWithinDocumentBounds(textSpan, text.Length);
-                if (boundedTextSpan != textSpan)
+                if (boundedTextSpan != textSpan && !allowInvalidSpan)


We're always resilient to the requested span not being within the span of the tree. However, by default, we will log a Watson for that case right below this. Now, in the case of showing stale nav-to entries, we do not report such a Watson, as being outside the bounds of the doc is not considered an exceptional circumstance.
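The interaction between span clamping and the `allowInvalidSpan` flag can be sketched as follows. This is a simplified model, not the Visual Studio implementation: `Span`, `SpanNavigation`, `ResolveSpan`, and the `reportError` callback are hypothetical, and the body of `GetSpanWithinDocumentBounds` is an assumed clamp rather than the real helper.

```csharp
using System;

// Hypothetical simplified span type; records give value equality for == and !=.
public sealed record Span(int Start, int Length)
{
    public int End => Start + Length;
}

public static class SpanNavigation
{
    // Assumed behavior: clamp the span so it lies inside [0, documentLength].
    public static Span GetSpanWithinDocumentBounds(Span span, int documentLength)
    {
        var start = Math.Clamp(span.Start, 0, documentLength);
        var end = Math.Clamp(span.End, start, documentLength);
        return new Span(start, end - start);
    }

    // Always navigate to the bounded span. Only report an error when the caller
    // did not opt in to possibly-stale spans.
    public static Span ResolveSpan(
        Span requested, int documentLength, bool allowInvalidSpan, Action<string> reportError)
    {
        var bounded = GetSpanWithinDocumentBounds(requested, documentLength);
        if (bounded != requested && !allowInvalidSpan)
            reportError($"Span {requested} is outside a document of length {documentLength}");
        return bounded;
    }
}
```

With `allowInvalidSpan: true`, an out-of-range span from a stale nav-to entry is quietly clamped. With the default `false`, the same span still gets clamped but is also reported, so genuine logic bugs stay visible.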

CyrusNajmabadi · 2021-04-02T08:38:44Z

src/Workspaces/Core/Portable/Classification/IRemoteSemanticClassificationCacheService.cs

        /// classifications are only returned if they match the content the file currently has.</param>
        ValueTask<SerializableClassifiedSpans?> GetCachedSemanticClassificationsAsync(
-            SerializableDocumentKey documentKey,
+            DocumentKey documentKey,


We had a dual of a primitive data type and its 'serializable' equivalent. However, the primitive data type was trivial to make serializable itself, so this dual just went away.

CyrusNajmabadi · 2021-04-02T08:39:24Z

src/Workspaces/Core/Portable/Storage/AbstractPersistentStorageService.cs

+            if (result != null)
+                return result;
+
+            return NoOpPersistentStorage.Instance;


No change in logic. This just makes it easier to set a breakpoint on the failure case when debugging issues.

dibarbet

Are there existing tests for the non-fully-loaded code path that cover all the new cached code paths?

dibarbet · 2021-04-02T18:06:44Z

src/Features/Core/Portable/NavigateTo/AbstractNavigateToSearchService.InProcess.cs

                (PatternMatchKind.LowercaseSubstring, NavigateToMatchKind.Fuzzy));

-        public static Task SearchProjectInCurrentProcessAsync(
+        public static Task SearchFullyLoadedProjectInCurrentProcessAsync(


Is this fully loaded in the context of the operation progress service, or fully loaded in that the solution is synced to OOP?

Fully loaded in the context of the operation progress service. Effectively: if the project system has handed everything over to us, then we do the full search. Otherwise, if they're still loading, we do the cached search.
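The choice between the two search paths can be sketched like this. It is only an illustration of the branching described in the reply: `NavToDispatch` and the two delegates are hypothetical placeholders, and results are modeled as plain strings rather than real nav-to result objects.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class NavToDispatch
{
    // Hypothetical dispatcher: run the full search when the project system has
    // finished loading, otherwise fall back to whatever the cache already holds.
    public static Task<IReadOnlyList<string>> SearchAsync(
        bool isFullyLoaded,
        Func<Task<IReadOnlyList<string>>> searchFullyLoaded,
        Func<Task<IReadOnlyList<string>>> searchCached)
    {
        return isFullyLoaded ? searchFullyLoaded() : searchCached();
    }
}
```

Only one of the two delegates is ever invoked, so the costly path (which in the PR involves syncing with OOP) is skipped entirely while the solution is still loading.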

src/Features/Core/Portable/NavigateTo/AbstractNavigateToSearchService.cs

src/Features/Core/Portable/NavigateTo/INavigateToSearchService.cs

dibarbet · 2021-04-02T18:46:40Z

src/Features/Core/Portable/NavigateTo/AbstractNavigateToSearchService.cs

+
+        public Task SearchProjectAsync(Project project, ImmutableArray<Document> priorityDocuments, string searchPattern, IImmutableSet<string> kinds, Func<INavigateToSearchResult, Task> onResultFound, bool isFullyLoaded, CancellationToken cancellationToken)
+        {
+            return isFullyLoaded


So it seems like when the solution is not fully loaded (from operation progress), we now search the cached results instead of the actual documents which avoids waiting for the oop. But all projects may not be loaded yet so the document set being searched in the cached results may not be exhaustive.

This probably isn't relevant to this PR specifically, but is it possible to cache the projects for a solution? So that even if the projects aren't fully loaded yet, the cached search can find them based on the solution being searched?

> But all projects may not be loaded yet so the document set being searched in the cached results may not be exhaustive.

Correct. And the UI informs the user of that.

> This probably isn't relevant to this PR specifically, but is it possible to cache the projects for a solution? So that even if the projects aren't fully loaded yet, the cached search can find them based on the solution being searched?

It is probably something we could do in the future :) It's definitely an interesting idea. My general concern about those sorts of things is ensuring they get properly invalidated/updated as the solution changes over time.

src/Workspaces/Core/Portable/Workspace/Host/PersistentStorage/DocumentKey.cs

src/Workspaces/Remote/ServiceHub/Services/NavigateToSearch/RemoteNavigateToSearchService.cs

dibarbet · 2021-04-02T19:03:04Z

src/Workspaces/Core/Portable/FindSymbols/SyntaxTree/SyntaxTreeIndex_Persistence.cs

+        private static Task<SyntaxTreeIndex?> LoadAsync(Document document, Checksum checksum, CancellationToken cancellationToken)
+            => LoadAsync(document.Project.Solution.Workspace, DocumentKey.ToDocumentKey(document), checksum, GetStringTable(document.Project), cancellationToken);

+        public static async Task<SyntaxTreeIndex?> LoadAsync(


Should this take in the storage service instead of the workspace? IIRC Tomas wanted to remove workspace references from OOP, so moving the workspace references up to the top might make it easier to remove them later on down the line.

I think that will take a lot of work no matter what, so I'd be OK just waiting until that stage.

dibarbet · 2021-04-02T19:06:28Z

src/Workspaces/Core/Portable/FindSymbols/SyntaxTree/SyntaxTreeIndex_Persistence.cs


        private static SyntaxTreeIndex? ReadFrom(
-            StringTable stringTable, ObjectReader reader, Checksum checksum)
+            StringTable stringTable, ObjectReader reader, Checksum? checksum)


I was looking through the code to see if the usages of checksum allow null, and it appears that they should. But the constructor for SyntaxTreeIndex is not nullable-enabled:
http://sourceroslyn.io/#Microsoft.CodeAnalysis.Workspaces/FindSymbols/SyntaxTree/SyntaxTreeIndex.cs,28

But the property that checksum gets assigned to is in a nullable enabled file and is specifically not null
http://sourceroslyn.io/#Microsoft.CodeAnalysis.Workspaces/FindSymbols/SyntaxTree/SyntaxTreeIndex_Persistence.cs,c8398183df844544

Might be good to update the annotations there to explicitly allow null (and nullable enable the ctor)

sure. will do.

CyrusNajmabadi added 8 commits April 1, 2021 12:34

Ensure that priority docs only refer to the project being searched

4fb7f43

Thread things along

c96203b

Extract out common core.

c45e435

Merge branch 'simpleNavToSerialization' into navToOnLoad

f1698cf

Add code to search from index

65298a8

working

459edc9

Be resilient to stale items that are out of bounds.

9faa67f

Merge branch 'simpleNavToSerialization' into navToOnLoad

7b17ab7

CyrusNajmabadi requested review from a team as code owners April 1, 2021 23:19

CyrusNajmabadi requested a review from a team April 1, 2021 23:19

ghost added the Area-IDE label Apr 1, 2021

CyrusNajmabadi added 3 commits April 2, 2021 01:27

Merge remote-tracking branch 'upstream/main' into navToOnLoad

254969c

Pass along flag.

5cfdb10

Pass along flag.

0f32f6b

CyrusNajmabadi commented Apr 2, 2021


Add comment

ddb9f1c

CyrusNajmabadi commented Apr 2, 2021


CyrusNajmabadi requested review from davidwengier, dibarbet and jasonmalinowski and removed request for a team April 2, 2021 08:46

DOcs

6fba65a

dibarbet reviewed Apr 2, 2021


CyrusNajmabadi added 2 commits April 2, 2021 13:38

remove unused value

8c22ff5

docs

d0b25e6

CyrusNajmabadi added 3 commits April 2, 2021 13:42

NRT enable

affd18b

Renames

b555e8c

NRTwork

d23017b

CyrusNajmabadi enabled auto-merge April 2, 2021 20:54

dibarbet approved these changes Apr 2, 2021


CyrusNajmabadi mentioned this pull request Apr 2, 2021

Return cached results from nav to while we are loading the solution #52380

Merged

Missing cancellation token

4e88fe2

CyrusNajmabadi merged commit ebcd714 into dotnet:main Apr 3, 2021

ghost added this to the Next milestone Apr 3, 2021

CyrusNajmabadi deleted the navToOnLoad branch April 11, 2021 18:20

dibarbet modified the milestones: Next, 16.10.P3 Apr 26, 2021
