Move to a simpler form of checksum->solution caching in the remote workspace layer.#62872
CyrusNajmabadi wants to merge 39 commits into dotnet:main
Conversation
| await SetLastRequestedSolutionAsync(solutionChecksum, fromPrimaryBranch, refCountedLazySolution, cancellationToken).ConfigureAwait(false); | ||
| await _anyBranchSolutionCache.SetLastRequestedSolutionAsync(solutionChecksum, refCountedLazySolution, cancellationToken).ConfigureAwait(false); | ||
| if (updatePrimaryBranch) | ||
| await _primaryBranchSolutionCache.SetLastRequestedSolutionAsync(solutionChecksum, refCountedLazySolution, cancellationToken).ConfigureAwait(false); |
just inlined this method as i didn't see much benefit in the helper.
| // If this was a notification about the primary solution, then attempt to promote any solution we found to | ||
| // be the solution for this workspace. | ||
| if (fromPrimaryBranch) | ||
| (newSolution, _) = await TryUpdateWorkspaceCurrentSolutionAsync(workspaceVersion, newSolution, cancellationToken).ConfigureAwait(false); |
this logic is now part of GetOrCreateSolutionAsync
| { | ||
| var currentSolution = this.CurrentSolution; | ||
| // We were asked to update the primary-branch solution. So take the any-branch solution and promote it to | ||
| // the primary-branch-level. |
this is the clever bit. we effectively ensure that whatever solution we made for this checksum gets promoted to be in the _primarySolutionCache.
What's critical to understand here is that each cache may have the same checksum point to a different solution. The _anyBranch cache can point from that checksum to a 'forked solution' that it generated (which then has some random branch id).
The _primaryBranch cache will then instead map from the checksum to an actual primary-branch-id solution if possible (which may be a different instance than the forked solution).
Later lookups always look in the _primaryBranch cache first, so this better solution will always be found.
This was much simpler to code up and verify than trying to maintain a single checksum->solution mapping that might then have to mutate in order to update what that checksum pointed at.
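The two-cache lookup order described above can be sketched roughly as follows. This is a hypothetical Python model, not the real Roslyn code (which is C# and stores ref-counted lazy solutions); the class and function names are illustrative only.

```python
# Hypothetical sketch of the two-cache design: the primary-branch cache
# is consulted first, because for the same checksum it may hold a
# promoted primary-branch-id solution rather than an any-branch fork.

class LastRequestedSolutionCache:
    """Remembers the last checksum -> solution pair it was told about."""

    def __init__(self):
        self._checksum = None
        self._solution = None

    def try_get(self, checksum):
        return self._solution if checksum == self._checksum else None

    def set_last_requested(self, checksum, solution):
        self._checksum = checksum
        self._solution = solution


def lookup(primary_branch_cache, any_branch_cache, checksum):
    # Primary-branch cache wins; fall back to the any-branch fork.
    found = primary_branch_cache.try_get(checksum)
    return found if found is not None else any_branch_cache.try_get(checksum)
```

Because later lookups consult the primary cache first, promoting a solution for a checksum simply means setting it in the primary cache; nothing in the any-branch cache needs to mutate.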
| _gate = gate; | ||
| } | ||
|
|
||
| public async ValueTask<ReferenceCountedDisposable<LazySolution>?> TryFastGetSolutionAsync( |
these methods are basically the same named methods from before, just simplified now that this cache object only stores the last accessed item.
| } | ||
|
|
||
| public async ValueTask<ReferenceCountedDisposable<LazySolution>> SlowGetOrCreateSolutionAsync( | ||
| Checksum solutionChecksum, Func<CancellationToken, Task<Solution>> getSolutionAsync, CancellationToken cancellationToken) |
this is the same as before, except that getSolutionAsync can be passed in since there is different behavior for the anyBranch cache versus the primaryBranchCache.
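That shape can be sketched as below: the cache owns the lookup and store, and the caller supplies the computation as a callback. Python sketch with hypothetical names; the real C# method is async and returns a ref-counted lazy solution.

```python
# Hypothetical sketch: get-or-create where the computation is passed in,
# so the anyBranch and primaryBranch caches can each supply their own.

def slow_get_or_create_solution(cache: dict, checksum, get_solution):
    cached = cache.get(checksum)
    if cached is not None:
        return cached
    solution = get_solution()   # anyBranch: build from assets;
                                # primaryBranch: promote the anyBranch result
    cache[checksum] = solution
    return solution
```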
| await _anyBranchSolutionCache.SlowGetOrCreateSolutionAsync( | ||
| solutionChecksum, | ||
| cancellationToken => ComputeSolutionAsync(assetProvider, solutionChecksum, cancellationToken), | ||
| cancellationToken).ConfigureAwait(false); |
note: the lambda we pass in here is the same code that used to be in SlowGetSolutionAsync at this line:
| var anyBranchSolution = await anyBranchRefCountedSolution.Target.Task.WithCancellation(cancellationToken).ConfigureAwait(false); | ||
| var (primaryBranchSolution, _) = await this.TryUpdateWorkspaceCurrentSolutionAsync(workspaceVersion, anyBranchSolution, cancellationToken).ConfigureAwait(false); | ||
| return primaryBranchSolution; | ||
| }, |
however, this lambda is different. for the primaryBranch cache we first get the anyBranch solution, then try to promote it to be the primary branch solution for this workspace. We then cache that new solution for this checksum instead.
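As a rough sketch (Python, hypothetical helper names), the primary-branch callback just composes the any-branch fetch with a promotion step:

```python
# Hypothetical sketch of the primary-branch callback described above:
# first obtain the any-branch solution, then try to promote it. The
# promotion may return a different instance (a real primary-branch-id
# solution) than the forked any-branch one.

def make_primary_branch_compute(get_any_branch_solution, try_promote_to_primary):
    def compute():
        any_branch_solution = get_any_branch_solution()
        return try_promote_to_primary(any_branch_solution)
    return compute
```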
| } | ||
|
|
||
| public async Task SetLastRequestedSolutionAsync(Checksum solutionChecksum, ReferenceCountedDisposable<LazySolution> solution, CancellationToken cancellationToken) | ||
| { |
same as before, just simpler as there's only one 'last requested' solution to update.
|
|
||
| var solutionChecksum = await solution.State.GetChecksumAsync(CancellationToken.None); | ||
| var synched = await remoteWorkspace.GetTestAccessor().GetSolutionAsync(assetProvider, solutionChecksum, fromPrimaryBranch: false, workspaceVersion: -1, CancellationToken.None); | ||
| var synched = await remoteWorkspace.GetTestAccessor().GetSolutionAsync(assetProvider, solutionChecksum, updatePrimaryBranch: false, workspaceVersion: -1, CancellationToken.None); |
renamed this to make clear what the flag conveys: it is an actual request to the RemoteWorkspace to not only get the solution but also update its primary solution to point at it.
| /// <summary> | ||
| /// Wrapper around asynchronously produced solution. The computation for producing the solution will be | ||
| /// canceled when the number of in-flight operations using it goes down to 0. | ||
| /// </summary> |
i moved away from ReferenceCountedDisposable. It was getting very hard to track what was going on, and the indirections it forced added a lot of complexity around locking and updating internal state here. By using a private type for the cache, and the item being cached, we can safely and uniformly lock around everything, which enables us to perform all operations, synchronously, atomically and sanely.
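That "one lock around everything" shape can be sketched like this. Python model with `threading.Lock` standing in for the C# gate; the class and callback names are hypothetical, and the real code's teardown cancels an async lazy computation rather than calling a plain function.

```python
import threading

class SolutionEntry:
    """Sketch: a cached solution whose computation is torn down once the
    in-flight count hits zero. All bookkeeping happens under one lock,
    so every count transition is atomic."""

    def __init__(self, lock, on_released):
        self._lock = lock
        self._on_released = on_released  # e.g. cancel the lazy computation
        self._in_flight_count = 1        # creator holds the first reference

    def increment_in_flight_count(self):
        with self._lock:
            # Reviving a fully-released entry would hand out a solution
            # whose computation was already canceled.
            assert self._in_flight_count >= 1
            self._in_flight_count += 1

    def decrement_in_flight_count(self):
        with self._lock:
            assert self._in_flight_count >= 1
            self._in_flight_count -= 1
            if self._in_flight_count == 0:
                self._on_released()
```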
| public void IncrementInFlightCount_WhileAlreadyHoldingLock() | ||
| { | ||
| Contract.ThrowIfFalse(_cache._gate.CurrentCount == 0); | ||
| Contract.ThrowIfTrue(_inFlightCount < 1); |
you'll see a ton of contract checks around the bookkeeping. I absolutely do not want to get this wrong as that could cause:
- an incoming request to try to use an async-solution in the canceled state (which would cause cancellation throws on a different token).
- a solution to be cached forever if we don't properly clean up after ourselves.
|
|
||
| public void IncrementInFlightCount() | ||
| { | ||
| using (_cache._gate.DisposableWait(CancellationToken.None)) |
taking this gate is never cancellable in this type. We always want the increment/decrement code (and any cleanup it causes) to run so we're left in a consistent state.
| using (await _gate.DisposableWaitAsync(cancellationToken).ConfigureAwait(false)) | ||
| { | ||
| // From this point on we are mutating state. Ensure we absolutely do not cancel accidentally. | ||
| cancellationToken = default; |
once we acquire these locks, i always null-out the cancellation token so we don't even write an errant line that causes us to cancel after we've done some mutation.
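The "neuter the token once mutation starts" pattern looks roughly like this. Python sketch: `CancellationToken` here is a minimal hypothetical stand-in for the .NET type, and the function name is illustrative.

```python
import threading

class CancellationToken:
    """Minimal stand-in for a .NET CancellationToken (hypothetical)."""

    def __init__(self, cancelled=False):
        self.cancelled = cancelled

    def throw_if_cancellation_requested(self):
        if self.cancelled:
            raise RuntimeError("operation cancelled")

NONE_TOKEN = CancellationToken()  # analogue of CancellationToken.None

def set_last_requested(gate, cache, checksum, solution, cancellation_token):
    with gate:
        # From here on we are mutating state: rebind the token to the
        # non-cancellable one so no errant check can abort mid-mutation.
        cancellation_token = NONE_TOKEN
        cache["checksum"] = checksum
        cancellation_token.throw_if_cancellation_requested()  # now a no-op
        cache["solution"] = solution
```

Rebinding the local makes it impossible for a later line in the method to accidentally observe the caller's cancellation after some, but not all, of the mutations have run.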
any reason to not use CancellationToken.None?
| @@ -3,18 +3,14 @@ | |||
| // See the LICENSE file in the project root for more information. | |||
this file is effectively rewritten. I recommend looking at the after side.
| @@ -0,0 +1,96 @@ | |||
| // Licensed to the .NET Foundation under one or more agreements. | |||
The following two types/files (ChecksumToSolutionCache and SolutionAndInFlightCount) are utility types that the main bulk of the change benefits from. I tried to keep them as clear as i could given that they're implementing a ref-counting system.
| bool updatePrimaryBranch, | ||
| Func<Solution, ValueTask<T>> implementation, | ||
| CancellationToken cancellationToken) | ||
| { |
this is the meat of the change and is very relevant.
|
@dibarbet ptal.
| /// intervening updates, we can cache and keep the solution around instead of having to recompute it. | ||
| /// </summary> | ||
| private Checksum? _lastRequestedChecksum; | ||
| private SolutionAndInFlightCount? _lastRequestedSolution; |
this is what we used to use the primary branch solution for, correct?
| using (await _gate.DisposableWaitAsync(cancellationToken).ConfigureAwait(false)) | ||
| { | ||
| // From this point on we are mutating state. Ensure we absolutely do not cancel accidentally. | ||
| cancellationToken = default; |
any reason to not use CancellationToken.None?
| // solution we just computed, even if we have returned. This also ensures that if we promoted a | ||
| // non-primary-solution to a primary-solution that it will now take precedence in all our caches for this | ||
| // particular checksum. | ||
| await _anyBranchSolutionCache.SetLastRequestedSolutionAsync(solutionChecksum, solution, cancellationToken).ConfigureAwait(false); |
I am a little suspicious of this, in that the retrieval of the last solution and the update of the last solution do not share the cache's lock.
That would allow multiple requests to race between getting the solution and updating the last solution (e.g. an 'older' solution could overwrite the real last-requested solution). Maybe it doesn't matter, but it seems like the get and the update to the cache should happen under a single acquisition of the lock.
so the last solution just becomes an impl detail of the cache
i like that a lot. let me try to make that change.
ok. change made.
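The suggested shape could be sketched like this (Python, hypothetical names): the read of the last-requested slot and its update happen under a single acquisition of the lock, so a stale request cannot interleave between them.

```python
import threading

class ChecksumToSolutionCache:
    """Sketch: the 'last requested' slot is an implementation detail of
    the cache, read and written under one lock acquisition."""

    def __init__(self):
        self._gate = threading.Lock()
        self._last_requested_checksum = None
        self._last_requested_solution = None

    def get_or_create(self, checksum, compute_solution):
        with self._gate:
            if checksum == self._last_requested_checksum:
                return self._last_requested_solution
            # Simplification: the real code computes the solution outside
            # the lock; we hold it here only to show the get and the
            # last-requested update as one atomic step.
            solution = compute_solution()
            self._last_requested_checksum = checksum
            self._last_requested_solution = solution
            return solution
```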
Nope. I'll switch to that. They're basically interchangeable to me since we write
|
closing out in favor of #63028
The main thing this improves is that currently we have a very weird model in RemoteWorkspace where a single checksum can map first to a forked solution, and then potentially to the primary workspace solution later on (once an 'UpdatePrimaryWorkspace' call comes in). The existing code did not handle this well, which led to some serious confusion about which solution to use.
The new code cleanly splits the concepts of the cache for the primary workspace solutions and the cache for non-primary ones, making it easier to reason about.