
Increase lock granularity for CommandInfo cache #1166

Merged
merged 3 commits into from Mar 13, 2019

Conversation

@rjmholt
Member

rjmholt commented Mar 5, 2019

PR Summary

Introduces a concurrent dictionary allowing different caller threads to get CommandInfo objects for different commands without blocking each other.

Alternative implementation of #1162.
Closes #1162

More details:

  • Splits the CommandInfo cache into its own object, so that the complexity of thread management is handled in one place
  • Uses a single concurrency primitive (ConcurrentDictionary<K, V>.GetOrAdd()), leaving very little opportunity to introduce problems later
  • Does not use tasks
  • Has the same performance characteristics as the task-based implementation
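
The core of the approach can be sketched as follows. This is a minimal illustration of the ConcurrentDictionary.GetOrAdd pattern described above, not the exact PSScriptAnalyzer implementation; the string-based key is a simplification (the real code uses a CommandLookupKey struct), and the Get-Command lookup body is an assumption based on the surrounding diff:

```csharp
using System.Collections.Concurrent;
using System.Linq;
using System.Management.Automation;

internal class CommandInfoCache
{
    private readonly ConcurrentDictionary<string, CommandInfo> _cache =
        new ConcurrentDictionary<string, CommandInfo>();

    public CommandInfo GetCommandInfo(string commandName, CommandTypes? commandTypes = null)
    {
        // Hypothetical key shape; the real code uses a CommandLookupKey struct.
        string key = $"{commandName}|{commandTypes}";

        // GetOrAdd is the single concurrency primitive: lookups for different
        // keys never block each other, and at most one value is stored per key.
        return _cache.GetOrAdd(key, _ => GetCommandInfoInternal(commandName, commandTypes));
    }

    private static CommandInfo GetCommandInfoInternal(string cmdName, CommandTypes? commandType)
    {
        using (var ps = PowerShell.Create())
        {
            ps.AddCommand("Get-Command").AddParameter("Name", cmdName);
            if (commandType != null)
            {
                ps.AddParameter("CommandType", commandType);
            }
            return ps.Invoke<CommandInfo>().FirstOrDefault();
        }
    }
}
```

Note that GetOrAdd does not guarantee the value factory runs only once for concurrent lookups of the same key; it only guarantees a single winning value is cached, which is acceptable here since a redundant Get-Command call is merely wasted work.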

PR Checklist

return commandInfoCache[key];
}

var commandInfo = GetCommandInfoInternal(cmdletName, commandType);


@bergmeister

bergmeister Mar 5, 2019 Collaborator

If you look carefully, the `name` variable is used to build the key, but the `cmdletName` variable is passed into this function. The two might not always be the same, yet you pass only `name` into the new `GetCommandInfo()` function. I think you'd have to pass the `CommandLookupKey` in as well to preserve the legacy behaviour. We have found it very hard to make any change to this method without breaking something else, which is why it is marked as legacy. Unless you can argue that they are always the same, I'd be careful changing it (even if tests are green).
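
One way to preserve that invariant can be sketched as follows. This is a hedged illustration, not the actual fix in the PR: the `CommandLookupKey` name comes from the comment above, but its fields, the `KeyedCache` wrapper, and the `Resolve` helper are invented for the example. The idea is to derive the value factory's argument from the key itself, so the cached key and the resolved command can never diverge:

```csharp
using System;
using System.Collections.Concurrent;

// Illustrative key type: value equality over both fields so it can
// safely serve as a ConcurrentDictionary key.
internal readonly struct CommandLookupKey : IEquatable<CommandLookupKey>
{
    public CommandLookupKey(string name, string commandTypes)
    {
        Name = name;
        CommandTypes = commandTypes;
    }

    public string Name { get; }
    public string CommandTypes { get; }

    public bool Equals(CommandLookupKey other) =>
        Name == other.Name && CommandTypes == other.CommandTypes;

    public override int GetHashCode() => HashCode.Combine(Name, CommandTypes);
}

internal class KeyedCache
{
    private readonly ConcurrentDictionary<CommandLookupKey, string> _cache =
        new ConcurrentDictionary<CommandLookupKey, string>();

    public string Lookup(CommandLookupKey key) =>
        // The value factory reads the name back off the key, so there is no
        // separate cmdletName argument that could drift from the key.
        _cache.GetOrAdd(key, k => Resolve(k.Name));

    private static string Resolve(string name) => "resolved:" + name;
}
```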

@bergmeister
Collaborator

bergmeister commented Mar 5, 2019

Thanks for the effort. I see now what you mean; yes, it looks more elegant to me. I didn't know one could do caching with lambda expressions this way (that's why I originally thought I had to use Task). I'd have preferred it if the changeset had been kept smaller. Looks OK-ish, but I have one comment about preserving legacy behaviour to reduce the risk of a potential regression.

@rjmholt
Member Author

rjmholt commented Mar 6, 2019

> Thanks for the effort. I see now what you mean; yes, it looks more elegant to me. I didn't know one could do caching with lambda expressions this way (that's why I originally thought I had to use Task). I'd have preferred it if the changeset had been kept smaller. Looks OK-ish, but I have one comment about preserving legacy behaviour to reduce the risk of a potential regression.

Fair points, I'll undo the unnecessary changes

@rjmholt rjmholt force-pushed the rjmholt:gcm-parallel-lock branch from 51d5ee6 to a1f94f5 Mar 6, 2019
/// <returns>Returns null if the command does not exist.</returns>
private static CommandInfo GetCommandInfoInternal(string cmdName, CommandTypes? commandType)
{
using (var ps = System.Management.Automation.PowerShell.Create())


@daxian-dbw

daxian-dbw Mar 8, 2019 Member

There is a chance to further improve the perf by using a static RunspacePool, instead of creating/disposing a Runspace for every call to this method.
Of course, you will need to decide the size of the pool carefully based on the typical scenarios.

The code would be something like this:

static readonly RunspacePool rps = RunspaceFactory.CreateRunspacePool(minRunspaces: 1, maxRunspaces: 5);
...
private static CommandInfo GetCommandInfoInternal(string cmdName, CommandTypes? commandType)
{
    using (var ps = System.Management.Automation.PowerShell.Create())
    {
        ps.RunspacePool = rps;
        ps.AddCommand("Get-Command").Invoke();
        ...
    }
}


@bergmeister

bergmeister Mar 12, 2019 Collaborator

Yes, I have already tried something like this in the past but could not measure any significant difference in run time. I think this is because all rules run in separate threads, so the rule that takes the longest is the weakest link in the chain and therefore the bottleneck.


@daxian-dbw

daxian-dbw Mar 13, 2019 Member

Using a RunspacePool here serves two main purposes:

  1. Make this method run faster
    • Runspace.Open is expensive.
    • Module auto-loading and the cache populated during command discovery can be reused.
    • A RunspacePool may introduce some synchronization among threads (when more threads enter this method than there are Runspaces in the pool). But each thread performs only one invocation, so the introduced synchronization wouldn't be that bad.
  2. Reduce the GC pressure
    • Opening and disposing a Runspace in every thread that enters this method would produce a lot of short-lived objects.

As for how much it improves the overall run time, maybe not much -- measurement speaks the truth :) But it looks to me like low-hanging fruit.
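
One detail worth noting for anyone trying this suggestion: a RunspacePool must be opened before any PowerShell instance can borrow a runspace from it. A minimal self-contained sketch along the lines of the snippet above (the SharedPool class name and pool size are illustrative, not from the PR):

```csharp
using System.Linq;
using System.Management.Automation;
using System.Management.Automation.Runspaces;

internal static class SharedPool
{
    // Process-wide pool reused by every call. The pool is opened exactly
    // once; without Open(), assigning it to a PowerShell instance fails.
    private static readonly RunspacePool Pool = CreatePool();

    private static RunspacePool CreatePool()
    {
        var pool = RunspaceFactory.CreateRunspacePool(minRunspaces: 1, maxRunspaces: 5);
        pool.Open();
        return pool;
    }

    public static CommandInfo GetCommandInfo(string cmdName)
    {
        using (var ps = PowerShell.Create())
        {
            // Borrow a runspace from the shared pool instead of creating
            // and disposing a fresh one per call.
            ps.RunspacePool = Pool;
            return ps.AddCommand("Get-Command")
                     .AddParameter("Name", cmdName)
                     .Invoke<CommandInfo>()
                     .FirstOrDefault();
        }
    }
}
```

Since the pool field is static, it lives for the process lifetime; a long-running host that wanted deterministic cleanup would need to call Pool.Dispose() on shutdown.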

Engine/CommandInfoCache.cs: outdated review thread, resolved
Collaborator

bergmeister left a comment

Thanks, looks good to me. I am happy to close my PR #1162 in favour of this

bergmeister and others added 2 commits Mar 13, 2019
Co-Authored-By: rjmholt <rjmholt@gmail.com>
@JamesWTruher JamesWTruher merged commit f34d956 into PowerShell:development Mar 13, 2019
2 checks passed
continuous-integration/appveyor/pr AppVeyor build succeeded
license/cla All CLA requirements met.
bergmeister added a commit to bergmeister/PSScriptAnalyzer that referenced this pull request Mar 22, 2019
* Redo command info cache changes

* Fix comment typo

Co-Authored-By: rjmholt <rjmholt@gmail.com>

* Trigger CI
4 participants