Skip to content
This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Improve CryptoConfig.CreateFromName performance (from 10% to x20)#40184

Closed
adamsitnik wants to merge 6 commits intodotnet:masterfrom
adamsitnik:createCryptoConfig
Closed

Improve CryptoConfig.CreateFromName performance (from 10% to x20)#40184
adamsitnik wants to merge 6 commits intodotnet:masterfrom
adamsitnik:createCryptoConfig

Conversation

@adamsitnik
Copy link
Member

I was working on making it easier for CoreFX team members to use ASP.NET Perf Lab to run performance tests under high load.

Then I saw #39600 which was a perfect candidate for sample for the docs: we had some nasty locks that were making the perf bad. So I created sample docs based on this example: aspnet/Benchmarks#1284

After taking a brief look at the existing code, I could not resist rewriting it and make it even faster than @VladimirKhvostov did in #39600

ASP.NET Perf lab test result (described in aspnet/Benchmarks#1284):

Before:

RequestsPerSecond:           43,330
Max CPU (%):                 95
WorkingSet (MB):             166
Avg. Latency (ms):           5.91

flamegraph_filtered

After:

RequestsPerSecond:           120,801
Max CPU (%):                 90
WorkingSet (MB):             166
Avg. Latency (ms):           2.27

flamegraph_filtered_after

Some microbenchmark results:

Method Toolchain name Mean Ratio Allocated Memory/Op
CreateFromName net48 RSA 765.4 ns 1.00 802 B
CreateFromName netcoreapp3.0 RSA 3,681.8 ns 4.81 688 B
CreateFromName this PR RSA 278.1 ns 0.36 192 B
CreateFromName net48 SHA512 605.7 ns 1.00 1316 B
CreateFromName netcoreapp3.0 SHA512 770.7 ns 1.27 424 B
CreateFromName this PR SHA512 545.3 ns 0.90 128 B
CreateFromName net48 X509Chain 4,188.9 ns 1.00 682 B
CreateFromName netcoreapp3.0 X509Chain 3,269.7 ns 0.78 464 B
CreateFromName this PR X509Chain 228.1 ns 0.05 88 B

@danmosemsft this was one of the cases where Full .NET Framework was faster than Core. I made sure that Core is faster now.

The key changes:

  • remove the locks, use ConcurrentDictionary instead
  • Type.GetType(string name) is very expensive. Every type is now loaded once and then saved to the dictionary. This was the biggest improvement
  • one dictionary for name to name mapping, one for name to type mapping (previously there was one for string to object)
  • Fast path for creating config without providing any extra arguments

@karelz karelz added this to the 5.0 milestone Aug 9, 2019
@stephentoub
Copy link
Member

@adamsitnik, how do your changes relate to #39600? I've not reviewed yours, but from the description it seems duplicative.

@GrabYourPitchforks
Copy link
Member

Are any real-world E2E scenarios improved by this change? I'd be surprised to learn that this function is the bottleneck in any legitimate use of cryptography.

@VladimirKhvostov
Copy link
Contributor

@GrabYourPitchforks this method is used in the JWT token validation code path, which caused several LSIs in other DevOps. See #39600 for details.

@GrabYourPitchforks
Copy link
Member

@VladimirKhvostov I saw that, but it seems the conclusion from that thread was to change the JWT verification implementation to call a different method. To an extent this matches recommended practice: the CryptoConfig class is really intended for use only when you need agility. JWT verification shouldn't be bouncing through this class because the JWT validation library should be hardcoding the list of supported algorithms; it doesn't require the type of agility the CryptoConfig class is intended to provide.

@danmoseley
Copy link
Member

cc @krwq

@krwq krwq self-assigned this Aug 9, 2019
@krwq
Copy link
Member

krwq commented Aug 9, 2019

@adamsitnik thanks! I think next steps would be to not call CryptoConfig.CreateFromName at all and just use algorithm directly. That would avoid any lookup at all


string[] algorithmNames = new string[names.Length];
Array.Copy(names, 0, algorithmNames, 0, algorithmNames.Length);
string[] safeCopy = names.Length == 1 ? new string[1] { names[0] } : names.ToArray();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does new string[1] { names[0] } actually make that difference? If so, could we perhaps improve ToArray instead?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect CryptoConfig.AddAlgorithm most often called with 1 argument. We can either special case this - not create safeCopy if names.Length == 1 or even create an overload AddAlgorithm, which would take string name instead string names[].

Copy link
Member

@stephentoub stephentoub Aug 9, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is definitely not worth extra code/complexity. Just do:

string[] algorithmNAmes = names.AsSpan().ToArray();

It's simpler and its perf will be just fine; I cannot believe any perf difference will be measurable in real-world usage, especially considering we're about to do cryptography!

(If you really believed Length == 1 was special and warranted optimization, you'd want to special case everything for the rest of the method, too; there's no point allocating another array, looping with foreach twice, etc., when you can just operate on the sole value. But, again, I sincerely doubt it's worth the extra complexity.)

typeNameToType["http://www.w3.org/2001/04/xmlenc#sha512"] = SHA512DefaultType;


typeNameToType["http://www.w3.org/2001/04/xmlenc#aes128-cbc"] = RijndaelManagedType;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bartonjs should we be using AesManagedType directly here?

@krwq
Copy link
Member

krwq commented Aug 9, 2019

I have some security concerns about this fix (and also original PR) - will need to discuss it with @bartonjs since he's more familiar with how this is being usage. I'll take this offline

Copy link
Member

@krwq krwq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See offline thread

{
ConcurrentDictionary<string, Type> typeNameToType = new ConcurrentDictionary<string, Type>(StringComparer.OrdinalIgnoreCase);

Type HMACMD5Type = typeof(System.Security.Cryptography.HMACMD5);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is better to get the type right next to its use. Fetching all types first into locals will result in larger stack frame and bigger native code.

@bartonjs
Copy link
Member

#39600 has been accepted. If there're any more gain to be gotten from this PR it would need to be re-evaluated in terms of the new baseline. Since it'd be a different change with different benchmark justification, I'm closing this PR.

@bartonjs bartonjs closed this Aug 13, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants