Remove BinaryFormatter from GetSDKReferenceFiles#6324
Conversation
|
I would hardcode the translation logic for each type T that we need, and just write the count of key-value pairs, followed by each key and value separately. |
|
Can you explain why we need the counts? I haven't looked at how complete our test coverage of this task was, but it seems to pass whatever tests we previously had without it. |
|
I might be not understanding it well, but how are you going to read the keys and values if you don’t know how many there are? |
|
Looks like the answer is: there weren't tests. Makes it easy to pass them 😋 I was imagining something like looking for closing elements. In json, for instance, they have { "key1": "val1", "key2": "val2" }. You don't need a count because the opening and closing braces match. I don't know if there's an equivalent here, but I now think it just takes references and gets or sets them based on what sort of data you ask for, so since the dictionary would have started out full in serializing, it would serialize properly. Then, in deserialization, the dictionary would be null, and it would throw an exception. If I'd initialized the dictionary, it would still be empty and remain so. I added a count. I should probably add tests. |
src/Tasks/GetSDKReferenceFiles.cs
Outdated
| internal class SdkReferenceInfo | ||
| { | ||
| /// <summary> | ||
| /// Constructor |
There was a problem hiding this comment.
While you're getting rid of useless summaries, might as well delete this one.
| public ConcurrentDictionary<string, List<string>> DirectoryToFileList { get { return directoryToFileList; } } | ||
|
|
||
| /// <summary> | ||
| /// Hashset |
There was a problem hiding this comment.
nit: remove useless summary
There was a problem hiding this comment.
This has the bonus of deleting a summary that isn't actually accurate. Turns out if a mapping isn't in the dictionary, it doesn't look for it; it just assumes it isn't there.
src/Tasks/GetSDKReferenceFiles.cs
Outdated
| } | ||
|
|
||
| /// <summary> | ||
| /// Constructor |
There was a problem hiding this comment.
nit: remove useless summary
src/Tasks/GetSDKReferenceFiles.cs
Outdated
| /// A dictionary which maps a file path to a structure that contain some metadata information about that file. | ||
| /// </summary> | ||
| public ConcurrentDictionary<string, SdkReferenceInfo> PathToReferenceMetadata { get; } | ||
| internal ConcurrentDictionary<string, SdkReferenceInfo> PathToReferenceMetadata { get { return pathToReferenceMetadata; } } |
There was a problem hiding this comment.
Why is this changed from public to internal, but other properties untouched?
There was a problem hiding this comment.
I don't think this actually mattered, so I reverted it. See next comment for fiddling.
| /// This class represents the context information used by the background cache serialization thread. | ||
| /// </summary> | ||
| private class SaveContext | ||
| internal class SaveContext |
There was a problem hiding this comment.
Why did this need to change?
There was a problem hiding this comment.
I had to fiddle with a lot of these to make them accessible to the test class. Then I downgraded a couple because properties were more visible than their containing class, which isn't really good style, but the latter wasn't necessary. As long as something prevents it from being public/protected, I don't think it particularly matters.
There was a problem hiding this comment.
Ah, I hadn't reviewed the test so I didn't see any need for it.
src/Tasks/GetSDKReferenceFiles.cs
Outdated
| string[] keys = dictionary.Keys.ToArray(); | ||
| for (int i = 0; i < count; i++) | ||
| { | ||
| if (i < keys.Length) |
There was a problem hiding this comment.
If i < keys.Length once, it won't ever change right? For better at-a-glance-understanding™, I'd suggest creating some bool translatingOut = i < keys.Length and using that bool. Or place a comment.
There was a problem hiding this comment.
True. I put the if statement first, which should technically be more performant, too, though not in a particularly meaningful way.
src/Tasks/GetSDKReferenceFiles.cs
Outdated
| for (int i = 0; i < count; i++) | ||
| { | ||
| translator.Translate(ref keys[i]); | ||
| T value = dictionary[keys[i]]; |
There was a problem hiding this comment.
Can the dictionary change while the loop runs? If a value is removed, this may throw. Do you need to lock? Also perhaps its better to enumerate key value pairs to avoid looking every key up.
There was a problem hiding this comment.
No—before this PR, there was only the property, and it had a getter but no setter, so it really shouldn't be a ConcurrentDictionary. It's been a ConcurrentDictionary since the initial code commit. Do you want me to change it to an ordinary dictionary?
There was a problem hiding this comment.
up to you, if it feels safe
src/Tasks/GetSDKReferenceFiles.cs
Outdated
| int count = dictionary.Count; | ||
| translator.Translate(ref count); | ||
| string[] keys = dictionary.Keys.ToArray(); | ||
| if (keys.Length == 0) |
There was a problem hiding this comment.
is this effectively determining the translation direction (read/write)? If so, is there a more explicit way, perhaps asking the translator?
There was a problem hiding this comment.
Looks like ITranslator has a TranslationDirection Mode we can take advantage of here.
rainersigwald
left a comment
There was a problem hiding this comment.
Looks ok as is but I'm pretty curious if we can avoid the ConcurrentDictionary/Dictionary stuff.
src/Tasks/GetSDKReferenceFiles.cs
Outdated
| @@ -1246,63 +1218,79 @@ public SdkReferenceInfo(string fusionName, string imageRuntime, bool isWinMD, bo | |||
| /// Structure that contains the on disk representation of the SDK in memory. | |||
| /// </summary> | |||
| /// <remarks>This is a serialization format. Do not change member naming.</remarks> | |||
There was a problem hiding this comment.
| /// <remarks>This is a serialization format. Do not change member naming.</remarks> |
Stale
src/Tasks/GetSDKReferenceFiles.cs
Outdated
| PopulateRedistDictionaryFromPaths(directoryToFileList, redistDirectories); | ||
|
|
||
| var cacheInfo = new SDKInfo(references, directoryToFileList, FileUtilities.GetPathsHash(directoriesToHash)); | ||
| var cacheInfo = new SDKInfo(references.ToDictionary(kvp => kvp.Key, kvp => kvp.Value, StringComparer.OrdinalIgnoreCase), directoryToFileList.ToDictionary(kvp => kvp.Key, kvp => kvp.Value, StringComparer.OrdinalIgnoreCase), FileUtilities.GetPathsHash(directoriesToHash)); |
There was a problem hiding this comment.
Is there a perf impact of this synchronization? Could you instead store an IDictionary in the SDKInfo fields and on creation it's ConcurrentDictionary/on deserialization it's a plain dictionary?
src/Tasks/GetSDKReferenceFiles.cs
Outdated
| private Dictionary<string, SdkReferenceInfo> pathToReferenceMetadata; | ||
| private Dictionary<string, List<string>> directoryToFileList; | ||
| private int hash; |
There was a problem hiding this comment.
Our/runtime's standard naming convention would make these:
| private Dictionary<string, SdkReferenceInfo> pathToReferenceMetadata; | |
| private Dictionary<string, List<string>> directoryToFileList; | |
| private int hash; | |
| private Dictionary<string, SdkReferenceInfo> _pathToReferenceMetadata; | |
| private Dictionary<string, List<string>> _directoryToFileList; | |
| private int _hash; |
| info.IsWinMD = isWinmd; | ||
| }, count => new Dictionary<string, SdkReferenceInfo>(count, StringComparer.OrdinalIgnoreCase)); | ||
|
|
||
| translator.TranslateDictionary(ref _directoryToFileList, (ITranslator t, ref string s) => t.Translate(ref s), (ITranslator t, ref List<string> fileList) => |
There was a problem hiding this comment.
Funny that the duplicate translation worked because the same method handles input/output. Good thing it's not a huge concern and fairly easy to catch.
I wonder if a certain type of test would catch this? (not blocking on this thought)
There was a problem hiding this comment.
Right, but it should work as long as it matches. It serialized the dictionary twice, then deserialized it twice. Setting it the second time overwrote the correct value with the correct value, so everything still worked.
That also means I think it would be very hard to write a test for it...the only two options I can think of are having some kind of counter on how many times we accessed/set each variable or having a check at the end for the total size of the serialized form. Neither is trivial, and it isn't a huge deal for something that gets hit as often as this.
There was a problem hiding this comment.
Setting it the second time overwrote the correct value with the correct value, so everything still worked.
I see, I was thinking maybe there'd be duplicate data somehow. Agreed that it's fine as is.
I think this is almost right, but TranslatorHelpers needs System.Collections.Concurrent on .NET 3.5, or I need to be able to translate a concurrent dictionary without translating it. (?) Any idea on how I can get around that?