✨ Implement SegmentedHashSet<T>#54574
Conversation
These files are based on dotnet/runtime@556582d9 (v5.0.7).
… and non-randomized string comparers
|
@dotnet/roslyn-compiler for reviews |
|
@dotnet/roslyn-compiler for reviews |
1 similar comment
|
@dotnet/roslyn-compiler for reviews |
|
@dotnet/roslyn-compiler for reviews |
|
|
||
| public int GetHashCode(SegmentedHashSet<T>? obj) | ||
| { | ||
| var hashCode = 0; // default to 0 for null/empty set |
There was a problem hiding this comment.
Would System.HashCode be a good fit here here?
There was a problem hiding this comment.
➡️ This implementation is taken from the link at the top of the file, and only changed where necessary to build in Roslyn. The original code for this method does not use System.HashCode, so this question is probably better for dotnet/runtime.
| { | ||
| if (collection == null) | ||
| { | ||
| ThrowHelper.ThrowArgumentNullException(ExceptionArgument.collection); |
There was a problem hiding this comment.
Why this is not nameof(collection) (applies to all ExceptionArgument usages)
There was a problem hiding this comment.
➡️ This implementation is taken from the link at the top of the file, and only changed where necessary to build in Roslyn. The original code for this method does not use nameof(collection), so this question is probably better for dotnet/runtime.
| } | ||
|
|
||
| // Equals method for the comparer itself. | ||
| public override bool Equals(object? obj) => obj is SegmentedHashSetEqualityComparer<T>; |
There was a problem hiding this comment.
| public override bool Equals(object? obj) => obj is SegmentedHashSetEqualityComparer<T>; | |
| public override bool Equals([NotNullWhen(true)] object? obj) => obj is SegmentedHashSetEqualityComparer<T>; |
This was added in dotnet/runtime:
Note: requires using System.Diagnostics.CodeAnalysis;
There was a problem hiding this comment.
➡️ This implementation is taken from the link at the top of this file. The 5.0.7 tag does not contain this attribute; it will be added during the update step if/when a newer version is adopted.
| } | ||
| else if (ReferenceEquals(_comparer, StringComparer.OrdinalIgnoreCase)) | ||
| { | ||
| _comparer = (IEqualityComparer<T>)NonRandomizedStringEqualityComparer.WrappedAroundStringComparerOrdinalIgnoreCase; |
There was a problem hiding this comment.
Why were these special cases eliminated?
There was a problem hiding this comment.
They require porting a huge amount of platform-specific culture code. The same change was adopted for SegmentedDictionary<TKey, TValue>.
|
It looks like the important commit here is c0c39b1. |
|
@dotnet/roslyn-compiler for second review |
| return false; | ||
| } | ||
|
|
||
| var defaultComparer = EqualityComparer<T>.Default; |
There was a problem hiding this comment.
Could move this to line 33 to avoid .Default on unnecessary code paths
There was a problem hiding this comment.
I would typically avoid this primarily so we can keep the file as close to the reference source as possible (i.e. if the code changes in dotnet/runtime, we would adopt that change in the next update here per the link in the comment at the top of the file).
| } | ||
| } | ||
|
|
||
| return true; |
There was a problem hiding this comment.
What if x has every element in y and one additional element? That will pass the test above and fall through to return true even though the x has more elements.
There was a problem hiding this comment.
I'm not sure what the expected behavior here. This is not documented on docs.microsoft.com for HashSet<T>.CreateSetComparer(), and is the same behavior seen here:
https://github.com/dotnet/runtime/blob/6ca8c9bc0c4a5fc1082c690b6768ab3be8761b11/src/libraries/System.Private.CoreLib/src/System/Collections/Generic/HashSetEqualityComparer.cs#L25-L51
There was a problem hiding this comment.
This just seems like a bug in the original source code.
There was a problem hiding this comment.
Can we reference that bug here? If they decide to fix that in their source we can follow up with a fix here.
This is a drop-in replacement for
HashSet<T>, with equivalent semantics and algorithmic complexity but does not require the use of the Large Object Heap.