Update 6/16/17: Looking for volunteers
The API shape has been finalized. However, we're still deciding on the best hash algorithm out of a list of candidates to use for the implementation, and we need someone to help us measure the throughput/distribution of each algorithm. If you'd like to take that role up, please leave a comment below and @karelz will assign this issue to you.
Update 6/13/17: Proposal accepted!
Here's the API that was approved by @terrajobst at https://github.com/dotnet/corefx/issues/14354#issuecomment-308190321:
// Will live in the core assembly
// .NET Framework : mscorlib
// .NET Core      : System.Runtime / System.Private.CoreLib
namespace System
{
    public struct HashCode
    {
        public static int Combine<T1>(T1 value1);
        public static int Combine<T1, T2>(T1 value1, T2 value2);
        public static int Combine<T1, T2, T3>(T1 value1, T2 value2, T3 value3);
        public static int Combine<T1, T2, T3, T4>(T1 value1, T2 value2, T3 value3, T4 value4);
        public static int Combine<T1, T2, T3, T4, T5>(T1 value1, T2 value2, T3 value3, T4 value4, T5 value5);
        public static int Combine<T1, T2, T3, T4, T5, T6>(T1 value1, T2 value2, T3 value3, T4 value4, T5 value5, T6 value6);
        public static int Combine<T1, T2, T3, T4, T5, T6, T7>(T1 value1, T2 value2, T3 value3, T4 value4, T5 value5, T6 value6, T7 value7);
        public static int Combine<T1, T2, T3, T4, T5, T6, T7, T8>(T1 value1, T2 value2, T3 value3, T4 value4, T5 value5, T6 value6, T7 value7, T8 value8);

        public void Add<T>(T value);
        public void Add<T>(T value, IEqualityComparer<T> comparer);

        [Obsolete("Use ToHashCode to retrieve the computed hash code.", error: true)]
        [EditorBrowsable(Never)]
        public override int GetHashCode();

        public int ToHashCode();
    }
}
The original text of this proposal follows.
Rationale
Generating a good hash code should not require ugly magic constants and bit twiddling in our code. It should be less tempting to write a bad-but-concise GetHashCode implementation such as
class Person
{
    public override int GetHashCode() => FirstName.GetHashCode() + LastName.GetHashCode();
}
Proposal
We should add a HashCode type to encapsulate hash code creation and avoid forcing devs to get mixed up in the messy details. Here is my proposal, which is based on https://github.com/dotnet/corefx/issues/14354#issuecomment-305019329, with a few minor revisions.
// Will live in the core assembly
// .NET Framework : mscorlib
// .NET Core      : System.Runtime / System.Private.CoreLib
namespace System
{
    public struct HashCode
    {
        public static int Combine<T1>(T1 value1);
        public static int Combine<T1, T2>(T1 value1, T2 value2);
        public static int Combine<T1, T2, T3>(T1 value1, T2 value2, T3 value3);
        public static int Combine<T1, T2, T3, T4>(T1 value1, T2 value2, T3 value3, T4 value4);
        public static int Combine<T1, T2, T3, T4, T5>(T1 value1, T2 value2, T3 value3, T4 value4, T5 value5);
        public static int Combine<T1, T2, T3, T4, T5, T6>(T1 value1, T2 value2, T3 value3, T4 value4, T5 value5, T6 value6);
        public static int Combine<T1, T2, T3, T4, T5, T6, T7>(T1 value1, T2 value2, T3 value3, T4 value4, T5 value5, T6 value6, T7 value7);
        public static int Combine<T1, T2, T3, T4, T5, T6, T7, T8>(T1 value1, T2 value2, T3 value3, T4 value4, T5 value5, T6 value6, T7 value7, T8 value8);

        public void Add<T>(T value);
        public void Add<T>(T value, IEqualityComparer<T> comparer);
        public void AddRange<T>(T[] values);
        public void AddRange<T>(T[] values, int index, int count);
        public void AddRange<T>(T[] values, int index, int count, IEqualityComparer<T> comparer);

        [Obsolete("Use ToHashCode to retrieve the computed hash code.", error: true)]
        public override int GetHashCode();

        public int ToHashCode();
    }
}
Remarks
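Before the detailed remarks, a short usage sketch may help. It is not part of the original proposal text. It assumes the static Combine helper is available (a type of this shape later shipped as System.HashCode in .NET Core 2.1) and rewrites the Person example from the rationale. AdditiveHash is a hypothetical method added only to contrast the two approaches.

```csharp
using System;

public sealed class Person
{
    public Person(string firstName, string lastName)
    {
        FirstName = firstName;
        LastName = lastName;
    }

    public string FirstName { get; }
    public string LastName { get; }

    // The additive hash from the rationale. Addition is commutative, so
    // swapping the two names always produces the same value.
    public int AdditiveHash() =>
        unchecked(FirstName.GetHashCode() + LastName.GetHashCode());

    public override bool Equals(object obj) =>
        obj is Person other
        && FirstName == other.FirstName
        && LastName == other.LastName;

    // The same two fields, passed in order to the proposed static helper.
    public override int GetHashCode() => HashCode.Combine(FirstName, LastName);
}
```

With the additive version, new Person("John", "Smith") and new Person("Smith", "John") are guaranteed to collide. With Combine they will very likely differ, although no particular value is promised.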
See @terrajobst's comment at https://github.com/dotnet/corefx/issues/14354#issuecomment-305019329 for the goals of this API; all of his remarks are valid. I would like to point out these ones in particular, however:
- The API does not need to produce a strong cryptographic hash.
- The API will provide "a" hash code, but not guarantee a particular hash code algorithm. This allows us to use a different algorithm later or use different algorithms on different architectures.
- The API will guarantee that within a given process the same values will yield the same hash code. Different instances of the same app will likely produce different hash codes due to randomization. This allows us to ensure that consumers cannot persist hash values and accidentally rely on them being stable across runs (or worse, versions of the platform).
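As a minimal sketch of how these guarantees look from the caller's side, the helper below uses the instance methods from the approved shape (Add with a comparer, then ToHashCode). HashCodeSketch and HashNames are hypothetical names chosen for illustration, and the loop stands in for the AddRange overloads that were not part of the approved API.

```csharp
using System;
using System.Collections.Generic;

public static class HashCodeSketch
{
    // Hypothetical helper: hashes a sequence of names case-insensitively by
    // passing each element to Add together with a comparer.
    public static int HashNames(IEnumerable<string> names)
    {
        var hash = new HashCode();
        foreach (string name in names)
        {
            hash.Add(name, StringComparer.OrdinalIgnoreCase);
        }

        // GetHashCode on the struct is marked obsolete as an error, so the
        // result has to be read through ToHashCode.
        return hash.ToHashCode();
    }
}
```

Within one process, HashNames(new[] { "Ada", "Grace" }) and HashNames(new[] { "ADA", "grace" }) return the same value, because the comparer treats the elements as equal. The value itself should be treated as opaque: it may change on the next run, so it is unsuitable for writing to disk or sending over the wire.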