Commit 3cb47d8

[mono] Use unsigned char when computing UTF8 string hashes (#21633)
Backport of dotnet/runtime#83273 to mono/mono `2020-02`.

The C standard does not specify whether `char` is signed or unsigned; the choice is implementation-defined. Apparently Android aarch64 makes a different choice than other platforms (at least macOS arm64 and Windows x64 give different results).

Mono uses `mono_metadata_str_hash` in the AOT compiler and AOT runtime to optimize class name lookup. As a result, classes whose names include UTF-8 continuation bytes (with the high bit = 1) will hash differently in the AOT compiler and on the device.

Contributes to dotnet/runtime#82187
Contributes to dotnet/runtime#78638
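The signedness problem described in the commit message can be reproduced in isolation. The sketch below is not Mono's code: `hash_signed` and `hash_unsigned` are hypothetical helpers that apply a glib-style multiply-by-31 string hash (assumed here for illustration), reading the same bytes once through `signed char` and once through `unsigned char`.

```c
#include <stdio.h>

/* Hypothetical helper, not part of Mono: hashes the string while reading
 * each byte as signed char. Bytes >= 0x80 promote to negative ints. */
static unsigned int
hash_signed (const char *s)
{
	const signed char *p = (const signed char *) s;
	unsigned int hash = 0;

	for (; *p; p++)
		hash = (hash << 5) - hash + *p; /* hash * 31 + byte */
	return hash;
}

/* Hypothetical helper, not part of Mono: same hash, but each byte is
 * read as unsigned char, so it always promotes to a value in 0..255. */
static unsigned int
hash_unsigned (const char *s)
{
	const unsigned char *p = (const unsigned char *) s;
	unsigned int hash = 0;

	for (; *p; p++)
		hash = (hash << 5) - hash + *p; /* hash * 31 + byte */
	return hash;
}

/* Prints both hashes for one name so the divergence is visible. */
static void
print_both (const char *name)
{
	printf ("signed=%u unsigned=%u\n", hash_signed (name), hash_unsigned (name));
}
```

For a pure ASCII name such as `"Foo"` the two functions agree. For `"\xC3\xA9"` (the UTF-8 encoding of `é`) they do not: every byte with the high bit set contributes a value that is 256 lower in the signed variant, so a compiler and a device that disagree on the signedness of `char` look up different hash buckets for the same class name.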
1 parent a102a35 commit 3cb47d8

2 files changed: 3 additions and 2 deletions


mono/eglib/ghashtable.c

Lines changed: 1 addition & 1 deletion
@@ -673,7 +673,7 @@ guint
 g_str_hash (gconstpointer v1)
 {
 	guint hash = 0;
-	char *p = (char *) v1;
+	unsigned char *p = (unsigned char *) v1;
 
 	while (*p++)
 		hash = (hash << 5) - (hash + *p);

mono/metadata/metadata.c

Lines changed: 2 additions & 1 deletion
@@ -5532,7 +5532,8 @@ guint
 mono_metadata_str_hash (gconstpointer v1)
 {
 	/* Same as g_str_hash () in glib */
-	char *p = (char *) v1;
+	/* note: signed/unsigned char matters - we feed UTF-8 to this function, so the high bit will give diferent results if we don't match. */
+	unsigned char *p = (unsigned char *) v1;
 	guint hash = *p;
 
 	while (*p++) {
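The comment added in `mono_metadata_str_hash` relies on the fact that reading a byte through `unsigned char` gives the same value on every platform. A minimal sketch of how one might check which choice a given toolchain makes for plain `char`, and confirm that the `unsigned char` read is stable, follows. Both `plain_char_is_signed` and `byte_value` are hypothetical helpers written for this illustration, not functions from Mono.

```c
#include <limits.h>

/* Hypothetical helper: returns 1 when plain char is signed on this
 * target, 0 when it is unsigned. CHAR_MIN is 0 for an unsigned char. */
static int
plain_char_is_signed (void)
{
	return CHAR_MIN < 0;
}

/* Hypothetical helper: reads the first byte through unsigned char, so
 * the result is in 0..255 regardless of the signedness of plain char. */
static unsigned int
byte_value (const void *v)
{
	const unsigned char *p = (const unsigned char *) v;

	return *p;
}
```

With these helpers, `byte_value ("\xC3")` is 195 on every target, while the result of `plain_char_is_signed ()` depends on the platform ABI and on compiler options such as GCC's and Clang's `-fsigned-char` and `-funsigned-char`.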
