[mono] Use unsigned char when computing UTF8 string hashes #21632

Merged: akoeplinger merged 1 commit into mono:main from lambdageek:fix-gh-runtime-82187 on Mar 13, 2023


@lambdageek (Member) commented on Mar 11, 2023

Backport of dotnet/runtime#83273 to mono/mono `main`

The C standard does not specify whether `char` is signed or unsigned; the choice is implementation-defined.

Android aarch64 makes a different choice than other platforms: plain `char` is unsigned there, while it is signed on at least macOS arm64 and Windows x64, so the same code gives different results.
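For anyone who wants to check a given toolchain, here is a minimal sketch. `plain_char_is_signed` is a hypothetical helper written for illustration, not something in Mono; it relies only on `CHAR_MIN` from `<limits.h>`.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper, not part of Mono: reports whether plain char
 * is signed for the compiler and target building this file. */
static bool plain_char_is_signed(void)
{
    /* CHAR_MIN is 0 when char is unsigned and negative when it is signed. */
    return CHAR_MIN < 0;
}
```

Building this once for the AOT host and once for the device target should show whether the two disagree.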

Mono uses `mono_metadata_str_hash` in the AOT compiler and AOT runtime to optimize class name lookup. As a result, classes whose names include bytes of multi-byte UTF-8 sequences (which have the high bit set) will hash differently in the AOT compiler and on the device whenever the two disagree on the signedness of `char`.
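A rough sketch of the mismatch, for illustration only. The two functions below are hypothetical and use a simple multiply-by-31 string hash as an assumption; they are not the actual `mono_metadata_str_hash` source. They force the signedness explicitly so the difference can be reproduced on any platform.

```c
#include <stdint.h>

/* Hypothetical: hashes the bytes as if plain char were signed.
 * Bytes >= 0x80 become negative and are sign-extended before being added. */
static uint32_t hash_as_signed(const char *s)
{
    const signed char *p = (const signed char *)s;
    uint32_t hash = 0;
    for (; *p; p++)
        hash = (hash << 5) - hash + (uint32_t)(int32_t)*p;
    return hash;
}

/* Hypothetical: hashes the bytes as unsigned char, so every byte
 * contributes a value in 0..255 regardless of the platform's char. */
static uint32_t hash_as_unsigned(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;
    uint32_t hash = 0;
    for (; *p; p++)
        hash = (hash << 5) - hash + *p;
    return hash;
}
```

For a pure ASCII name such as `"Foo"` both functions agree. For a name containing `"\xC3\xA9"` (the UTF-8 encoding of `é`) they return different values, which is the kind of lookup miss described above. Reading the bytes through `unsigned char` makes the result independent of the platform's choice.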

Contributes to dotnet/runtime#82187
Contributes to dotnet/runtime#78638

@lambdageek (Member, Author) commented:

/backport to 2020-02

@github-actions (Contributor) commented:

Started backporting to 2020-02: https://github.com/mono/mono/actions/runs/4392747411

@akoeplinger merged commit ef0450d into mono:main on Mar 13, 2023
ThomasKuehne pushed a commit to ThomasKuehne/mono that referenced this pull request Mar 23, 2024