Fix and improve Unicode escape sequence info (C#) #13162
Merged
Remove erroneous note regarding `\U` being used for specifying surrogate pairs. That note was false, given that a) specifying a surrogate pair results in a compiler error, and b) specifying any valid code point / UTF-32 code unit returns the correct Unicode character for that code point. The `\U` escape can also be used for BMP characters. An example on IDE One shows that `\U0001F47E` works and that its surrogate pair, written as `\UD83DDC7E`, does not. Related issue: "\U" Unicode escape sequence for strings accepts invalid value instead of raising error #15456
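The behavior described above can be sketched in a few lines of C#. This is a minimal illustration written as top-level statements (assuming C# 9 or later); the variable names are made up for the example, and the non-compiling form is left commented out, since the description only states that it produces a compiler error.

```csharp
using System;

// One supplementary code point written with the 8-digit \U escape.
string alien = "\U0001F47E";

// The same character written as its two UTF-16 surrogates via \u.
string alienSurrogates = "\uD83D\uDC7E";

// \U is not limited to supplementary characters: U+0041 is in the BMP.
string letterA = "\U00000041";

// Per the PR description, packing the surrogate pair into a single \U
// escape does not compile, so it stays commented out here:
// string broken = "\UD83DDC7E";

Console.WriteLine(alien == alienSurrogates);                     // True
Console.WriteLine(alien.Length);                                 // 2 (UTF-16 code units)
Console.WriteLine(char.ConvertToUtf32(alien, 0).ToString("X"));  // 1F47E
Console.WriteLine(letterA);                                      // A
```

The string still holds two UTF-16 code units at run time; the `\U` escape only changes how the code point is written in source.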
Correctly indicated that `\U` is for a 4-byte UTF-32 value, and `\u` is for a 2-byte UTF-16 value.
Show the pattern and an example to be more readable / helpful. Please note that `\U00nnnnnn` has two permanent zeros and only 6 user-supplied hex digits. This is not only accurate (since those first two digits can only ever be zeros), it also removes any possibility of interpreting the 8 hex digits as a surrogate pair (which can never start with two zeros), hence reducing confusion.
Properly formatted escape sequences as inline code.
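To make the two patterns concrete, here is a small sketch, again as top-level statements (C# 9 or later assumed). `ToLongEscape` is a hypothetical helper introduced only for this illustration; it is not part of .NET or of the documentation being changed.

```csharp
using System;

// \unnnn: exactly four hex digits, one UTF-16 code unit.
string eAcute16 = "\u00E9";

// \U00nnnnnn: exactly eight hex digits, one code point (UTF-32 value).
string eAcute32 = "\U000000E9";

// The highest Unicode code point is U+10FFFF, so the first two of the
// eight digits are zeros even at the top of the range.
string last = "\U0010FFFF";

// Hypothetical helper: renders a code point in the \U00nnnnnn pattern.
static string ToLongEscape(int codePoint) => "\\U" + codePoint.ToString("X8");

Console.WriteLine(eAcute16 == eAcute32);                        // True
Console.WriteLine(last.Length);                                 // 2
Console.WriteLine(ToLongEscape(0x1F47E));                       // \U0001F47E
Console.WriteLine(ToLongEscape(char.ConvertToUtf32(last, 0)));  // \U0010FFFF
```

Because no valid code point exceeds six hex digits, the formatted escape always begins with `00`, which matches the `\U00nnnnnn` pattern.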
Added a warning about using the `\x` escape with fewer than 4 hex digits. For more info on this, please see: Unicode Escape Sequences Across Various Languages and Platforms (including Supplementary Characters)
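The pitfall behind that warning can be shown with a short sketch (top-level statements, C# 9 or later assumed; the sample strings are invented for illustration). `\x` accepts one to four hex digits, so any hex digit characters that follow a short escape are absorbed into it.

```csharp
using System;

// Intended text: U+00A1 (inverted exclamation mark) followed by "Ahoy".
string intended = "\u00A1" + "Ahoy";

// With only two digits, the following 'A' is also a hex digit, so the
// compiler reads \xA1A (U+0A1A) and only "hoy" remains as plain text.
string surprise = "\xA1Ahoy";

// Writing all four digits removes the ambiguity.
string safe = "\x00A1Ahoy";

Console.WriteLine(safe == intended);                    // True
Console.WriteLine(surprise == intended);                // False
Console.WriteLine(surprise.Length);                     // 4
Console.WriteLine(((int)surprise[0]).ToString("X4"));   // 0A1A
```

Using `\u` with its fixed four digits, or always padding `\x` to four digits, sidesteps the problem.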