Fix surrogate pair handling on Windows #1165
|
This patch mangles mismatched high/low surrogates. Admittedly this is a corner case and indicates buggy software somewhere anyway, so if there were a concrete performance hazard to doing it that way, I could live without it. Still, I prefer the style of the patch I wrote for gVim: on a normal character, flush any pending high surrogate, and on a low surrogate with no pending high surrogate, send the low surrogate. |
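For illustration, the pass-through behaviour described above could be sketched roughly as follows. This is not the code from this PR or from the gVim patch; `Decoded`, `SurrogateState`, and `decode_stream` are hypothetical names, and the sketch assumes input arrives as a stream of UTF-16 code units. Because a Rust `char` cannot hold a surrogate, unpaired surrogates are reported as raw `u16` values.

```rust
/// Hypothetical output type: a valid scalar value, or an unpaired
/// surrogate passed through untouched as a raw UTF-16 code unit.
#[derive(Debug, PartialEq, Eq)]
enum Decoded {
    Char(char),
    LoneSurrogate(u16),
}

/// Hypothetical decoder state: at most one high surrogate awaiting its partner.
#[derive(Default)]
struct SurrogateState {
    pending_high: Option<u16>,
}

impl SurrogateState {
    /// Feed one UTF-16 code unit and append whatever it produces to `out`.
    fn push(&mut self, unit: u16, out: &mut Vec<Decoded>) {
        match unit {
            // High surrogate: remember it, flushing any earlier unmatched one.
            0xD800..=0xDBFF => {
                if let Some(prev) = self.pending_high.replace(unit) {
                    out.push(Decoded::LoneSurrogate(prev));
                }
            }
            // Low surrogate: combine with a pending high, else send it alone.
            0xDC00..=0xDFFF => match self.pending_high.take() {
                Some(high) => {
                    let code = 0x10000
                        + ((u32::from(high) - 0xD800) << 10)
                        + (u32::from(unit) - 0xDC00);
                    // A well-formed pair always lands in 0x10000..=0x10FFFF.
                    let c = char::from_u32(code).expect("valid surrogate pair");
                    out.push(Decoded::Char(c));
                }
                None => out.push(Decoded::LoneSurrogate(unit)),
            },
            // Normal character: flush any pending high surrogate first.
            _ => {
                if let Some(prev) = self.pending_high.take() {
                    out.push(Decoded::LoneSurrogate(prev));
                }
                let c = char::from_u32(u32::from(unit)).expect("non-surrogate BMP unit");
                out.push(Decoded::Char(c));
            }
        }
    }
}

/// Convenience wrapper for feeding a whole slice of code units.
fn decode_stream(units: &[u16]) -> Vec<Decoded> {
    let mut state = SurrogateState::default();
    let mut out = Vec::new();
    for &unit in units {
        state.push(unit, &mut out);
    }
    out
}
```

In this sketch, a high surrogate still pending when the input ends stays in `pending_high`; a real event loop would need to decide when, if ever, to flush it.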
|
Ah yeah, that first case is a problem. I'll go ahead and address that. I've gotta ask, though - why would it be a good idea to send the mismatched low surrogate? There's nothing sensible you can do with that data, and the Unicode standard explicitly says not to encode surrogate code points in anything other than UTF-16. |
|
Mismatched surrogates are exceptional; they show up when you have a buggy IME. I don’t particularly expect to encounter them in the wild, but I’m confident that some users will encounter them at some point in time, and I prefer to pass through bad input rather than swallowing it or turning it into even worse input. That way you can build tools like gVim atop it, and receive the mismatched surrogates and do something sane with them, or at least inspect what they are. I have just realised, however, that |
|
Actually, now that you bring that up, I have no idea why we're using a transmute there. The standard library provides perfectly good char conversion methods so we really should be using those. |
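As a rough sketch of what the standard-library route might look like (the code that currently uses `transmute` is not shown in this thread, so the helper names `checked_char` and `decode_with_std` are hypothetical): `char::from_u32` rejects surrogates and out-of-range values instead of producing an invalid `char`, and `char::decode_utf16` reports each unpaired surrogate as an error that still carries the original code unit.

```rust
/// Hypothetical helper: checked conversion from a raw code point.
/// Returns `None` for surrogates (0xD800..=0xDFFF) and values above 0x10FFFF,
/// where a `transmute` would instead create an invalid `char`.
fn checked_char(code: u32) -> Option<char> {
    char::from_u32(code)
}

/// Hypothetical helper: decode UTF-16 code units with the standard library.
/// Well-formed pairs become `Ok(char)`; an unpaired surrogate becomes
/// `Err(unit)` so the caller can still inspect or forward it.
fn decode_with_std(units: &[u16]) -> Vec<Result<char, u16>> {
    char::decode_utf16(units.iter().copied())
        .map(|r| r.map_err(|e| e.unpaired_surrogate()))
        .collect()
}
```

Keeping the `Err(u16)` rather than substituting U+FFFD is one way to leave the pass-through question raised earlier in the thread open to the caller.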
- `cargo fmt` has been run on this branch
- `CHANGELOG.md` if knowledge of this change could be valuable to users

Should fix #1164 and alacritty/alacritty#2796.