Switch Windows 10 Console to UTF-8 Encoding#1416
Conversation
|
What is happening here seems rather safe to me, but it would require a more informed review. (There is in particular the question of whether we want this into 4.06.0). |
byterun/caml/osdeps.h
Outdated
| extern void caml_setup_terminal(void); | ||
| #else | ||
| #define caml_setup_terminal() | ||
| #endif |
There was a problem hiding this comment.
I have not seen this kind of extern-or-define-away trick before, and I wonder whether it was completely safe -- or if there is a risk of, say, link-time incompatibilities caused by a runtime compiled with the version and an application compiled without.
There was a problem hiding this comment.
This is an internal function only, so I think it's fine - you'd never be linking with the runtime in order to get this function.
There was a problem hiding this comment.
This is minor, but why not rename caml_probe_os_version and caml_setup_terminal in win32.c to caml_probe_win32_version and caml_setup_win32_terminal, declare those functions in osdeps.h (at the end of the file there is a part protected by #ifdef _WIN32), and move the #if's above to caml_sys_init below (without the #else part) ? Personally I find the flow easier to understand like that: it is clear that the code is only relevant for Win32 when looking at caml_sys_init and there is no need to use the extern-or-define-away idiom.
There was a problem hiding this comment.
I have no strong opinion, and given that you and @nojb are likely to be in charge of understanding this code for the years to come it makes sense to pick something that you are both comfortable with.
There was a problem hiding this comment.
I am not a big fan of the #define-function-name-to-nothing trick, and I agree with @nojb's suggestion (assuming I understand it correctly). Feel free to guard both declaration in osdeps.h and use in sys.c with #ifdef _WIN32. Possibly, caml_win32_setup_terminal should be defined in win32.c regardless of WINDOWS_UNICODE (and do nothing if it's not defined).
|
@nojb, @xavierleroy, @damiendoligez, I would be interested in your opinions on merging this part of the patch in the 4.06 release branch (independently from the larger #1408 that also deserves discussion, or of Xavier's comments on #1406 that are fairly orthogonal). I think that @dra27 would be pleased if this was included in time for beta2, and this is nice. Also, I have the impression that the kind of changes involved here, if they fail, would fail at the startup of any OCaml program, so rather easy to detect when @dra27 will do his magic testing of arcane Windows combinations before the release. |
|
For my part I will be reviewing this PR at some point today. |
|
A priori I'm in favor. I'll have minor comments on the implementation. But I'll wait for some reviewing on my recent PRs before doing more reviewing... |
|
LGTM, except for the small minor point about the |
|
@xavierleroy I think it would be nice to merge this in 4.06, so any additional comments would be very welcome. |
xavierleroy
left a comment
There was a problem hiding this comment.
Globally I'm happy with this code, and I support @nojb's suggested changes.
byterun/caml/osdeps.h
Outdated
| extern void caml_setup_terminal(void); | ||
| #else | ||
| #define caml_setup_terminal() | ||
| #endif |
There was a problem hiding this comment.
I am not a big fan of the #define-function-name-to-nothing trick, and I agree with @nojb's suggestion (assuming I understand it correctly). Feel free to guard both declaration in osdeps.h and use in sys.c with #ifdef _WIN32. Possibly, caml_win32_setup_terminal should be defined in win32.c regardless of WINDOWS_UNICODE (and do nothing if it's not defined).
| extern unsigned short caml_win32_minor; | ||
| extern unsigned short caml_win32_build; | ||
| extern unsigned short caml_win32_revision; | ||
|
|
There was a problem hiding this comment.
Presently, I don't think you're using these 4 variables outside of win32.c, so the principle of encapsulation says they should be static in win32.c and not declared in osdeps.h.
However, it is entirely plausible that we will later have other uses of those variables elsewhere, in which case the declaration makes sense.
There was a problem hiding this comment.
In the earlier version, they were genuinely extern as they were referred to in sys.c, but I envisaged their being useful to have globally - in fact, it's a mistake that I'd put them inside a CAML_INTERNALS block by mistake. May I make them a global part of osdeps.h (guarded by #ifdef _WIN32 - although in principle these could be made available to the Cygwin port as well)
| Windows which is running */ | ||
| caml_probe_os_version(); | ||
| caml_setup_terminal(); | ||
| #ifdef CAML_WITH_CPLUGINS |
There was a problem hiding this comment.
Guarding this with #ifdef _WIN32 would make @nojb and I happy, I think.
There was a problem hiding this comment.
This is now done - it makes me happy too!
| { | ||
| if (caml_win32_major >= 10) | ||
| SetConsoleOutputCP(CP_UTF8); | ||
| } |
There was a problem hiding this comment.
Maybe you want to guard with #ifdef WINDOWS_UNICODE. Or not?
There was a problem hiding this comment.
I think it's clearer for the #if WINDOWS_UNICODE to be sys.c? So the function always does this, but we only actually call it if WINDOWS_UNICODE=1
| @@ -115,8 +115,8 @@ LDFLAGS=/ENTRY:wmainCRTStartup | |||
| ### Libraries needed | |||
| #EXTRALIBS=bufferoverflowu.lib # for the old PSDK compiler only | |||
| EXTRALIBS= | |||
There was a problem hiding this comment.
The EXTRALIBS variable is now dead so you could remove those two lines entirely. Or are there other uses of EXTRALIBS in other Makefiles?
There was a problem hiding this comment.
Note: we discussed in the other PR with those commits, #1408 (comment), and it turns out EXTRALIBS is used in stdlib/Makefile.
There was a problem hiding this comment.
EXTRALIBS should be dead, but I didn't quite have the guts to remove it as part of this (it should really be that the person configuring OCaml adds them to the appropriate place in BYTECCLIBS, etc.). What I did have the guts (!!) to amend was the inconsistency between msvc64 and mingw64 where EXTRALIBS is concerned and also that byterun/Makefile manually specifies the libraries to link rather than correctly using BYTECCLIBS.
There was a problem hiding this comment.
OK, let's keep EXTRALIBS around for the moment.
|
Thanks for the reviews @nojb and @xavierleroy - I hope I've successfully address everything here. If those two new commits are OK, I'll rebase them into the previous ones. I think for the Unicode stuff, it is useful to have a clearly bisectable history. |
|
( @nojb, @xavierleroy, would one of you want to give this a nod of approval? ) |
byterun/unix.c
Outdated
| int caml_channel_flags(int fd) | ||
| { | ||
| return 0; | ||
| } |
There was a problem hiding this comment.
Is this change meant to be in this PR?
There was a problem hiding this comment.
It's the Unix version of the new caml_channel_flags - previously, it was specified using #define caml_channel_flags(fd) 0
There was a problem hiding this comment.
Sorry if I am missing something obvious, but I don't see any other change related to caml_channel_flags apart from this function definition in byterun/unix.c. Why is it necessary for this PR?
There was a problem hiding this comment.
Oh goodness, more ☕ required. You are correct, sorry - it's harmless, but this function should be in the other pull request 😊
|
Looks OK to me, except for the change to |
shell32 was needed at one point during development - it got removed from config/Makefile.m*, but not here.
Eliminate duplication between config/Makefile and byterun/Makefile
EXTRALIBS is largely a historic artefact anyway (it applies to a very old Platform SDK build) but EXTRALIBS is supposed to be manually included where needed, rather than embedded in BYTECCLIBS and NATIVECCLIBS.
Determine the running version of Windows and load the 4 components into caml_win32_major, caml_win32_minor, caml_win32_build and caml_win32_revision. Starting from Windows 8.1, Microsoft made the entirely flawed decision to start lying to processes about the version of Windows on which they're running which means that the classic GetVersion and GetVersionEx calls can no longer be used. The code here determines the Windows version by reading the version information block from kernel32.dll.
39477fb to
e43c169
Compare
|
Rebased - and erroneous update to |
For WINDOWS_UNICODE=1, enabled UTF-8 output on the Windows Console. In Windows 10 and later, this simply works. However, prior to Windows 10 we hit MPR#6925 because there is an error in the implementation of WriteConsole when CP_UTF8 is selected.
e43c169 to
bf1038c
Compare
|
Look OK now! Thanks! |
xavierleroy
left a comment
There was a problem hiding this comment.
If formal approval is needed, here it is!
|
Cherry-picked to 4.06 in 680556f |
Split off from #1408.