Skip to content

Switch Windows 10 Console to UTF-8 Encoding#1416

Merged
dra27 merged 5 commits intoocaml:trunkfrom
dra27:windows10-unicode
Oct 12, 2017
Merged

Switch Windows 10 Console to UTF-8 Encoding#1416
dra27 merged 5 commits intoocaml:trunkfrom
dra27:windows10-unicode

Conversation

@dra27
Copy link
Copy Markdown
Member

@dra27 dra27 commented Oct 8, 2017

Split off from #1408.

@gasche
Copy link
Copy Markdown
Member

gasche commented Oct 8, 2017

What is happening here seems rather safe to me, but it would require a more informed review. (There is in particular the question of whether we want this into 4.06.0).

extern void caml_setup_terminal(void);
#else
#define caml_setup_terminal()
#endif
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have not seen this kind of extern-or-define-away trick before, and I wonder whether it was completely safe -- or if there is a risk of, say, link-time incompatibilities caused by a runtime compiled with the version and an application compiled without.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an internal function only, so I think it's fine - you'd never be linking with the runtime in order to get this function.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is minor, but why not rename caml_probe_os_version and caml_setup_terminal in win32.c to caml_probe_win32_version and caml_setup_win32_terminal, declare those functions in osdeps.h (at the end of the file there is a part protected by #ifdef _WIN32), and move the #if's above to caml_sys_init below (without the #else part) ? Personally I find the flow easier to understand like that: it is clear that the code is only relevant for Win32 when looking at caml_sys_init and there is no need to use the extern-or-define-away idiom.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nojb - I'd be happy with that. @gasche - this came about through trying to step back from too much Windows-specific stuff in sys.c. Did I go too far? 🙂

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no strong opinion, and given that you and @nojb are likely to be in charge of understanding this code for the years to come it makes sense to pick something that you are both comfortable with.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not a big fan of the #define-function-name-to-nothing trick, and I agree with @nojb's suggestion (assuming I understand it correctly). Feel free to guard both declaration in osdeps.h and use in sys.c with #ifdef _WIN32. Possibly, caml_win32_setup_terminal should be defined in win32.c regardless of WINDOWS_UNICODE (and do nothing if it's not defined).

@gasche
Copy link
Copy Markdown
Member

gasche commented Oct 9, 2017

@nojb, @xavierleroy, @damiendoligez, I would be interested in your opinions on merging this part of the patch in the 4.06 release branch (independently from the larger #1408 that also deserves discussion, or of Xavier's comments on #1406 that are fairly orthogonal).

I think that @dra27 would be pleased if this was included in time for beta2, and this is nice. Also, I have the impression that the kind of changes involved here, if they fail, would fail at the startup of any OCaml program, so rather easy to detect when @dra27 will do his magic testing of arcane Windows combinations before the release.

@nojb
Copy link
Copy Markdown
Contributor

nojb commented Oct 9, 2017

For my part I will be reviewing this PR at some point today.

@xavierleroy
Copy link
Copy Markdown
Contributor

A priori I'm in favor. I'll have minor comments on the implementation. But I'll wait for some reviewing on my recent PRs before doing more reviewing...

@nojb
Copy link
Copy Markdown
Contributor

nojb commented Oct 9, 2017

LGTM, except for the small minor point about the #ifdef's.

@gasche
Copy link
Copy Markdown
Member

gasche commented Oct 10, 2017

@xavierleroy I think it would be nice to merge this in 4.06, so any additional comments would be very welcome.

Copy link
Copy Markdown
Contributor

@xavierleroy xavierleroy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Globally I'm happy with this code, and I support @nojb's suggested changes.

extern void caml_setup_terminal(void);
#else
#define caml_setup_terminal()
#endif
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not a big fan of the #define-function-name-to-nothing trick, and I agree with @nojb's suggestion (assuming I understand it correctly). Feel free to guard both declaration in osdeps.h and use in sys.c with #ifdef _WIN32. Possibly, caml_win32_setup_terminal should be defined in win32.c regardless of WINDOWS_UNICODE (and do nothing if it's not defined).

extern unsigned short caml_win32_minor;
extern unsigned short caml_win32_build;
extern unsigned short caml_win32_revision;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Presently, I don't think you're using these 4 variables outside of win32.c, so the principle of encapsulation says they should be static in win32.c and not declared in osdeps.h.

However, it is entirely plausible that we will later have other uses of those variables elsewhere, in which case the declaration makes sense.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the earlier version, they were genuinely extern as they were referred to in sys.c, but I envisaged their being useful to have globally - in fact, it's a mistake that I'd put them inside a CAML_INTERNALS block by mistake. May I make them a global part of osdeps.h (guarded by #ifdef _WIN32 - although in principle these could be made available to the Cygwin port as well)

Windows which is running */
caml_probe_os_version();
caml_setup_terminal();
#ifdef CAML_WITH_CPLUGINS
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Guarding this with #ifdef _WIN32 would make @nojb and I happy, I think.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is now done - it makes me happy too!

{
if (caml_win32_major >= 10)
SetConsoleOutputCP(CP_UTF8);
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe you want to guard with #ifdef WINDOWS_UNICODE. Or not?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's clearer for the #if WINDOWS_UNICODE to be sys.c? So the function always does this, but we only actually call it if WINDOWS_UNICODE=1

@@ -115,8 +115,8 @@ LDFLAGS=/ENTRY:wmainCRTStartup
### Libraries needed
#EXTRALIBS=bufferoverflowu.lib # for the old PSDK compiler only
EXTRALIBS=
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The EXTRALIBS variable is now dead so you could remove those two lines entirely. Or are there other uses of EXTRALIBS in other Makefiles?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: we discussed in the other PR with those commits, #1408 (comment), and it turns out EXTRALIBS is used in stdlib/Makefile.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EXTRALIBS should be dead, but I didn't quite have the guts to remove it as part of this (it should really be that the person configuring OCaml adds them to the appropriate place in BYTECCLIBS, etc.). What I did have the guts (!!) to amend was the inconsistency between msvc64 and mingw64 where EXTRALIBS is concerned and also that byterun/Makefile manually specifies the libraries to link rather than correctly using BYTECCLIBS.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, let's keep EXTRALIBS around for the moment.

@damiendoligez damiendoligez added this to the 4.06.0 milestone Oct 11, 2017
@dra27
Copy link
Copy Markdown
Member Author

dra27 commented Oct 11, 2017

Thanks for the reviews @nojb and @xavierleroy - I hope I've successfully address everything here. If those two new commits are OK, I'll rebase them into the previous ones. I think for the Unicode stuff, it is useful to have a clearly bisectable history.

@gasche
Copy link
Copy Markdown
Member

gasche commented Oct 12, 2017

( @nojb, @xavierleroy, would one of you want to give this a nod of approval? )

byterun/unix.c Outdated
int caml_channel_flags(int fd)
{
return 0;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this change meant to be in this PR?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's the Unix version of the new caml_channel_flags - previously, it was specified using #define caml_channel_flags(fd) 0

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry if I am missing something obvious, but I don't see any other change related to caml_channel_flags apart from this function definition in byterun/unix.c. Why is it necessary for this PR?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh goodness, more ☕ required. You are correct, sorry - it's harmless, but this function should be in the other pull request 😊

@nojb
Copy link
Copy Markdown
Contributor

nojb commented Oct 12, 2017

Looks OK to me, except for the change to byterun/unix.c that does not seem to belong here.

dra27 added 4 commits October 12, 2017 15:06
shell32 was needed at one point during development - it got removed from
config/Makefile.m*, but not here.
Eliminate duplication between config/Makefile and byterun/Makefile
EXTRALIBS is largely a historic artefact anyway (it applies to a very
old Platform SDK build) but EXTRALIBS is supposed to be manually
included where needed, rather than embedded in BYTECCLIBS and
NATIVECCLIBS.
Determine the running version of Windows and load the 4 components into
caml_win32_major, caml_win32_minor, caml_win32_build and
caml_win32_revision.

Starting from Windows 8.1, Microsoft made the entirely flawed decision
to start lying to processes about the version of Windows on which
they're running which means that the classic GetVersion and GetVersionEx
calls can no longer be used.

The code here determines the Windows version by reading the version
information block from kernel32.dll.
@dra27 dra27 force-pushed the windows10-unicode branch from 39477fb to e43c169 Compare October 12, 2017 14:11
@dra27
Copy link
Copy Markdown
Member Author

dra27 commented Oct 12, 2017

Rebased - and erroneous update to byterun/unix.c removed

For WINDOWS_UNICODE=1, enabled UTF-8 output on the Windows Console. In
Windows 10 and later, this simply works. However, prior to Windows 10 we
hit MPR#6925 because there is an error in the implementation of
WriteConsole when CP_UTF8 is selected.
@dra27 dra27 force-pushed the windows10-unicode branch from e43c169 to bf1038c Compare October 12, 2017 14:12
@nojb
Copy link
Copy Markdown
Contributor

nojb commented Oct 12, 2017

Look OK now! Thanks!

Copy link
Copy Markdown
Contributor

@xavierleroy xavierleroy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If formal approval is needed, here it is!

@dra27 dra27 merged commit 1a8fff4 into ocaml:trunk Oct 12, 2017
@dra27
Copy link
Copy Markdown
Member Author

dra27 commented Oct 12, 2017

Cherry-picked to 4.06 in 680556f

@dra27 dra27 deleted the windows10-unicode branch July 6, 2021 14:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants