Skip to content

Translate OCAML_STDLIB_DIR correctly on Windows#1448

Merged
damiendoligez merged 1 commit intoocaml:trunkfrom
dra27:fix-ocamlstdlib-default
Oct 27, 2017
Merged

Translate OCAML_STDLIB_DIR correctly on Windows#1448
damiendoligez merged 1 commit intoocaml:trunkfrom
dra27:fix-ocamlstdlib-default

Conversation

@dra27
Copy link
Copy Markdown
Member

@dra27 dra27 commented Oct 25, 2017

See #1444.

Certainly for trunk. I agree with @gasche that describing this as build system change was a bit of a stretch. What is eliminated in byterun/dynlink.c is the copying of the string constant supplied by the build system, because the build system is altered to pass the string as UTF-8 in the first place.

The change eliminates a compiler warning emitted by cl if OCaml is compiled in a directory which includes UTF-8 sequences.

@gasche
Copy link
Copy Markdown
Member

gasche commented Oct 25, 2017

At first I thought that you assumed an environment variable to be unicode, which I found very confusing. But this is actually a macro defined by passing a -DFOO=BAR option to the C compiler invocation, and you changed BAR from "string" to L"string" under Windows.

Then a natural question is: would it make more sense to do this consistently for C macros? As far as I can see, the only other macro in that category is RUNTIME_NAME, and there are cases where it is defined with funky stuff inside ($(BINDIR) or $(TARGET_BINDIR)), but only under Unix systems. Still, wouldn't it make also pass RUNTIME_NAME as a L-string under Windows for consistency?

@shindere
Copy link
Copy Markdown
Contributor

shindere commented Oct 25, 2017 via email

Copy link
Copy Markdown
Contributor

@xavierleroy xavierleroy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Impressive scripting kung-fu, as usual.

I agree iconv can be assumed to be installed, but I'm worried about the JAVA encoding. On my Ubuntu 16.04 machine, iconv knows more than 1000 encodings but not any encoding named JAVA.

@xavierleroy
Copy link
Copy Markdown
Contributor

Post-scriptum: the iconv in Cygwin is from a different source than the iconv in Ubuntu, so maybe -t JAVA is OK after all. For reference, in Cygwin we have

$ iconv --version
iconv (GNU libiconv 1.14)
Copyright (C) 2000-2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Written by Bruno Haible.

and in Ubuntu we have

$ iconv --version
iconv (Ubuntu GLIBC 2.26-0ubuntu2) 2.26
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Written by Ulrich Drepper.

@dra27
Copy link
Copy Markdown
Member Author

dra27 commented Oct 25, 2017

From a quick perusal of the two repos, GNU libiconv has had -t JAVA since version 0.3 (1999 - and the oldest commit available in its git repo). AFAICT (I don't understand the politics/history for why there are two), glibc has its own entirely independent implementation of iconv which appears not to have either C99 or Java "encodings".

Cygwin relies (and always will rely) on newlib, not glibc, so I think this should be fine. If that changed, we'dI'd just have to write a script...!

@dra27 dra27 force-pushed the fix-ocamlstdlib-default branch from 74f2b1e to 6267424 Compare October 25, 2017 20:39
@gasche
Copy link
Copy Markdown
Member

gasche commented Oct 26, 2017

(no opinion on RUNTIME_NAME?)

@gasche gasche added this to the consider-for-4.06.0 milestone Oct 26, 2017
@dra27
Copy link
Copy Markdown
Member Author

dra27 commented Oct 26, 2017

@gasche - sorry, I forgot to look at RUNTIME_NAME last night. I don't think we should change this one, at least not for consistency reasons. RUNTIME_NAME is not translated to a Unicode string (at least directly) in headernt.c - it's used to search PATH.

The only time this could benefit is if we wanted to have non-ASCII characters in the name of the runtime, which would be unlikely - if my understanding of headernt.c is fully correct, passing a full path in RUNTIME_NAME would not work at all on Windows and I can't imagine a situation where we'd want to extend it to do this. Hard-coding binary locations is an unspeakably awful thing to do on Windows: although on most computers, Windows is installed in C:\Windows, even that is not compulsory and the default directory in which programs are installed, C:\Program Files, is a localised string (its location should be properly read from $PROGRAMFILES, as we do in our scripts.

@dra27 dra27 force-pushed the fix-ocamlstdlib-default branch from 6267424 to ddaeb48 Compare October 26, 2017 08:33
@gasche
Copy link
Copy Markdown
Member

gasche commented Oct 26, 2017

That's very clear, thanks for the explanation. I think that my consistency-itch could be resolved by explaining it away: you could have a comment at the place where OCAML_STDLIB_DIR is defined, or at the place where it is used, or both, to note that it is passed as a L-string but that other C configuration variables may not.

@gasche
Copy link
Copy Markdown
Member

gasche commented Oct 26, 2017

(While you are at it, there could be a comment around the iconv call on which versions have this JAVA charset, so that Xavier's surprise gets recorded for eternity.)

The build generates invalid code in byterun/dynlink.c if LIBDIR contains
non-1252 characters. Use iconv to translate LIBDIR to JAVA which, unlike
C99 will produce only \u sequences with no \U sequences. The \u is then
translated to \x which is the only form recognised pre-Visual Studio 2013.
@dra27 dra27 force-pushed the fix-ocamlstdlib-default branch from ddaeb48 to 0739677 Compare October 26, 2017 09:04
@dra27
Copy link
Copy Markdown
Member Author

dra27 commented Oct 26, 2017

That's a good point - I tend to err too much towards commit messages, rather than comments. I've put a comment at the definition of OCAML_STDLIB_DIR in byterun/Makefile explaining why and how.

@shindere
Copy link
Copy Markdown
Contributor

shindere commented Oct 26, 2017 via email

@damiendoligez damiendoligez merged commit 3b434eb into ocaml:trunk Oct 27, 2017
gasche pushed a commit that referenced this pull request Oct 27, 2017
The build generates invalid code in byterun/dynlink.c if LIBDIR contains
non-1252 characters. Use iconv to translate LIBDIR to JAVA which, unlike
C99 will produce only \u sequences with no \U sequences. The \u is then
translated to \x which is the only form recognised pre-Visual Studio 2013.
(cherry picked from commit 3b434eb)
@gasche
Copy link
Copy Markdown
Member

gasche commented Oct 27, 2017

Cherry-picked in 4.06 : 7f21a3f.

@dra27 dra27 deleted the fix-ocamlstdlib-default branch July 6, 2021 14:05
EmileTrotignon pushed a commit to EmileTrotignon/ocaml that referenced this pull request Jan 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants