Enable force-safe-string by default#1859
Conversation
783a112 to
6eaceb1
Compare
6eaceb1 to
7b3f812
Compare
|
There is an intermediary knob that we could turn before the one you propose, which is to enable Having If we release a version of OCaml with your more radical patch, there is no way to revert back to |
|
P.S.: If some people want more context about the configure-time flags for string mutability, the |
|
@gasche I don't really understand you "runtime change" arguments. The changes here are really minimal (less than 20 lines there were already there and used anyway) and I don't see how it could go wrong considering the amount of people already using 4.06 (and soon 4.07) already. But I guess I can make another PR for what you are suggesting if you really want to, but apart from a few days delay to fix an hypothetical problem I don't really see the point, personally (but maybe I'm missing something in the picture) |
|
Apart from that, does anybody have any idea why the CIs are failing ? |
|
I encourage the efforts to move towards the removal of the legacy mode. But since the current default still allows for enabling unsafe-string at compile-time, there was no real strong pressure for projects to switch to it, it was just a matter of locally tweaking the build system of projects not ready for safe-strings. A version where the default configuration prevents building projects not prepared for safe-strings will apply a much greater pressure to authors to upgrade their code, while still avoiding the situation where people cannot upgrade to a more recent OCaml because of some dependencies not ready for safe-strings. |
|
@kit-ty-kate Using |
These are important packages, though. I've submitted a PR for camlpdf. cc: @ncannasse regarding haxe. |
|
my system used OCS Scheme and Dypgen neither of which works with safe strings. I have cut back OCS Scheme to use only safe strings and removed a lot of Scheme due to removal of big nums. I have modified Dypgen's run time library but have been unable to fix the parser generator executable itself. I do not use opam and will not do so. It doesn't work on Windows and it introduces an unnecessary dependency since my system only builds a single Ocaml executable (which is a compiler which generates C++, the resulting code having no dependency on Ocaml). The original authors of the two packages I have used have failed to upgrade them or take notice of my changes. Breaking legacy code without an automated upgrade path is a symptom of poor design and a failure to support client base which does not suggest reliability or stability of the Ocaml system. |
The pressure comes from the OPAM repo maintainers, who systematically push back against patches that just change the compilation options of a package (in the |
Upstream authors still had the possibility to add -safe-string in their build system; as long as the default configure mode did not prevent that, the pressure to actually modify the code was still gentle. Switching the default configure mode to break such package is probably necessary at some point, but already a bit harsh (users of those package will have to using a special compiler switch, but it's fine with OPAM); then wait for the next release to drop completely support for unsafe strings. |
|
Also, @skaller reports that more packages are apparently not ready for safe strings, including some of them part of OPAM. One would need to be more confident in the impact assessment before breaking the default mode, I'd say. |
|
I think the next step would be to turn warnings into errors. Regarding haxe we already moved to safe strings in our 4.x version which is currently in development. |
|
@alainfrisch I'm basically in agreement with you, and let's not forget orphaned/unmaintained packages too. |
|
|
I keep thinking it would be good to have a repository for orphaned packages to add patches for such things as safe strings. |
|
Please lets not forget not everyone uses OPAM. First it doesn't work on Windows. I mean the real Windows using MSVC tools. Second, requiring OPAM in a system which has a mix of tools is a major extra burden for non-Ocaml people. My own system is an example: it's a programming language with a compiler written in Ocaml that generates C++, uses a run time library written in C++, and is designed to interface with C++. I expect clients to install a C++ compiler and several C/C++ libraries but they shouldn't need to know about Ocaml at all. As it is, weak to non-existent support on the most common desktop platform, Windows, makes installing Ocaml a nightmare already. The fact is the attempt to fix the design fault in the Ocaml compiler causing breakages, together with the removal of bignum support, is enough of a pain to seriously consider using Haskell or even C++ or bootstrapping my system to avoid such issues in the future. In my view unless an automatic upgrade tool can be written, the change has to be deemed ill considered. And while on the issue Lexing should be fixed so it can take a value of type bytes (not just string). |
|
Dear @skaller, we are happy to support non-OPAM workflows. It just comes in the discussion as the most commonly-used package manager for community-contributed OCaml libraries, and thus an easy place for us to monitor to understand the impact of language changes. It also happens that patches to OCaml software get applied at the OPAM level in some occasions, but the author/owner/maintainer of the relevant code is notified, and the patch is publicly available for everyone to apply on their own copy of the software. (In fact OPAM maintainers frown upon those opam-only changes and are trying to stop doing that completely.) I agree with you that an automated code-migration tool would be nice, but so far none exists for OCaml programs to my knowledge. If you know someone willing to invest the effort in building one, please let us know! Regarding Lexing, are you asking for a |
|
Hi @gasche, yes I understand OPAM is popular and serves well as a monitoring tool, and is a pretty good package manager for unix like systems. A bit of a pity its not written in Ocaml using Ocaml's platform independent facilities, but then that would be a lot of work to support mixed Ocaml/C packages. My comment on migration comes at least from a lot of experience with those kind of issues on the C++ committee regarding both retaining C compatibility as well as compatibility with older C++ versions. Of course the installed base is very much larger there and over much weight given to compatibility can inhibit advances requiring restructuring. However generally if you have very large code bases (millions of lines of code using large numbers of third party packages), an upgrade that leaves such a large client in the lurch is not a good way to encourage growth of the client base. The general rule is to leave old stuff alone and add new stuff, and that rule was broken here. I've seen some pretty sophisticated analysis tools for Ocaml so my feeling is that until someone produces an automatic upgrade tool, enforcing an upgrade simply isn't viable. I don't have the resources to produces such a tool. I also wouldn't have fixed the compiler bug by forcing users to change perfectly good code. Regarding Lexing, Lexing.from_bytes is what might help when parser input is constructed non-functionally: Dypgen at least does that, adding #line directives after purely functional text construction, seemingly this was a hack added after the original code was written. Anyhow it seems surprising that the primary use of Lexing.from_string is going to be from constructed text or, possibly text loaded into a buffer from a file, suggesting Lexing.from_bytes would be useful. Lexemes that require modification are rare and lexemes are usually fairly short so manually converting a string to bytes when required is OK in my opinion (i.e. no Lexing.lexeme_bytes seems required). |
|
In OPAM I count 80 packages whose latest version is broken by safe-string, and whose package description is not yet updated (i.e. they fail to compile but are not yet marked incompatible with 4.06+). And 9 other packages that depend on some of these 80. And that's not even counting the packages that are marked incompatible but don't have a newer version. Or any software that's not on OPAM. I think it's too soon to force safe strings on the OCaml world at this time, so I'll mark this PR as work-in-progress. |
Is this list available anywhere? |
|
https://damiendoligez.github.io/opamcheck-results/2018-06-29/index.html Look for a red uppercase X in the first column. |
Thanks! Many failures due to |
So it should be possible to fix many failures at once just by fixing what oasis generates? |
In theory, yes (at least if the setup script is not committed). I am looking into it. |
|
Even if the setup script is committed, it should still allow an easier update for the maintainer, yes? |
|
Kate (2019/08/28 12:53 +0000):
> One test fails because it uses the `-unsafe-string` option so that can probably be fixed.
What test is it? I can't find it and all the tests that are failing in
the logs seem to be due to the difference between bytecode and native
compiler. At least from what I can see
See `tests/ppx-contexts/test.ml` and look for "unsafe-string" there.
Note however that if the `-unsafe-string` command-line flag is removed
and the test.compilers.reference file modified accordingly, then the
test works again but there may be a deeper modification to do here.
|
It was already fixed by ef1ae7c, I'm guessing you must have looked at a previous CI run. |
|
Kate (2019/08/28 06:22 -0700):
> See `tests/ppx-contexts/test.ml` and look for "unsafe-string" there. Note however that if the `-unsafe-string` command-line flag is removed and the test.compilers.reference file modified accordingly, then the test works again but there may be a deeper modification to do here.
It was already fixed by
ef1ae7c, I'm guessing you must have looked at a previous CI run.
I'm sorry, I just tried the tests on trunk with the
`--enable-force-safe-string` configure option. My bad.
|
…n ocamlopt.opt and ocamlopt.byte after switching to force-safe-string by default
|
I added a commit disabling the binary diff test for the few tests that are actually different with force-safe-string enabled. If you prefer to have this commit as a standalone PR, tell me and I'll do it. If someone is not convinced and would prefer one of the other solutions to the problem as discussed in #8853, I can remove this commit and we can discuss it further. |
|
@kit-ty-kate, regarding your commit So, are we sure this can't be fixed? It would be good to document in FWIW, it's possible to embed OCaml comments in ocamltest blocks. |
|
@shindere - it should be fixable by altering ocamlc to share string constants in the same way as clambda does (it's noted in #8853), but I don't think it's necessary to hold up this PR waiting for that to be resolved. @mshinwell and I checked carefully that those 3 failing tests differ only because a marshalled data structure has a slightly different physical representation when force-safe-string is enabled. |
|
David Allsopp (2019/09/10 01:41 -0700):
@shindere - it should be fixable by altering ocamlc to share string
constants in the same way as clambda does (it's noted in #8853), but I
don't think it's necessary to hold up this PR waiting for that to be
resolved. @mshinwell and I checked carefully that those 3 failing
tests differ only because a marshalled data structure has a slightly
different physical representation when force-safe-string is enabled.
Then I'd say the comment could (should) refer either to #8853, or to the
comment which explains this.
|
…refer to an explaination of the problem
|
I just merged, thanks a lot for your work on this.
|
|
I'm still worried that the move to use |
|
Gabriel Scherer (2019/09/10 08:54 -0700):
I'm still worried that the move to use `-force-safe-string` now is too
early. I would have preferred to wait more, for example after 4.06
hits Debian stable -- so that everyone lives in the safe-string world.
Oh, I'm sorry that I merged so promptly then. My apologies.
Shall I revert?
|
Unsafe-strings have been deprecated for the past 5 years. Debian hasn't even upgraded ocaml above 4.05.0 in testing or even unstable. I don't think sticking with debian on this one is going to be any good if we want to go forward, the same way of thinking could apply to any of OCaml's breaking features in any versions since 4.05. Furthermore, we're not disabling unsafe-string completely, it would be still possible to have |
|
By the time 4.10 lands, Debian unstable should be at, or nearly at, 4.08 so they’ll be in the safe string world - and are unlikely to package 4.10 itself for a while (based on current packaging) |
Fix error in patch to configure.ac from #1859, with follow-on corrections
|
For your information, we do not plan on moving our (large) code bases to safe strings at the moment. In my opinion the removal of the compiler option -unsafe-string is premature and a bad decision. Let me comment on the rationales given for this PR: Safe strings have been available since 4.02 (4 years ago) Very few opam packages still use unsafe strings: I'm only aware of haxe.3.4.7 and camlpdf.2.2.1, maybe some others still use it under the table but very few of them remain. Deleting code is always good, it simplifies maintenance. As far as i can see, the rationales given for this PR are not valid at this stage, and this PR should not be merged. |
|
@tassig it was already merged and in the latest OCaml already. |
Note that if you need unsafe strings, you can configure with
My own personal understanding is that the migration towards safe strings is actively pushed for two main reasons:
|
flambda already takes (a small) advantage of immutability in force-safe-string mode |
|
@tassig wrote:
You are taking information from the open source project, but not giving much back. What is your timescale for moving to safe strings? A year, a decade, or no firm plans? Are you missing some tooling to help you migrate, or is there some other blocker? Do you remain on older versions of the compiler (in which case you can stick with safe strings for as long as you want), or do you want to keep up with newer OCaml while using older features? The more information you provide about your constraints, the more practical it is for us to try and help you. As far as I can tell, @nojb's suggestion about using |
|
@tassig I am a researcher interested in backwards-compatibility concerns in programming language evolution, and I would be like to hear more about your problem. You are not the first to try to make such points in the OCaml community, but these feedback from professionals have yet to be summarised as lessons that the hobbyist/academic developer can lean on. Can you please contact me at guillaume.munch-maccagnoni@inria.fr ? I could not find your contact information on your profile. |
This PR forces safe strings globally, effectively ending the possibility for projects or modules to use unsafe strings (previously enabled by
-unsafe-stringor settingOCAMLPARAM=safe-string=0,_).Rationales are:
haxe.3.4.7andcamlpdf.2.2.1, maybe some others still use it under the table but very few of them remain.I think it would be good to have it in 4.08 but if not, at least this PR will remain open to be merged for a later release.