Make the bootstrap process repeatable#11149
Conversation
|
Generally I like the idea, and most of the changes proposed sound fine. I don't have an opinion on the changes to float marshalling strategy. But: do we really need to marshal floats as part of the bootstrap? (I assume that this would come from float constants in the sources of |
|
I don't think we can prohibit floats - for example, the constant ocaml/stdlib/camlinternalFormat.ml Line 1479 in 0a9b6b8 and 1. and .0 this get emitted by this line pulled in by ocamllex from the merciless use of Bool.compare:Line 24 in 0a9b6b8 Just in case my description made it sound grander, the new flag to The other perfectly reasonable option would be either to actually prohibit bootstrapping on big-endian systems, just altering the first commit to reject the attempt or to accept that, a bit like regenerating |
|
Just for the sake of argument: why not always encode floats in little-endian mode? Are there any downsides to it (apart from making past marshalled floats incompatible with the new version)? |
Marshalling and unmarshalling of floats and esp. float arrays will be slower on big-endian machines, but:
The current implementation is designed so that no endianness change is needed in the common case where the same architecture marshals then unmarshals FP numbers. Maybe it's too clever for its own good. |
|
It also wouldn’t break incompatibility - there’d be no reason to remove the code for un-marshalling big-endian floats in intern. |
|
Given that no one’s throwing hands up in horror at the initial parts (common config + stripping CRCS), shall I split this, so that this PR gives repeatable bootstrapping on non-big-endian systems and then move the marshal part to a separate PR? If Xavier (as extern’s main author) is happy for the simplification, I’d be happy to open the PR as being the complete removal of big-endian extern-ing. IIRC we either benchmarked or subsequently found some Marshal-sensitive workloads as part of the removal of the blue colour, so those could be reused to check how bad the impact is on Z? |
|
Or, or course, just remove the Marshal stuff and block big-endian bootstrapping, but I don’t think anyone’s spoken up for that approach yet! |
|
I think your proposed PR split makes a lot of sense. |
e598f4b to
c6b6e14
Compare
|
Marshalling part removed, to be followed up separately. |
|
Why does the bootstrap procedure need to be disabled when either flat-float-array or force-safe-string are disabled? |
|
Two other questions: Do we really want that boot/ocamlc has a way to know it is different Why did you have to change the SUBST mechanism? This complication |
The disallowed options are those which would affect the generation of If you configure with Bootstrapping from a system which doesn't have flat float array doesn't fail per se, because the bytecode compiler doesn't containing any float array constants, but it does require a double-bootstrap (the first one will fail to reach a fix-point because of the configuration mismatch) - i.e. the first bootstrap systematically fails.
We don't have to - it's used solely to display better error messages if the boot compiler were to be accidentally used to do something it must never do (today, the boot compiler must never be used to compile C files, but we nearly did that by mistake with caml-tex (remembering the conversation starting at #10366 (comment)). I agree it would be very worrying for that knowledge to be used for actual compilation/build decisions. Originally, I added
It doesn't have to be - the main thing is that I needed was a way of getting a consistent |
269ae10 to
3575a4a
Compare
|
@shindere - I've pushed an alternate implementation in 4827373 which doesn't alter the generation in
It's slightly more invasive done that way, but equally the explicit nature of the boot compiler's configuration is also clearer. |
|
Many thanks, David!
Will try to discuss this with @damiendoligez.
|
damiendoligez
left a comment
There was a problem hiding this comment.
A few things look wrong to me. I'll also comment on the in_boot_compiler thing.
utils/config.fixed.ml
Outdated
| let standard_library_default = "<boot compiler>" | ||
| let in_boot_compiler = true | ||
| let ccomp_type = "<boot compiler>" | ||
| let c_compiler = "<boot compiler>" |
There was a problem hiding this comment.
| let c_compiler = "<boot compiler>" | |
| let c_compiler = "/ The bootstrap compiler shall not call the C compiler." |
There was a problem hiding this comment.
Is this just to improve the display in ocamlc -config?
There was a problem hiding this comment.
(oh, seen comments in the wrong order - I follow now)
tools/cmpbyt.ml
Outdated
|
|
||
| let skip_section name = | ||
| name = "DBUG" | ||
| name = "DBUG" || name = "CRCS" |
There was a problem hiding this comment.
This looks like a mistake. If your aim is perfectly reproducible builds, then certainly we should replace cmpbyt with plain old cmp. Also, stripdebug shouldn't be used to do two unrelated things (stripping debug info and moving/copying files around).
It's a code smell (admittedly not introduced by this PR) that cmpbyt and stripdebug have to be kept in sync when only one of them is really needed.
There was a problem hiding this comment.
It's not a mistake - at the point of the comparison the image that's just been built has both DBUG and CRCS sections but the image in boot/ will have neither.
I can refactor the build to use "stripdebug" (possibly with a rename) to the image just built and remove skip_section here - cmpbyt, already existing, has the benefit of displaying slightly more information than cmp when the file actually differs.
There was a problem hiding this comment.
Also,
stripdebugshouldn't be used to do two unrelated things (stripping debug info and moving/copying files around).
That's what the comment in tools/Makefile precisely says it's for (although I quickly find #340 (comment) from the when it was added)
You're laying traps for the future, and you're peppering the compiler with patches that are only relevant to misuses of the bootstrap compiler. I don't think this is a good trade-off. IMO it would be better to do away with
I've given an example above with |
|
Aboout cmpbyt in general, and independently of this PR, what about
having a command-line option to specify sections to ignore?
|
|
Thanks for the review, @damiendoligez! I have reverted the peppering ("season to taste") - it was not for users, but for core devs... this error in CI (which with the sentinel is less clear/explicit than this error: |
|
12127c6 removes the special cases from The version I've done so far removes the logical duplication between |
|
OK, last bit for this evening - this PR now makes no changes to cmpbyt. AppVeyor is revealing that I should have thought harder before trying to trick the compiler into compiling |
|
Will you please write a note here when you are happy with the code,
David?
|
42b29de to
019a362
Compare
|
The version just pushed should pass CI - I tidied up the recent commit history (the switch away from altering |
My suggestion was to set |
|
Ah, that makes sense! I've updated the extensions with that and tidied the history so that there's only a single bootstrap. I'd quite like to keep the interim commits rather than squashing (as they encode different, but working, approaches), if that's OK. The discussion in #11182 and the non-trivial bootstraps that entails provide a lovely testcase for automated repeatability, which I'd quite like to get on and add, therefore! I've pushed a Changes entry - is there a consensus to merge this PR? Incidentally, I've described this as "repeatable" in the title, but that's actually the final objective - this PR of course makes the bootstrap reproducible - repeatability comes in the next PR where we test that each bootstrap artefact can be reproduced from the previous boot artefact. |
|
Will review next week, unless it got merged at that time.
I guess the PR is best reviewed globally, rather than commit by commit?
|
damiendoligez
left a comment
There was a problem hiding this comment.
Looks good to me modulo one small change.
Makefile
Outdated
| .PHONY: partialclean | ||
| partialclean:: | ||
| rm -f utils/config.ml utils/domainstate.ml utils/domainstate.mli | ||
| rm -f utils/config.ml utils/config_boot.ml* utils/config_main.ml* \ |
There was a problem hiding this comment.
We don't use the ml* wildcard anywhere else in the makefiles, probably for good reason (it looks dangerous to me at least).
| rm -f utils/config.ml utils/config_boot.ml* utils/config_main.ml* \ | |
| rm -f utils/config.ml \ | |
| utils/config_boot.ml utils/config_boot.mli \ | |
| utils/config_main.ml utils/config_main.mli \ |
There was a problem hiding this comment.
That’s a very good point, thanks!
|
I must say, this PR makes me a bit nervous in that it makes the build of
the compiler a bit more complex than it already is, it feels. In
particular, as mentionned earlier, the idea would be to get rid of all
the `SUBST` machinery and have `utils/config.ml` produced by `configure`
rather than by make.
Is it really necessary to make the ubild more complex with regard to the
expected result? Is it worth it?
Thinking out loud: would it make sense to have a special configure mode
to enable building the bootstrap compiler?
|
This PR now no longer touches that at all - a future PR can start generating
I will be heretically honest - I don’t care about reproducible builds that much, but I do care about the second class contributor process we have where we have to have a core dev repeat bootstraps and manually merge them, because we can’t trust the process as it is now. So I think that the small bit of complexity to build
Bootstrapping is a development activity, though - you make changes on an already coldstart’d tree, bring the new system to stability and then use the coldstart’d compiler and the new one to do the bootstrap - reconfiguring to do that seems strange? An alternate dance could be to have a way in ocamlc to link |
011c429 to
2675e33
Compare
|
Do you attempt to cleanup the history even more, or to leave it that
way? Just asking.
In .gitignore, the line `/utils/config_*.ml*` has been added twice.
Also, sorry if you already explained, but, why do
`--disable-flat-float-array` and ` --disable-force-safe-string` prevent
from bootstrapping? Is it because that cwould introduce changes in the
format of `.cmi` files and make the artifacts of the two compilers
incompatible? If that's the case, isn't it true that any
suchmodification which would create incompatibilities will also have to
be tested?
David Allsopp (2022/04/21 05:03 -0700):
@dra27 commented on this pull request.
> @@ -115,7 +124,8 @@ configure: configure.ac aclocal.m4 build-aux/ocaml_version.m4 tools/autogen
.PHONY: partialclean
partialclean::
- rm -f utils/config.ml utils/domainstate.ml utils/domainstate.mli
+ rm -f utils/config.ml utils/config_boot.ml* utils/config_main.ml* \
That’s a very good point, thanks!
Could the definition of `IN_COREBOOT_CYCLE` be moved to
`Makefile.build_config.in` ?
Also, I think we might already have discussed this in other places, but
while reviewing hits PR I am wondering again. Why do we insist onmaking
our Makefile rules depend on the configuration files and sometimes on
the Makefiles themselves? To me this looks overkill and, I have to say,
slightly insane.
In `utils/config.common.ml`:
As this is a new file, I think its header should refer to the persoon
who created it, with proper affiliation (Tarides, I guess).
"Portions of the Config module common to the both the boot and main":
spurious "the".
Same remark on the header of `utils/config.fixed.ml`.
|
2675e33 to
21c3137
Compare
Don't permit a coreboot cycle if either force-safe-string or flat-float-array are disabled.
If Config.in_boot_compiler is true then all sections of ocamlc which would call the C compiler call Misc.fatal, since it assumes the configuration of boot/ocamlc.
Use the same configuration for boot/ocamlc regardless of the machine on which the bootstrap is performed. Now set Config.in_boot_compiler to true for boot/ocamlc.
The CRCs of interfaces are affected by whether the source files are LF or CRLF formatted. Since the boot images are not toplevels, perform a sledge-hammer expunge and remove the CRCS section in addition to the DBUG section when bootstrapping.
This reverts commit c9ad0da.
21c3137 to
db2006c
Compare
Use the same configuration for boot/ocamlc regardless of the machine on which the bootstrap is performed. Now set Config.in_boot_compiler to true for boot/ocamlc.
db2006c to
702f45a
Compare
The history's as I'd like it, yes :)
Oops - I've deleted the duplicate on the last rebase.
I removed the safe-string part - there was nothing wrong with it, but it was left-over from when this was based on pre-5.x trunk. If you've configured with
It's not part of the build configuration, though, surely
The processing of those files directly depends on the recipe in
There's nothing of mine in |
|
Many thanks for your explanations, @dra27!
I think I'd prefer a dedicated module rather than embedding even further
on the SUBST road, which I hope we can leave one day to rely entirely on
configure, rather than having two places where substitutions are done,
one of which is not that obvious to read.
I think it would also be nice to hear @damiendoligez's opinion on this
PR.
All in all, if the in_boot_compiler machinery is "just" to get more
readable messages, I wonder whether it's really worth the introduced
complexity, no matter how it is dealt with.
|
This PR makes a series of changes to the build for the ultimate goal of having
make bootstrapproduce the exact same images inboot/ocamlcandboot/ocamllexregardless of OS and architecture.This can be seen working in precheck#675. Each worker has successfully bootstrapped (search for "fixpoint reached" in the logs) and then executes
git statusand "nothing to commit, working tree clean" can be seen in the log.With this PR, it becomes possible to adapt the GitHub Actions tests to test bootstrap commits automatically (we already test all commits touching
configure.ac- this is the same idea) and so remove the need for core developers to repeat bootstraps by external contributors. Of course, we should continue to check during review that a bootstrap is actually needed.The basic changes required are:
--disable-flat-float-array.Config.in_boot_compilerto ensure thatboot/ocamlcgives a proper error if any attempt is made to utilise shared libraries or to compile C files (the build, naturally, already does not do this).utils/config.mlis generated with fixed values for the entire configuration.This gets us virtually the entire way. The last two changes are optional, but increase the portability of this.
CRCSsection as they don't do any dynamic code loading. So we save a small amount of space in the boot images by stripping theCRCSsection alongside theDBUGsection and, with that change,make bootstrapworks on Windows, regardless of source file formatting.Marshal. Big-endian architectures marshal floats in big-endian format (ARM's hybrid format is always marshalled little-endian). If we add a flag to force writing little-endian doubles, then s390x and ppc64 can also do the bootstrap in a repeatable way. In enabling this in the bytecode compiler, I've turned it on unconditionally, inasmuch as trying to write identical bytecode on all systems seems a worthy objective.The last part almost certainly wants splitting off to another PR, but I'm curious for feedback at this stage - especially before embarking on any changes to CI.