Skip to content

Stop comparing compiler outputs in the testsuite#12339

Open
lthls wants to merge 1 commit intoocaml:trunkfrom
lthls:stop-useless-bytecode-comparison
Open

Stop comparing compiler outputs in the testsuite#12339
lthls wants to merge 1 commit intoocaml:trunkfrom
lthls:stop-useless-bytecode-comparison

Conversation

@lthls
Copy link
Copy Markdown
Contributor

@lthls lthls commented Jun 29, 2023

While working on #12232 I got bitten by the comparison check failing.
I suspect that it's a harmless difference in memory layouts (usually sharing) between ocamlc.byte and ocamlc.opt, and with no way to actually get useful info out of the test failure, and no simple way to disable this check, I decided to propose this PR, which turns off the binary comparison of executables from the default bytecode rule.

I did check that running dumpobj on the two programs produced the exact same output in my case, so the difference is likely in the rest of the data stored in the executable.

Cc @shindere as it's an ocamltest patch.

@nojb
Copy link
Copy Markdown
Contributor

nojb commented Jun 29, 2023

Just to make sure I understand: are you saying that the pure bytecode executables produced by ocamlc.byte and ocamlc.opt may not be identical? I'm probably missing something, but I can't see how this can happen. Can you say a few more words about it?

@nojb
Copy link
Copy Markdown
Contributor

nojb commented Jun 29, 2023

For reference, we did the same thing for the native-code compiler in #9677.

@lthls
Copy link
Copy Markdown
Contributor Author

lthls commented Jun 29, 2023

If I remember correctly there are parts of the bytecode executable that are generated using Marshal (I believe at least the structured constants are generated this way). So depending on the actual representation in the compiler's memory, the generated output could be different for values that are structurally equal.
Given that there was nothing else that looked suspicious, and the binary diff was very small, I assumed that the issue must have come from something like that.

But if somebody knows how to investigate these errors and is willing to share tips, I could try to find out more precisely what's going on.

@gasche
Copy link
Copy Markdown
Member

gasche commented Jun 30, 2023

I would agree.

Quoting the justification provided by @nojb in the native issue:

The mismatch of binaries produced by ocamlopt.byte and ocamlopt.opt has been a source of failures in ocamltest for a while. These failures are not so easy to investigate and take up valuable developer time. Moreover, the consensus is that it is not really a bug if these binaries differ, as there are multiple reasons for this, [...].

This is all true for bytecode as well.

I find it frustrating, from a reproducible-builds perspective, that our compiler output is fragile in this way, but I don't see an easy solution (see the discussion in #8853 for an example of the difficulties) and I don't think that trying to check a property that we don't actually guarantee in practice is a good use of our workforce.

@shindere
Copy link
Copy Markdown
Contributor

shindere commented Jul 1, 2023 via email

@lthls
Copy link
Copy Markdown
Contributor Author

lthls commented Jul 1, 2023

I intend to continue to work on that and expect to be able to show someting soon, unless it is considered a bad idea.

I think it is a good idea. If you are open to suggestions, I would like the ability to choose which versions of the tools to use through a make variable (or something else that doesn't require re-configuring the compiler). In particular, I sometimes introduce bugs in the native compiler which make the native binaries unusable, and being able to run the testsuite with the bytecode versions of the tools would be very helpful for debugging.

@gasche
Copy link
Copy Markdown
Member

gasche commented Jul 1, 2023

I would go beyond @lthls: I very often have compilers in their default configuration (so in particular, native code is enabled), but where I have only built the bytecode tools. It is important for me that I can test the bytecode tools in this configuration. ocamltest and the testsuite Makefiles make this difficult for me today, and some ways to implement the change that you propose could make things worse.

@gasche
Copy link
Copy Markdown
Member

gasche commented Jul 1, 2023

(The reason to do this is that the bytecode tools are much faster to build, and the only thing I care about when working on the frontend -- parser, type-checker, etc.)

@xavierleroy
Copy link
Copy Markdown
Contributor

For reference, the default sequence of actions for a test is, currently,

  1. Compile to native code using ocamlopt.byt
  2. Run the native-code executable
  3. Re-compile to native code using ocamlopt.opt
  4. Compile to bytecode using ocamlc.byt
  5. Run the bytecode executable
  6. Re-compile to bytecode using ocamlc.opt
  7. Compare the two bytecode executables.

There used to be a step 4 "Compare the two native-code executables" but it was removed as unreliable. This PR proposes to remove step 8 for the same reason.

The "recompiling" steps 3 and 7 are pretty useless if we don't do anything with the produced executables. Initially, ocamltest would run them, but that was removed a long time ago as it was taking too much time.

So, I agree with the idea of compiling only with ocamlopt.byte + ocamlc.byte or only with ocamlopt.opt + ocamlc.opt, provided there's an easy way to choose which compilers are used.

As @lthls mentioned, when there's a bug in the native-code compiler, or when we're working on a new port, ocamlopt.opt and ocamlc.opt are unreliable, and using them to build a test case means that the compiler will crash before the test case can be run, making it impossible to use the test cases to locate the problem. When I did the POWER port of OCaml 5, I ended up compiling and running many tests by hand, without ocamltest, because of this. So, we must have a way to use only the .byte compilers when running the test suite.

@shindere
Copy link
Copy Markdown
Contributor

shindere commented Jul 17, 2023 via email

@shindere
Copy link
Copy Markdown
Contributor

shindere commented Jul 17, 2023 via email

@shindere
Copy link
Copy Markdown
Contributor

shindere commented Jul 17, 2023 via email

@gasche
Copy link
Copy Markdown
Member

gasche commented Jul 17, 2023

I'm curious on why/how you end-up in such situations, @gasche?

I have a local build of trunk which I rebuild incrementally after git pull. I create feature or bugfix branches from it, without cleaning the local artifacts. Sometimes I want to work on a type-checker bugfix, and I want to rebuild just the core target and then re-run the testsuite. Sometimes I want to work on a native-compiler change and rebuild ocamlopt.byte -- depending on the context I would or wouldn't want to build the .opt targets as well in this case.

@xavierleroy
Copy link
Copy Markdown
Contributor

So you'd like to use only ocamlc.byte and not e.g. ocamlopt.byte, even if it's there, right?

No, I'd like to use ocamlc.byte even if ocamlc.opt is there, and likewise for ocamlopt.byte even if ocamlopt.opt is there.

Assume we would have two environment variables:
OCAML_TESST_{BYTECODE,NATIVE}_COMPILER that would disable the
corresponding test if set to false, would that satisfy your needs?

I'm not sure what the "corresponding tests" are. I don't want to disable tests, just be able to run them with the .byte compilers and not the .opt compilers.

@shindere
Copy link
Copy Markdown
Contributor

shindere commented Jul 17, 2023 via email

@shindere
Copy link
Copy Markdown
Contributor

shindere commented Jul 17, 2023 via email

@gasche
Copy link
Copy Markdown
Member

gasche commented Jul 17, 2023

Have you considered using different worktrees?

I could create a new worktree for each feature branch, but that means I would have to do a cold build each time, and manage their garbage-collection manually. I could try to have one worktree per broad configuration (frontend changes only, ocamlopt, runtime, etc.), but that is also more work on my end, more disk space, etc. I already have one worktree per release branch plus trunk, and while I mostly use this partial-build workflow on trunk it occasionally happens on release branches as well, for example when backporting a patch.

I understand that some things have to be fixed at configuration time for various software-engineering reasons; for example, currently we can only choose one native backend at a time, and flambda is a configure-time option. Having to create worktrees for those is inevitable. But ideally it would be possible to run a subset of the testsuite without having to decide it at configure-time.

@lthls
Copy link
Copy Markdown
Contributor Author

lthls commented Jul 17, 2023

I was responding to Vincent's comment, actually.

My use-case is the same as Xavier's: the .opt binaries are unreliable (because I'm working on them), so I want to test ocamlopt.byte but not ocamlc.opt or ocamlopt.opt.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants