Use flambda compiler optimization #928
Quick note about tests and checking if semantics were compromised with

Hence, I presume that if
Thanks for the PR! There are currently some merge conflicts, so the diff also shows some unrelated things that are actually already on master. Is this just to add the flambda setup, or is there already also evidence that flambda indeed provides some speedup?
IIRC there was a gain of about 1/4 for the SQLite amalgamation when testing locally, but I am sure @mrstanb et al. will have the exact numbers.
> Is this just to add the flambda setup or is there already also evidence that flambda indeed provides some speedup?

There are indeed some performance improvements. My teammates and I will make sure to post a follow-up with some stats.
Indeed, we'll make sure to post the different results that we had as a follow-up.
From my side, the stats are as follows:
Force-pushed 685e112 to bb6fcf1
jerhard left a comment:
Thanks for the PR! It's interesting to see that activating flambda already has quite some impact on performance.
Now, with this additional compiler option, it would be great to extend our GitHub workflows with one that runs all the regression tests nightly with the flambda switch. To add such a workflow, you may add a file in .github/workflows that contains the regression job, which is already present in both the locked.yml (this uses a fixed version of compiler and dependencies) and unlocked.yml files.

There's no need to make a new workflow altogether. It suffices to just add one extra compiler version to the unlocked matrix.
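As a side note for anyone reproducing this locally, a flambda-enabled switch can be set up along these lines (a sketch, not part of this PR; the switch name and compiler version here are arbitrary examples):

```shell
# Sketch: create an opam switch with flambda enabled.
# ocaml-option-flambda is the opam package that turns flambda on;
# the switch name "flambda-test" and version 4.14.1 are illustrative.
opam switch create flambda-test ocaml-variants.4.14.1+options ocaml-option-flambda

# Verify that the compiler in the current switch was built with flambda:
ocamlopt -config | grep flambda
```

On a flambda switch, the last command reports `flambda: true`.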
I updated the
I think just

The lower bounds jobs are completely separate and need not be modified.
TL;DR: After testing out the effects of activating different Flambda flags, we opted for using

We started out by running Goblint with different flags over the SQLite amalgamation. On our private machines, the flag combination

To confirm those results, we then proceeded with benchmarking on the CoolMUC-2 HPC cluster of the LRZ in order to achieve reproducible results. Using this configuration, the following measurements were made (format: hours:minutes:seconds):

The runs on the HPC cluster confirmed our assumption of a noticeable time benefit from using Flambda. However, using just

To ultimately decide which flags to use, we continued our benchmarking with the coreutils programs of the

On our local systems, we noted these results: coreutils_benchmarks.md. The fastest configuration, again contradicting what we had observed on the HPC, was

We then ran some further batch jobs on the LRZ cluster with

Once more, on the HPC, the fastest choice was just
What are the build times on your machines with the flambda compiler with the regular
I've benchmarked

No flambda: AVERAGE = 9,777s
Flambda with -O3: AVERAGE = 26,075s
Thanks! Could you also get the numbers for development build (
So for

No flambda: AVERAGE = 9,175s
Flambda with -O3: AVERAGE = 13,829s
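Putting the two averages side by side, the relative build-time cost of flambda with `-O3` can be read off directly (a quick sketch; it assumes the first pair of numbers is the regular build and the second the development build, as the thread suggests, and the ratio is the same whichever separator convention "9,777s" uses):

```shell
# Ratio of the flambda -O3 build time to the non-flambda build time,
# using the averages reported above (units cancel, so the comma in
# "9,777s" vs "9.777s" does not affect the result).
awk 'BEGIN {
  printf "regular: %.2fx slower\n", 26.075 / 9.777
  printf "dev:     %.2fx slower\n", 13.829 / 9.175
}'
```

So roughly a 2.7x slowdown for the regular build and about 1.5x for the development build.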
I think this sort of slowdown for `make` is acceptable here.
Given these numbers, I would suggest that we default to
Sure, and actually there's probably no need to have
Yes, I would also be in favor of making
…etup with base compiler
In my latest commit, I removed the

Flambda with no options: AVERAGE = 12,200s

Furthermore, I made the
Co-authored-by: Simmo Saan <simmo.saan@gmail.com>
…into use-flambda-compiler
As the title says, we've been working on adding flambda optimization.
For this, some changes have been made in files such as:

- `make.sh`
- `goblint.opam.locked`
- `README.md`

Please note that `flambda` is only supported in `ocamlopt`, hence all the flags passed for compiler optimization are only passed as `ocamlopt_flags`.
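For illustration, the ocamlopt-only restriction looks roughly like this in a dune stanza (a minimal sketch; the executable name below is hypothetical, not the project's actual one):

```dune
; Sketch: flambda-only flags such as -O3 are understood only by a
; flambda-enabled ocamlopt, so they go in ocamlopt_flags rather than
; flags. Bytecode (ocamlc) builds are therefore unaffected.
(executable
 (name main)
 (ocamlopt_flags (:standard -O3)))
```

With this placement, the same dune files still build on a non-flambda switch for bytecode targets, which is why the flags are kept out of the generic `flags` field.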