-
Notifications
You must be signed in to change notification settings - Fork 1.2k
Time/memory regression in Frama-C with OCaml 5.0 #11733
Description
This is a follow-up to #11662.
Patch #11673 indeed fixes the most critical memory issue, but when running several tests in parallel, we are still having some issues with high memory usage. These are not critical for OCaml 5.0, but we hope they might be fixed for 5.1.
When running our set of analyses, we observed the following differences between OCaml 4.14 and OCaml 5.0.0+trunk (ocaml --version gives 5.0.0+dev8-2022-10-12; note that we were unable to test with 5.1.0+trunk since the ppxlib opam package is currently incompatible with it):
Overall, the startup time for Frama-C is about 1.2s longer, which, for short benchmarks, can increase execution by 300% or 400%. This is mentioned in #11662. Even when only a subset of dynamic modules are loaded, there's still about 100%-200% increase in execution time. For large benchmarks such as SATE 6, which consist in >40k runs of Frama-C with a minimal set of plug-ins, this can double or triple total execution time. It was not entirely clear to us if the regression in time complexity mentioned in #11662 accounts for this difference. Otherwise, we'd like to help investigate further if there are other possible causes.
The main issue I'd like to mention here, however, is memory consumption: as seen in the table above, large case studies still show a significant increase in memory usage, leading to some heavy swapping (before #11662, swapping happened almost instantly; now it takes several minutes, but still happens).
I tried to obtain a minimal reproducible example that was very robust, but I was unable to, since memory usage is not entirely deterministic, at least in OCaml 5. In OCaml <= 4.14, we obtain a consistent value for maximum memory usage across different runs, but in OCaml 5, sometimes simply changing whitespace in the source file, or re-running Frama-C, can result in as much as 3x memory usage.
The file attached below (which is a C file, but Github does not recognize it, so I renamed it to .txt) is currently the best minimal example I could come up with. It is based on debie1, which had a 934% increase in memory usage in the table.
The command-line I used with it is the following:
/usr/bin/time -f 'user_time=%U\nmemory=%M' frama-c a-v5.c -eva -eva-msg-key=-final-states,-initial-state -no-autoload-plugins -load-module inout,eva,scope
In this "reduced" version, I get less than 200 MB of memory usage in OCaml <= 4.14, but > 1 GB in OCaml 5.
There is a "loop unroll" annotation in the code (line 2603) which can be modified to increase the memory usage difference between versions of the compiler. If we set it to 100, for instance, the OCaml 4 version uses only 100 MB, while the OCaml 5 version uses 150 MB. If I set it to 30000, the OCaml 4 version uses 392 MB while the OCaml 5 version uses 3.7 GB. Thus, this annotation could possibly help as a parameter during debugging.
My colleagues pointed out that there are possibly multiple issues; for instance, we know that analyses with -eva make heavy use of caches based on hashconsed values. These have been optimized over the years to perform as efficiently as possible, and it is likely that some of the heuristics used up to OCaml 4.14 may requires changes. There are some Frama-C options, such as -deterministic, which influence the behavior of such values. Playing with them might provide some insight.
In any case, please tell me if there are other tests that I could perform, or different setups to help understand the causes of the regression.
