Conversation
|
Thank you for the PR, it is nice to see Goblint getting this feature, I think it sets us apart from a lot of the competition. |
|
|
||
| let leq a b = | ||
| let result = leq_with_fct { equal = (fun (type a) k a b -> let module R = (val (fst (M.Key.info k) : a mt)) in R.leq a b) } a b in | ||
| if Messages.tracing && not (M.is_empty a && M.is_empty b) then Messages.tracel "hetmap" "%s\n%a\n%a\n%b\n" "leq" pretty a pretty b result; |
Check warning
Code scanning / Semgrep
trace functions should only be called if tracing is enabled at compile time
|
|
||
| let logable str f a b = | ||
| let result = f a b in | ||
| if Messages.tracing && not (M.is_empty a && M.is_empty b) then Messages.tracel "hetmap" "%s\n%a\n%a\n%a\n" str pretty a pretty b pretty result; |
Check warning
Code scanning / Semgrep
trace functions should only be called if tracing is enabled at compile time
sim642
left a comment
There was a problem hiding this comment.
Thanks for the support of this weird C feature. Allows us to tick another soundiness box: http://soundiness.org/.
| let setjmp_return_token = HM.token (module VD) (VD.zero_init_value (TInt (IInt, []))) "base: return" | ||
| let setjmp_globals_token = HM.token (module D) (D.bot ()) "base: globals" | ||
| let setjmp_volatiles_token = HM.token (module D) (D.bot ()) "base: volatiles" |
There was a problem hiding this comment.
HM.token has hidden internal state, which makes it incompatible with our incremental analysis since lookups from the hmap are based on those internally given IDs, which must match in order for the lookups to find anything at all.
I suppose it could coincidentally work for now since these are the only tokens created in a fixed order, but even in that case a TODO comment would be useful here pointing out the hidden global state, in case we ever encounter issues with it.
| ("longjmp", special [__ "env" []; __ "value" [r]] @@ fun env value -> Longjmp { env; value } ); | ||
| ("_longjmp", special [__ "env" []; __ "value" [r]] @@ fun env value -> Longjmp { env; value } ); |
There was a problem hiding this comment.
The value/status argument of longjmp has type int, which is not a pointer. Therefore the r specification for them is spurious: function call arguments are always read, but the specification is about accesses through pointer arguments.
|
|
||
| let name () = "setjmp" | ||
|
|
||
| module D = Queries.LS (* if the local method has performed a setjmp *) |
There was a problem hiding this comment.
What does this mean? "if the local method has performed a setjmp" describes the local state as a boolean, but it's actually a set of lvalues.
There was a problem hiding this comment.
What is the reason for keeping the set of jumpbuffers in the local state?
| (** explicit join with the old value is required or 0 can be lost from the possible setjmp return values. *) | ||
| let getg = (ctx.global var) in | ||
| let g = (HM.join data getg) in | ||
| if M.tracing then M.tracel "setjmp" "Longjmp: %a\n+ %a\n-> %a\n" HM.pretty data HM.pretty getg HM.pretty g; | ||
| ctx.sideg var g) |
There was a problem hiding this comment.
Side effects are by definition always joined into the existing state of the variable within the solver, so this seems unnecessary. If that isn't happening as intended, maybe there's some problem in the implementation of HM as a lattice.
| (* join the state from all the setjump buffers, env may refer to (generally just one global) *) | ||
| let vars = ctx.ask (Queries.MayPointTo env) in | ||
| if M.tracing then M.traceli "setjmp" "Setjmp vars: [%a]\n" D.pretty vars; | ||
| let data = List.fold HM.join (HM.top ()) (List.map ctx.global (D.elements vars |> List.map fst)) in |
There was a problem hiding this comment.
This fold is a HM.join and the identity element for join is HM.bot, so there's something off here. Joining with top (as it happens here) always should yield top, which would make this entire fold useless.
| @@ -0,0 +1,30 @@ | |||
| // PARAM: --enable ana.int.interval --enable ana.int.enums --set "ana.activated[+]" setjmp --set solvers.td3.side_widen never | |||
There was a problem hiding this comment.
Many of the added tests disable side effect widening completely, is it really necessary for them to pass and this feature to work? Not widening side effects poses a big problem for termination of the solver.
| @@ -0,0 +1,17 @@ | |||
| // PARAM: --enable ana.int.interval --enable ana.int.enums --set "ana.activated[+]" setjmp --enable exp.earlyglobs | |||
| @@ -0,0 +1,25 @@ | |||
| // PARAM: --enable ana.int.interval --enable ana.int.enums --set "ana.activated[+]" setjmp --set solvers.td3.side_widen never --disable exp.volatiles_are_top | |||
There was a problem hiding this comment.
There is another test named "counting-local" as 57/04, so it would be good to change the name of this test based on how it differs from the other one.
| @@ -0,0 +1,17 @@ | |||
| // PARAM: --enable ana.int.interval --enable ana.int.enums --set "ana.activated[+]" setjmp --set solvers.td3.side_widen never | |||
There was a problem hiding this comment.
Same here, there's another test with the exact same name.
| | Q.Longjmp exp -> | ||
| HM.singleton setjmp_return_token (eval_rv (ask_of_ctx ctx) ctx.global ctx.local exp) | ||
| |> HM.add setjmp_globals_token { ctx.local with cpa = CPA.filter (fun k v -> Basetype.Variables.is_global k) ctx.local.cpa } | ||
| | Q.VolatileLocals -> | ||
| HM.singleton setjmp_volatiles_token { ctx.local with cpa = CPA.filter (fun k v -> (not (Basetype.Variables.is_global k)) && Ciltools.is_volatile_tp k.vtype) ctx.local.cpa } |
There was a problem hiding this comment.
Here Basetype.Variables.is_global is used, which is a purely syntactic check. However, the local state of base analysis can also contain other things which are not syntactically global, but are effectively handled as globals, e.g. alloc variables and escaped variables. Shouldn't some of them also be passed around via these jumps?
Is it just a question of benefit or is it rather a question of soundness? For example, doesn't the apron analysis also need to handle this and do this extra join to be sound? Probably a harder question, but what happens with locksets across long jumps (if the combination of threads and long jumps is defined at all)? Because if locksets persist over long jumps, then the mutex analysis would also need such extra join for soundness.
The use of hmap is certainly an interesting approach to allow your analysis to manipulate states of other analyses. Likely not worth redoing at this point, but another technique that's used in Goblint is to use a |
This is indeed a soundness question. With the implementation as an analysis, one would have to add some default join action to all the analyses. Maybe this could be reduced to a one-line for all the analyses by including a module/functor with some default implementation? Otherwise, one would like to come up with a solution where changing the existing analyses would not be necessary. One possible solution would be to change the implementation to be a functor taking a You can find an example implementation for such a functor e.g. with |
| | _ -> ctx.local | ||
|
|
||
| let enter ctx lval fn args = | ||
| collect_volatiles ctx; |
There was a problem hiding this comment.
Why do the volatiles have to be collected on every enter?
That requires rethinking how the data is passed though because events are not available for those functors. So it might require adding one or two additional transfer functions to |
|
Thank you for your PR, it is has been very helpful to help us explore the design space and highlighted the intricacies of such an analysis. We have decided against merging this for now and will instead try to pursue a more generic approach. |
Co-authored-by: Benjamin Bott<bottbenj@users.noreply.github.com>
Adds an analysis to handle the
setjmpandlongjmpfunctions.The setjmp analysis collects local state of the other analyses with the
LongjmpandVolatileLocalsquery and then returns it when required with theSetjmpevent. Due to the path insensitive nature of analysis globals it is not necessary to touch them.The setjump analysis is able to collect arbitrary data from all analyses by using a heterogeneous map, but this also means it is not able to affect any specific state and can only perform domain operations on the whole map. This map domain is based on the hmap library, but included directly as it is pretty abandoned and it needed to add various functions to be able to implement a domain based on it.
This is currently only used by base, and i am not exactly sure which other analyses also benefit from support for it.