Conversation
|
It seems like the issue here is that this data structure sometimes also escapes to globals via the |
src/domains/queries.ml
Outdated
| | MayBeLess: exp * exp -> MayBool.t t (* may exp1 < exp2 ? *) | ||
| | HeapVar: VI.t t | ||
| | IsHeapVar: varinfo -> MayBool.t t (* TODO: is may or must? *) | ||
| | MayBeMultiple: varinfo -> MayBool.t t |
There was a problem hiding this comment.
I find this query quite confusing. I see the only use for this right now is to allow vareq to ask base about its WeakUpdates set.
The issue of a single varinfo representing multiple concrete variables is highly specific to base (and I guess vareq) because they pass reachable variables into called functions, allowing for this situation to arise. However, this isn't fully generic in the sense that every analysis should be considering MayBeMultiple because naively one might also expect octApron to consider this information to make weak updates. But it doesn't have to (AFAIK) because it never passes local state into called functions in the same way as base does, so the confusion doesn't arise. Also, it somehow assumes that vareq passes reachable locals into called functions in exactly the same way as base, but is that really the case? So it seems to me like this notion is specific to each analysis wrt how much it passes down.
Moreover, if it were fully generic, I'd expect it to be possible to fix #316 using the same query by having mallocWrapper claim all of the varinfos that it has returned to represent multiple concrete values (until we implement more precise uniqueness for mallocs). But again, this query wouldn't really work for that: mallocWrapper may answer the query for a malloced varinfo with true, but that wouldn't be in base's WeakUpdates set, so base answers with false and the meet of MayBool yields false. So for that kind of general use, it should be MustBool instead for a negated query like MustBeUnique, so if it's either in base's WeakUpdates or created by mallocWrapper, it wouldn't be unique any more.
But that doesn't really solve the first problem, that what is considered to be non-unique by one analysis isn't necessary non-unique for another.
There was a problem hiding this comment.
is highly specific to base (and I guess vareq) because they pass reachable variables into called functions, allowing for this situation to arise
I don't really think it is. As soon as the base analysis treats something as Multiple all other analyses must also do so or not answer EvalInt queries. This is because the points-to information that is used from base does not distinguish between inner and outer instances
because naively one might also expect octApron to consider this information to make weak updates
EDIT: In principle it would have to, but it does not track variables that have their address taken, escaping this conundrum.
If it ever were enhanced to also track such variables:
// PARAM: --sets ana.activated[+] "var_eq" --sets ana.activated[+] "octApron"
int rec(int i,int* ptr) {
int top;
int x = 17;
if(i == 0) {
rec(5,&x);
// Recursive call may have modified x
assert(x == 17); //UNKNOWN!
} else {
x = 31;
// ptr points to the outer x, it is unaffected by this assignment
// and should be 17
assert(*ptr == 31); //UNKNOWN!
}
return 0;
}
int main() {
int t;
rec(0,&t);
return 0;
}The question is just whether to fix all analyzes in this PR or make separate ones. The issue is that the octApron has x = 31 after the assignment, the address domain in base has ptr -> {&x} and asks for the value of x via the query system, to which octApron answers [31,31].
There was a problem hiding this comment.
Actually, after recompiling 🙃, it does seem to work with octApron, as it does not have x as a tracked local variable, because it does not track variables who have their address taken.
So the reason one does not need to take it into account in OctApron is precisely that.
There was a problem hiding this comment.
Something else must be wrong with octApron in this case because x has its address taken in rec(5,&x), so octApron shouldn't be tracking x at all.
There was a problem hiding this comment.
I think your comment overlapped with mine, I have since edited it. It works precisely because it does not track variables that have their address taken.
There was a problem hiding this comment.
There should probably be a comment in octApron then, explaining that due to vaddrof it doesn't need to consider it at function calls etc.
I'll add a comment
Still, I think it would make sense to flip the query to make it possible for mallocWrapper to also indicate the alloc variables as multiple/non-unique.
Sure, I'll flip it!
Actually, would it even be possible to pull this WeakUpdates stuff into a separate analysis instead of baking it into base, if other analyses should be considering it regardless of what base is doing? It
That is very complicated, it requires the reachable may-point-to sets (from the params) at function call. And if we query the base analysis for that, it is not too different from just embedding it.
There was a problem hiding this comment.
I was thinking about this query again and how it would behave if mallocWrapper also answered it such that any varinfo that it created is not unique, but somehow it still doesn't make sense the way I thought it would.
MayBeMultiple (before)
| Variable | Base | MallocWrapper | Meet (and, MayBool) |
|---|---|---|---|
| Escaped | true (in WeakUpdates) |
true (not created by mallocWrapper, top of MayBool) |
true (:heavy_check_mark:) |
| Malloced | false (not in WeakUpdates) |
true (created by mallocWrapper) | false (:x:) |
MustBeUnique (now)
| Variable | Base | MallocWrapper | Meet (or, MustBool) |
|---|---|---|---|
| Escaped | false (in WeakUpdates negated) |
false (not created by mallocWrapper, top of MustBool) |
false (:heavy_check_mark:) |
| Malloced | true (not in WeakUpdates negated) |
false (created by mallocWrapper) | true (:x:) |
MustBeMultiple (should be?)
| Variable | Base | MallocWrapper | Meet (or, MustBool) |
|---|---|---|---|
| Escaped | true (in WeakUpdates) |
false (not created by mallocWrapper, top of MustBool) |
true (:heavy_check_mark:) |
| Malloced | false (not in WeakUpdates) |
true (created by mallocWrapper) | true (:heavy_check_mark:) |
It seems to me now like the query should be MustBeMultiple because otherwise one of the varinfo multiplicity/uniqueness analyses needs to somehow subsume the other: base's WeakUpdates should also automagically contain all malloced variables?
There was a problem hiding this comment.
Interesting!
MustBeMultiple is also somehow not the right thing though. If something is in this WeakUpdates set, this just means that an outer copy may be reachable not that it must be.
So the name MustBeMultiple would be very confusing, since we need to weakly update all things that may be multiple, not just those that must be.
Regardless of this, I agree that MustBool is the correct lattice. I'll see how to make this change while not making it confusing. (Maybe some comment?)
There was a problem hiding this comment.
Maybe calling it IsMultiple is the right thing to do? It doesn't demand anything about must-multiplicity, but rather just asks each analysis whether it has made the varinfo multiple. And then MustBool is just the right lattice to correctly combine the answers of different analyses.
It muddies the water though because at some point we wanted all the queries to clearly be one or the other by name.
There was a problem hiding this comment.
I went with the solution from the last comment now!
|
Funny how GitHub claimed that there's a conflict with |
|
Yup, it was accidental. Feel free to revert or I'll do it when I'm back at
the office next week
…On Mon, Sep 6, 2021, 1:01 PM Simmo Saan ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In src/domains/queries.ml
<#310 (comment)>:
> @@ -127,7 +127,7 @@ struct
| MustBeProtectedBy _ -> (module MustBool)
| MustBeAtomic -> (module MustBool)
| MustBeSingleThreaded -> (module MustBool)
- | MustBeUniqueThread -> (module MustBool)
+ | IsMultipleThread -> (module MustBool) (* see #310 (comment) on why this needs to be MustBool *)
Besides renaming MustBeUnique -> IsMultiple you also renamed
MustBeUniqueThread -> IsMultipleThread, which I think has nothing to do
with this issue or discussion. Was renaming this query accidental?
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#310 (review)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ADJME3KLHE4P5LBBTYNN6VLUASNPJANCNFSM5BK6RSFQ>
.
Triage notifications on the go with GitHub Mobile for iOS
<https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675>
or Android
<https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub>.
|
The issue we discussed in #252 is that pointers to the locals of some invocation of the same function further down the callstack may be passed to a function, which may in turn modify this value.
analyzer/tests/regression/01-cpa/50-escaping-recursion.c
Lines 1 to 42 in 5e05144
In #252, @jerhard and I proposed a very elaborate solution, that would introduce a new variable
x_outerfor such a local variable when it is passed to the function where it is defined. Then, all references to the originalxin the wholeCPAwould be patched to point tox_outer. This would then be undone in the combine.If the situation where an
x_outeris already reachable for a function call to its defining function,x_outerwould be marked as multiple, and weak updates would be performed to it.This PR takes a somewhat more simple approach: Enhance CPA with a set of variables that must be weakly updated, because an outer instance of this variable may also be reachable. This way, one avoids introducing
x_outerand patching address sets, as the variablexnow abstracts all instances of this variable in the whole call stack insofar as they might be reachable from the current invocation.It works as follows:
In
[[enter]]#for a fundec:analyzer/src/analyses/base.ml
Lines 1872 to 1875 in 5e05144
When updating the value for a variable:
analyzer/src/analyses/base.ml
Lines 1070 to 1079 in 5e05144
In
[[return]]#analyzer/src/analyses/base.ml
Line 1772 in 5e05144
In
[[combine]]#This is a bit less precise than the other solution that was proposed in #252, as it assumes that any writes to the local may also be writes via pointers to the outer (reachable) values.
analyzer/tests/regression/01-cpa/50-escaping-recursion.c
Lines 30 to 32 in 5e05144
On the other hand, I think it is simpler as it does not require complicated patching of the address sets.
The code obviously still needs cleanup, but I think we can already discuss if we want that solution, before investing time to make it pretty.
What do you think @sim642 @vogler @vesalvojdani @jerhard ?