Symex: don't phi-merge Java dynamic objects with uninitialised data#2121
Symex: don't phi-merge Java dynamic objects with uninitialised data#2121smowton wants to merge 1 commit intodiffblue:developfrom
Conversation
Previously given code like "A a = null; if(x) a = new A(); // PHI", at the merge point marked PHI symex would merge two versions of the newly allocated dynamic object: the one initialised by the A() constructor, and the one carrying uninitialised data (an unbound symbol with a name like dynamic_object1#0) This means all subsequent accesses to the dynamic object would operate on `x ? dynamic_object1#2 : dynamic_object1#0`, stymying future constant propagation and expression simplification. By observing that in Java having a pointer to an object implies it has been initialised, we can safely throw away the dynamic_object1#0 case and thus improve constant propagation.
|
Is this redundant with #2125? |
|
No, the two problems are distinct. The null-pointer-might-be-accessed situation results in a pessimistic assumption about the pointer ( It is however related to #2125 in that both rely on a safe memory model to prevent me from accessing the memory associated with |
|
To phrase that differently, in the program A a = null;
if(nondet)
a = new A(1);
// PHI
if(a != null)
return a.somefield;There are two things preventing symex from propagating #2125 says that because the access is checked, at the line |
|
@tautschnig possible idea for removing language-specificity in Symex: the Then the test wouldn't be "is this Java," it would be "does the object have |
|
Discussed in person with @kroening: the agreed solution is for symex to ensure some explicit nondet value is written on |
|
Note on that: check that array allocations, in C, C++ and Java, are all subject to this rule (that some nondet init happens if there isn't an explicit zero init) |
|
Replaced by #2168 |
Previously given code like
A a = null; if(x) a = new A(); // PHI, at the merge point marked PHIsymex would merge two versions of the newly allocated dynamic object: the one initialised by the A()
constructor, and the one carrying uninitialised data (an unbound symbol with a name like dynamic_object1#0)
This means all subsequent accesses to the dynamic object would operate on
x ? dynamic_object1#2 : dynamic_object1#0,stymying future constant propagation and expression simplification.
By observing that in Java having a pointer to an object implies it has been initialised, we can safely throw away the dynamic_object1#0 case and thus improve constant propagation.
This needs tests writing, but adding early for comment.