Skip to content

Problematic handling of calloc #135

@michael-schwarz

Description

@michael-schwarz

While reviewing #132, I realized our current handling of calloc (heap-allocation with zeroing out memory) is problematic:

  • For calloc we currently create an Array(Blob(\bot, size),1), whereas for malloc we create just a Blob(\bot,size). This probably stems from the fact that calloc takes void *calloc(size_t nitems, size_t size).
    If we want to stick to creating Array(Blob(\bot,_),_) it should probably at least be Array(Blob(\bot,size),nitems) instead of Array(Blob(\bot,size*nitems),1).

  • The more problematic thing is that we are potentially unsound:

#include<stdlib.h>
#include<assert.h>
int main(void) {
    int *r = calloc(2,sizeof(int));
    r[0] = 3;
    // We compute (alloc@4) → `Array(`Blob(3,size:8),length:1)

    int z = r[1]; // Due to to a bug we read supertop here

    assert(z == 3);  // If we fix the bug above we would claim that this succeeds, 
                     //  but in the concrete it fails as z is 0 here
}

This is hidden by a bug that causes us to always read supertop from calloced blobs because we get confused with offsets.

To fix this, probably the Blob domain would have to distinguish if something is the result of a calloc call (as opposed to malloc).

  • If it is, when writing into the blob:
    If the value is ⊥ and we are writing a into it (3 in this case), we should make a zero-initialized version of the same type as a (0 in this case) and join a with it (3 join 0 in this case). Then when we read, we also cover the possibility of reading somewhere where we have not written before. (If nobody has written before, we read bot which we then turn into top in eval_rv)
    Ideally, we would have it so the abstract values for the blob would be initialized to zero, but we don't know at allocation which type we will write into the blob, and if we guess wrong there, any other write would immediately yield supertop.

  • For memory that has been malloced:
    We can keep doing what we are currently doing, reading memory that has not been initialized is undefined behavior, so we can only read things someone has explicitly written.

Metadata

Metadata

Assignees

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions