Skip to content

Conversation

@csegarragonz
Copy link
Contributor

@csegarragonz csegarragonz commented Mar 23, 2023

This PR is essentially explained in the code's comment:

WAMR performs some optimisations on the WASM modules at load time, whereby if no memory.grow op-code is found, memory is shrunk to the minimum (as specified by the default heap and stack size). This optimisation breaks Faasm's memory management, as we may sometimes want to enlarge the WASM's memory from outside WASM code. To prevent this optimisation, every time we build a WASM function for its usage in faasm (and include the faasm/faasm.h header) we link with this dummy function that uses the memory.grow op-code.

Note that this function is actually not called. See:
bytecodealliance/wasm-micro-runtime#1706

I want to think there's a cleaner hack with some linking magic that doesn't require us to include the faasm/faasm.h header, but I could not come up with it :/

UPDATE: after re-basing to the latest wasi-libc, running big workloads like LAMMS or DGEMM that really stress the memory usage I realised that we had a memory leak in our memory allocation. The problem surfaced as a warning in faasm when trying to munmap an address that was not the current brk offset.

The problem comes from the fact that we build dlmalloc in wasi-libc with HAVE_MMAP and HAVE_MORECORE. This means that dlmalloc will use both mmap/munmap and sbrk to ask for more memory to the system. However, in faasm, we only keep track of allocations/freeing with an sbrk-style offset. This means that before we relied on mmap and munmap behaving just like sbrk. Turns out that this is not the case, meaning that munmap might be called before sbrk-ed memory on top of it is freed.

In this commit I disable the HAVE_MMAP flag in dlmalloc. This does not mean that we don't support mmap/munmap calls, it just means that we only modify the heap with __sbrk.

This fixes the memory leaks I was seeing.

Also, I am happy to move this to a separate PR, but it would require releasing another sysroot version, and for the moment I am retrofitting it in this tag (alas, submission time closes up again).

@csegarragonz csegarragonz requested a review from Shillaker March 31, 2023 08:09
"-Xlinker --stack-first",
"-Xlinker --export=__heap_base",
"-Xlinker --export=__data_end",
"-Xlinker --export=free",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WAMR has the option to allocate memory inside the WASM module from the runtime, by calling the exported malloc and free symbols in wasi-libc's dlmalloc.

Hence why we need to export these two.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In what scenario would WAMR call malloc and free in the wasm module? I can't think of a scenario where we need to allocate stuff from the runtime; all calls should pass in a pointer and pre-allocated buffer for the runtime to populate if they need some output. That said, it's been a while, and perhaps there's places we need to pass back a string which requires allocating memory.

All other memory manipulation that I can recall should happen via our implementation of sbrk or mmap (depending on the dlmalloc settings), and these will be called only by malloc and free when they need to grow the underlying wasm memory.

Copy link
Contributor Author

@csegarragonz csegarragonz Mar 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An example is when dealing with double pointers. In MPI, e.g. MPI_Cart_create, the WASM code calls a native function with a double-pointer as a parameter, for the called function to populate.

Let's say that WASM code calls a native function with a variable of type A** as a parameter. In MPI the memory for A is actually not allocated (i.e. the variable is just declared as A** foo;). When passed to the runtime, A** will just be a WASM offset. In order to populate the contents of A we need to:

  1. Allocate WASM memory for A
  2. Get a WASM offset for the newly allocated memory.
  3. Write the new WASM offset to A*.

The way we solve this in WAVM is, in 1, by adding a new memory page, and returning a pointer to the old __sbrk. This feels like a very blunt instrument, as we are calling the memory.grow opcode when we just need a few bytes of memory.

Alternatively, WAMR offers the option to call the WASM's malloc as defined in dlmalloc. Note that this way we are also allocating memory on the heap, but we are not unnecessarily adding new pages. If the malloc in dlmalloc can't find enough bytes, it will call __sbrk which will then call out to our runtime, and ask for a new page.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see, nice. We could probably have done the same with WAVM, i.e. export the malloc and free functions from each module, then called that rather than do all the manual memory manipulation.

Anywho, thanks for the explanation 👍

"-Xlinker --max-memory={}".format(FAASM_WASM_MAX_MEMORY),
"-Xlinker --features=mutable-globals,simd128",
"-Wl,-z,stack-size={} -Wl".format(FAASM_WASM_STACK_SIZE),
"-Wl,--initial-memory={}".format(FAASM_WASM_INITIAL_MEMORY_SIZE),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The WASM linker already sets the initial memory to the minimum value, so I don't think we need to turn this knob.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This reminds me of some other changes we made when building WAMR, can you have a look at the args we pass in the WAMR CMakeLists.txt and make sure they're still relevant? https://github.com/faasm/faasm/blob/main/src/wamr/CMakeLists.txt#L35

@csegarragonz csegarragonz mentioned this pull request Mar 31, 2023
2 tasks
@csegarragonz
Copy link
Contributor Author

csegarragonz commented Apr 3, 2023

@Shillaker when you check this one again, please read the updated description.

I am happy to split the changes in two separate PRs, so far they are just together so that I can use one toolchain tag for the experiments.

@Shillaker
Copy link
Collaborator

LGTM 👍

Re. update, I think we have gone through all possible different permutations of HAVE_MORECORE and HAVE_MMAP during the project's life, and I agree with sticking with just sbrk as it's more simple to reason about.

@csegarragonz csegarragonz merged commit 042475a into main Apr 3, 2023
@csegarragonz csegarragonz deleted the protect-memory-layout branch April 3, 2023 15:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants