Skip to content

Conversation

@devnexen
Copy link
Collaborator

@devnexen devnexen commented Feb 8, 2021

The actual code works fine on the usual Mac Intel however, some failures
occurs with some unit tests on the ARM h/w when zero'ing page range.
While at it, using MADV_FREE_REUSABLE which is Darwin's specific but helps
command line tools to report better memory accounting.

Copy link
Collaborator

@davidchisnall davidchisnall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this, a couple of comments inline.

template<bool page_aligned = false>
static void zero(void* p, size_t size) noexcept
{
::bzero(p, size);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is required only on AArch64 (is it only AArch64 macOS, or iOS derivatives as well?) then can we guard it on a constexpr if check that the AAL is for AArch64, not x86?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fair enough. yes Mac M1 is "only" a super iOs device anyway.

static void notify_not_using(void* p, size_t size) noexcept
{
SNMALLOC_ASSERT(is_aligned_block<PALBSD::page_size>(p, size));
while (madvise(p, size, MADV_FREE_REUSABLE) == -1 && errno == EAGAIN)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe this is correct. MADV_FREE_REUSABLE is completely undocumented (and therefore Apple reserves the right to remove it or modify its semantics in minor version updates) but the comments in Apple's open-source libmalloc indicate that you must madvise with MADV_FREE_REUSE before reuse. We would need a notify_using implementation that does this. I don't know what the performance differences are between MADV_FREE and the pair of MADV_FREE_REUSEABLE + MADV_FREE_REUSE, if it's significant then we might want to provide both options to allow users to tune for lower memory or higher performance.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes this doesd even appear on the man page. however it s not related at all to performance (it s mostly the same) but memory accounting "faithfulness" but fair point I can give up this part.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I can tell, it's not just about accounting, it has different semantics. The traditional MADV_FREE has to track the dirty bit to tell whether you've used the page and so can require an IPI and TLB shoot-down to make the page read-only in the map before it's actually removed. This version can reclaim the page immediately but requires a system call (which explicitly synchronises any thread that might be about to use the page) before it can be reused.

It looks like it's a good fit for snmalloc, but we do need the corresponding notify_using function in the PAL.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be better in another PR as here it s mainly to fix some unit tests crashing whereas this is improvement.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, thanks!

The actual code works fine on the usual mac Intel however, some failures
occurs with some unit tests on the ARM h/w when zero'ing page ranges.
@davidchisnall davidchisnall merged commit 0a86848 into microsoft:master Feb 9, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants