purge jemalloc after flush, and enable background purging thread #6145
Conversation
Background threads may run for a long time, especially when the number of dirty pages is high. Avoid blocking stats calls because of this (which may cause latency spikes). See jemalloc/jemalloc#1502. Cherry picked from commit 1a71533511027dbe3f9d989659efeec446915d6b
jemalloc 5 doesn't immediately release memory back to the OS; instead there's a decaying mechanism, which doesn't work when there's no traffic (no allocations). This is most evident when there's no traffic after flushdb: the RSS will remain high. 1) enable jemalloc background purging 2) explicitly purge in flushdb
It seems that since I added the creation of the jemalloc thread, Redis sometimes fails to start with the following error: Inconsistency detected by ld.so: dl-tls.c: 493: _dl_allocate_tls_init: Assertion `listp->slotinfo[cnt].gen <= GL(dl_tls_generation)' failed! This seems to be due to a race bug in ld.so, in which TLS creation on the thread collides with dlopen. Move the creation of the BIO and jemalloc threads to after modules are loaded, plus a small bugfix when trying to disable the jemalloc thread at runtime.
@oranagra thanks Oran, the PR no longer applies but it looks good to me. I wanted to ask a few things.
I also wonder if it could be safer to have the default configuration tuned to get fewer stats from the allocators and provide less memory information to the user; otherwise we may always be a bit at risk of introducing unexpected latencies. Thanks.
@antirez the fix in jemalloc is not mine; I reported a problem to them, and they fixed it (already released in jemalloc 5.2.1). I just cherry picked this fix into our repo (the commit comment has all the details). The slowdown it caused to serverCron was when calling … I don't think it's possible to get fewer stats, the … You can argue that this background thread is largely unused and untested, though, and that enabling it by default is risky, but I did some testing and benchmarks and it looks good to me. Without it, as I mentioned, there are cases that won't return any memory to the OS.
@oranagra thanks Oran, I guess it's safe to merge. Please could you rebase? Thank you.
Oh actually the conflict is trivial, I can take care of it.
While you're at it, I noticed I forgot to trim the old purge code from object.c; please trim. (Or wait a few minutes, I'll rebase and update the PR.)
I was already halfway there, so I pushed it now.
:-) Thanks
Thanks @oranagra, merged.
Will this PR be backported to 5.0?
purge jemalloc after flush, and enable background purging thread
jemalloc 5 doesn't immediately release memory back to the OS; instead there's a decaying mechanism, which doesn't work when there's no traffic (no allocations).
This is most evident when there's no traffic after flushdb: the RSS will remain high.
P.S. I did many tests to check for regressions (which is how I found 3 above).
When we use a non-async flushdb it already blocks Redis for so long that purging (which is what would happen with jemalloc 4) doesn't matter.
With flushdb async, after populating a large db and running traffic, I couldn't observe any performance regression.
And the most I've seen the jemalloc thread use is some 50% CPU, while Redis and the lazy-free thread each ate 99%.