Conversation
Force-pushed from 08cd1f1 to 4da2fe7.
I'll keep working on the fast path improvements for the next few days. Will come back and re-open this next week.
Reopening this as I'm switching back to this part.
Force-pushed from 36e48e0 to 63b9457.
Force-pushed from 26370ce to a17e515.
Force-pushed from a17e515 to e3020dc.
Still missing stats, tests and documentation, but the design is pretty much complete. I feel we should discuss a bit before going further. Split into two commits:
Any input? Or should I proceed with the remaining details (adding stats, tests and doc)?
Force-pushed from e3020dc to e510258.
Still updating. Will ping again when ready for review.
Force-pushed from 432011d to f83f2e4.
@jasone : this should be ready for more reviews. Some changes regarding the feedback: 1) I simplified the interval search algorithm to try 1 point at a time, instead of 3; and 2) I ended up not abbreviating "background_thread". Other feedback was all addressed accordingly. Also, I squashed the 2 commits into 1. From checking the purging progress and intervals / predictions locally with t-test1, I believe this is working as expected. Still missing tests, doc and some stats I'd like to add.
Force-pushed from f83f2e4 to d843493.
Force-pushed from c0ad80e to 3dbbeec.
Having not yet reviewed the latest version, I don't understand why this approach should be vulnerable to inactive arenas ending up with unpurged memory. In our earlier discussion we agreed that if an application frees memory that generates unused dirty memory, and that dirty memory would potentially go unpurged for too long due to the background thread's scheduled wakeup time being too far in the future, we could signal the thread so it could recalculate and shorten its sleep time. Unless there were some threshold below which we don't bother signaling, we would never sleep forever with unpurged dirty memory.
@jasone : this could only happen if the epoch stops advancing due to inactivity. As long as we advance the epoch, you are right that we signal the threads when reaching the threshold. But in a scenario where an arena frees a bunch of pages without advancing the epoch, and there are no active calls afterwards, it will cause the issue.
Ah, I see. I was suggesting that we do the check that decides whether to signal the background thread whenever we accrue unused dirty pages, rather than only at epoch advances.
Yeah that would partly solve the problem. The issue is that in that case, even if we wake up the thread, it won't really be able to "see" the newly freed pages, because the epoch hasn't passed yet so they're not recorded in the backlog. Also, checking the wakeup time of the background thread requires the thread mutex; I'm a bit concerned about taking the mutex eagerly on deallocation.
src/background_thread.c (outdated):

    decay_purge_interval_search(decay, lb, ub, &next_lb, &next_ub);
    lb = next_lb;
    ub = next_ub;
    assert(++n_search <= 5);
Can we make this generic, rather than hard-coded, as some function related to lg_floor(SMOOTHSTEP_NSTEPS) (and decay_time)?
src/background_thread.c (outdated):

    /* With the current setting, bisect 5 times reaches ~6 intervals. */
    unsigned n_search = 0;
    size_t next_lb, next_ub;
    while (ub > lb + 7) {
What is the intent of the magic number 7 here? If the intent is to terminate the binary search early to reduce computation, I'd argue that the termination condition should be based on how many elements fall within an acceptable time range, which differs according to both SMOOTHSTEP_NSTEPS and decay_time.
Just updated to remove the arbitrary number here. Please see new comments for the changes.
    /* Check if we need to signal the background thread early. */
    void
    background_thread_interval_check(tsdn_t *tsdn, arena_t *arena,
This function appears to operate as I'd expect, but IMO it should be called whenever the number of unused dirty pages accrued since the last epoch advance or signal reaches a new high water mark.
Can we make the wakeup time atomic, so that no mutex acquisition is necessary? Re: the current epoch being in progress, for generality we should consider any associated unused dirty pages (i.e. the newly created ones) as being at the beginning of their decay. As long as we treat them as such, we never have the potential for sleeping forever with unpurged pages.
re: dirty pages now treated as in the current epoch, this sounds good to me. re: the mutex, we can make the wakeup time atomic, but we may still hit a race condition where we read a non-indefinite sleep time and decide not to wake the thread, while the background thread is in the process of deciding to sleep forever. The synchronization feels pretty vulnerable with only atomics (right now it needs both the decay and thread mutexes). Also, we could miss the signal without holding the mutex while sending. Do you think we should still check the background thread's interval eagerly? My concern is either mutex overhead, or racy logic without mutexes.
I think we can try this: So this handles the inactive-arena issue lock-free, and takes the lock only when necessary. Does this sound reasonable to you @jasone ?
Force-pushed from abcfa8f to 14f2961.
@jasone: just updated as we discussed. Added an atomic boolean to indicate whether there is a wakeup scheduled (the wakeup time is 64-bit, which would require a 64-bit atomic if we switched that instead). This is added in a separate commit "Check for background thread inactivity on extents_dalloc." Now I believe the inactivity issue is taken care of: either the application thread will see that the background thread is sleeping forever, or the background thread will see the dirty pages updated by the application threads.
jasone left a comment:
I think this is ready to go other than addressing the comments I made earlier today in arena_decay_compute_purge_interval_impl(). Thanks for putting a ton of effort into polishing this, even after some of my design suggestions were total dead ends.
    background_thread_wakeup_time_set(tsdn_t *tsdn, background_thread_info_t *info,
        uint64_t wakeup_time) {
            malloc_mutex_assert_owner(tsdn, &info->mtx);
            if (wakeup_time == BACKGROUND_THREAD_INDEFINITE_SLEEP) {
Easier to read?:

    atomic_store_b(&info->indefinite_sleep,
        wakeup_time == BACKGROUND_THREAD_INDEFINITE_SLEEP, ATOMIC_RELEASE);
src/background_thread.c (outdated):

    if (info->npages_to_purge_new > BACKGROUND_THREAD_NPAGES_THRESHOLD) {
        should_signal = true;
    } else if (unlikely(background_thread_indefinite_sleep(info))) {
        if (extents_npages_get(&arena->extents_dirty) > 0 ||
Maybe combine this conditional with the enclosing one? I don't have strong feelings either way.
I did start with this branch combined, but it looked pretty strange and hard to interpret. Plus background_thread_indefinite_sleep is kind of more special, so I feel the current way is a bit better.
Thanks for the quick response and all the feedback @jasone! Yeah I'm still tweaking the search algorithm in
Force-pushed from 93033e0 to 38c58d4.
@jasone : Just updated the diff again. Main changes --
Force-pushed from 38c58d4 to 2648ed0.
src/background_thread.c (outdated):

    size_t lb = BACKGROUND_THREAD_MIN_INTERVAL_NS / decay_interval_ns;
    size_t ub = SMOOTHSTEP_NSTEPS;
    /* Decay deadline offset is up to 2 decay intervals. */
I don't understand this comment, nor the reason for setting lb to at least 2. Is this another case of a magic number that only makes sense under default conditions, or perhaps only for limited numbers of dirty pages?
Oh, this is because we set the epoch deadline to be 1 to 2 intervals ahead here:
https://github.com/jemalloc/jemalloc/blob/dev/src/arena.c#L598
If we don't set the minimum to 2 intervals, we may not advance the epoch by the next wakeup, in which case there won't be much useful work to do.
How about we change the comment to:

    /* Minimal 2 intervals to ensure reaching next epoch deadline. */
Ahh, got it. The comment update looks good to me.
Cool. I'll make a couple small tweaks and land this soon.
Added opt.background_thread to enable background threads, which currently handle purging. When enabled, decay ticks will not trigger purging (which is left to the background threads). We limit the max number of threads to the number of CPUs. When percpu arena is enabled, we set CPU affinity for the background threads as well. The sleep interval of background threads is dynamic, determined by computing the number of pages to purge in the future (based on the backlog).
To avoid background threads sleeping forever with idle arenas, we eagerly check background threads' sleep time after extents_dalloc, and signal the thread if necessary.
Force-pushed from 2648ed0 to 5bd0bad.
Added option "background_thread" to enable background async threads, which currently handle purging.
This resolves #149.