Add deadlock detection by Zoxc · Pull Request #1 · Zoxc/rayon

Zoxc · 2018-05-16T14:48:09Z

No description provided.

michaelwoerister · 2018-05-16T15:48:46Z

rayon-core/src/sleep/mod.rs

+    }
+
+    #[inline]
+    pub fn block(&self, deadlock_handler: &Option<Box<DeadlockHandler>>) {


I would prefer this to be called something like mark_blocked().

michaelwoerister · 2018-05-16T15:49:34Z

rayon-core/src/sleep/mod.rs

+    }
+
+    #[inline]
+    pub fn unblock(&self) {


Same here, mark_unblocked() seems clearer to me.

nikomatsakis

This looks good, but what do you think about simplifying the inferface?

What I have mind is this:

We always get a callback whenever threads go to sleep. Then the "mark-blocked" logic (and, I suppose, the "external waiter" logic) can move into rustc — with each callback, we track the number of active threads, and if we ever find that the number of active threads equals the number of blocks threads, we recover somehow.

The disadvantage I see to this is that we will get more callbacks. The advantage is that the Rayon API is quite a bit simpler (just a callback).

nikomatsakis · 2018-05-18T17:51:25Z

rayon-core/src/sleep/mod.rs

    }

+    #[inline]
+    pub fn add_external_waiter(&self) {


I'd like to see a comment on what an "external waiter" is

nikomatsakis · 2018-05-18T17:52:18Z

rayon-core/src/sleep/mod.rs

 //! Code that decides when workers should go to sleep. See README.md
 //! for an overview.

+use DeadlockHandler;


We should document the changes here in the README.md file for this module

It seems like it could just go in a little section at the end

I added a section on deadlock detection: https://github.com/Zoxc/rayon/blob/rustc/rayon-core/src/sleep/README.md#deadlock-detection

nikomatsakis · 2018-05-18T17:54:05Z

rayon-core/src/registry.rs

            let worker_thread = WorkerThread::current();
            assert!(injected && !worker_thread.is_null());
+            let worker_thread = &*worker_thread;
+            worker_thread.registry.sleep.add_external_waiter();


I don't really understand what an "external waiter" is

Oh, I ... guess I maybe do. It's just some external thread that is blocked waiting on Rayon to finish up?

What is the reason to have this? Like, wouldn't we want to be called back regardless if the worker threads have deadlocked?

If we did not have this, we'd immediately detect a deadlock when a thread pool is created, as all the workers are idle.

As I wrote on IRC, I am wondering now if we can remove the concept of "external waiter", and instead just only issue a callback when there is at least one blocked thread.

I've removed the external waiter counter.

Zoxc · 2018-05-21T01:19:36Z

This looks good, but what do you think about simplifying the inferface?

That doesn't seem beneficial to me. It leaks more internal details about Rayon (how many external waiter there are, when thread go to sleep), reduces performance due to callbacks. You wouldn't be able to use the mutex in Sleep, which means you'd need an additional mutex on the rustc side. I don't find more mutexes helpful for reasoning about code.

nikomatsakis

Looks good!

nikomatsakis · 2018-06-06T13:13:53Z

I'm happy with this new interface — it seems "sufficiently minimal". That is, informing Rayon when you are about to block the thread seems pretty reasonable, and getting a callback when there is no work other than your blocked threads seems good. It occurs to me that calling this deadlock is perhaps overstating the case -- it might be of course that one of your threads is blocked on I/O or something. But we can leave the name for now.

615: Implement RFC #1: FIFO spawns r=nikomatsakis a=cuviper This implements rayon-rs/rfcs#1, adding `spawn_fifo`, `scope_fifo`, and `ScopeFifo`, and deprecating the `breadth_first` flag. Fixes rayon-rs#590. Closes rayon-rs#601. Co-authored-by: Josh Stone <cuviper@gmail.com>

These adaptors consume may many elements before deferring to a base folder's fullness checks, and so they need to be performed manually. For the `filter`s, there's no way to do it manually (rayon-rs#632), so the specialisations just have to be removed. For `fold` and `find_any` this can be done with a `take_while`. This extends the octillion tests to confirm this behaviour. This makes a program like the following slightly slower compared to the direct `consume_iter` without a check, but it's still faster than the non-specialized form. ``` extern crate test; extern crate rayon; use rayon::prelude::*; fn main() { let count = (0..std::u32::MAX) .into_par_iter() .map(test::black_box) .find_any(|_| test::black_box(false)); println!("{:?}", count); } ``` ``` $ hyperfine ./find-original ./find-no-check ./find-check Benchmark #1: ./find-original Time (mean ± σ): 627.6 ms ± 25.7 ms [User: 7.130 s, System: 0.014 s] Range (min … max): 588.4 ms … 656.4 ms 10 runs Benchmark #2: ./find-no-check Time (mean ± σ): 481.5 ms ± 10.8 ms [User: 5.415 s, System: 0.013 s] Range (min … max): 468.9 ms … 498.2 ms 10 runs Benchmark rayon-rs#3: ./find-check Time (mean ± σ): 562.3 ms ± 11.8 ms [User: 6.363 s, System: 0.013 s] Range (min … max): 542.5 ms … 578.2 ms 10 runs ``` (find-original = without specialization, find-no-check = custom `consume_iter` without `take_while`, find-check = this commit)

Add deadlock detection

f541d2f

Zoxc mentioned this pull request May 16, 2018

Blocking Rayon queries rust-lang/rust#50699

Merged

michaelwoerister reviewed May 16, 2018

View reviewed changes

nikomatsakis approved these changes May 18, 2018

View reviewed changes

Zoxc added 2 commits May 27, 2018 07:14

Changes

73334eb

Remove external_waiters

77b0dc3

Zoxc force-pushed the rustc branch from fbe71ef to 77b0dc3 Compare June 2, 2018 15:06

nikomatsakis approved these changes Jun 6, 2018

View reviewed changes

Update version

43740b3

Zoxc closed this Jun 6, 2018

Conversation

Zoxc commented May 16, 2018

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

nikomatsakis left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Zoxc commented May 21, 2018

Uh oh!

nikomatsakis left a comment

Choose a reason for hiding this comment

Uh oh!

nikomatsakis commented Jun 6, 2018

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants