Skip to content

Add deadlock detection#1

Closed
Zoxc wants to merge 4 commits intocratefrom
rustc
Closed

Add deadlock detection#1
Zoxc wants to merge 4 commits intocratefrom
rustc

Conversation

@Zoxc
Copy link
Owner

@Zoxc Zoxc commented May 16, 2018

No description provided.

}

#[inline]
pub fn block(&self, deadlock_handler: &Option<Box<DeadlockHandler>>) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer this to be called something like mark_blocked().

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed.

}

#[inline]
pub fn unblock(&self) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, mark_unblocked() seems clearer to me.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed.

Copy link

@nikomatsakis nikomatsakis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good, but what do you think about simplifying the inferface?

What I have mind is this:

We always get a callback whenever threads go to sleep. Then the "mark-blocked" logic (and, I suppose, the "external waiter" logic) can move into rustc — with each callback, we track the number of active threads, and if we ever find that the number of active threads equals the number of blocks threads, we recover somehow.

The disadvantage I see to this is that we will get more callbacks. The advantage is that the Rayon API is quite a bit simpler (just a callback).

}

#[inline]
pub fn add_external_waiter(&self) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to see a comment on what an "external waiter" is

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added.

//! Code that decides when workers should go to sleep. See README.md
//! for an overview.

use DeadlockHandler;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should document the changes here in the README.md file for this module

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like it could just go in a little section at the end

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let worker_thread = WorkerThread::current();
assert!(injected && !worker_thread.is_null());
let worker_thread = &*worker_thread;
worker_thread.registry.sleep.add_external_waiter();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really understand what an "external waiter" is

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I ... guess I maybe do. It's just some external thread that is blocked waiting on Rayon to finish up?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the reason to have this? Like, wouldn't we want to be called back regardless if the worker threads have deadlocked?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we did not have this, we'd immediately detect a deadlock when a thread pool is created, as all the workers are idle.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I wrote on IRC, I am wondering now if we can remove the concept of "external waiter", and instead just only issue a callback when there is at least one blocked thread.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've removed the external waiter counter.

@Zoxc
Copy link
Owner Author

Zoxc commented May 21, 2018

This looks good, but what do you think about simplifying the inferface?

That doesn't seem beneficial to me. It leaks more internal details about Rayon (how many external waiter there are, when thread go to sleep), reduces performance due to callbacks. You wouldn't be able to use the mutex in Sleep, which means you'd need an additional mutex on the rustc side. I don't find more mutexes helpful for reasoning about code.

Copy link

@nikomatsakis nikomatsakis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@nikomatsakis
Copy link

I'm happy with this new interface — it seems "sufficiently minimal". That is, informing Rayon when you are about to block the thread seems pretty reasonable, and getting a callback when there is no work other than your blocked threads seems good. It occurs to me that calling this deadlock is perhaps overstating the case -- it might be of course that one of your threads is blocked on I/O or something. But we can leave the name for now.

@Zoxc Zoxc closed this Jun 6, 2018
Zoxc pushed a commit that referenced this pull request Jan 5, 2020
615: Implement RFC #1: FIFO spawns r=nikomatsakis a=cuviper

This implements rayon-rs/rfcs#1, adding `spawn_fifo`, `scope_fifo`, and `ScopeFifo`, and deprecating the `breadth_first` flag.

Fixes rayon-rs#590.
Closes rayon-rs#601.

Co-authored-by: Josh Stone <cuviper@gmail.com>
Zoxc pushed a commit that referenced this pull request Jan 5, 2020
These adaptors consume may many elements before deferring to a base
folder's fullness checks, and so they need to be performed
manually. For the `filter`s, there's no way to do it manually (rayon-rs#632),
so the specialisations just have to be removed. For `fold` and
`find_any` this can be done with a `take_while`.

This extends the octillion tests to confirm this behaviour.

This makes a program like the following slightly slower compared to
the direct `consume_iter` without a check, but it's still faster than
the non-specialized form.

```
extern crate test;
extern crate rayon;
use rayon::prelude::*;

fn main() {
    let count = (0..std::u32::MAX)
        .into_par_iter()
        .map(test::black_box)
        .find_any(|_| test::black_box(false));
    println!("{:?}", count);
}
```

```
$ hyperfine ./find-original ./find-no-check ./find-check
Benchmark #1: ./find-original
  Time (mean ± σ):     627.6 ms ±  25.7 ms    [User: 7.130 s, System: 0.014 s]
  Range (min … max):   588.4 ms … 656.4 ms    10 runs

Benchmark #2: ./find-no-check
  Time (mean ± σ):     481.5 ms ±  10.8 ms    [User: 5.415 s, System: 0.013 s]
  Range (min … max):   468.9 ms … 498.2 ms    10 runs

Benchmark rayon-rs#3: ./find-check
  Time (mean ± σ):     562.3 ms ±  11.8 ms    [User: 6.363 s, System: 0.013 s]
  Range (min … max):   542.5 ms … 578.2 ms    10 runs
```

(find-original = without specialization, find-no-check = custom
`consume_iter` without `take_while`, find-check = this commit)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants