Skip to content

Resolved race condition#396

Merged
rubyist merged 1 commit intogit-lfs:masterfrom
michael-k:race-condition
Jun 15, 2015
Merged

Resolved race condition#396
rubyist merged 1 commit intogit-lfs:masterfrom
michael-k:race-condition

Conversation

@michael-k
Copy link
Contributor

To reproduce the deadlock, put runtime.Gosched() right after workersReady <- 1. The q.authCond.Signal() is the being sent without any worker waiting at q.authCond.Wait(). Therefore all workers block at q.authCond.Wait() and the main goroutine blocks at wg.Wait() (deadlock).

Use case 1 q.transferables is empty:

<-workersReady is not (endlessly) blocking, because the additional go func closes the channel. The q.authCond.Signal() does not do anything.

Use case 2 q.transferables is not empty:

<-workersReady in combination with q.authCond.L.Lock() blocks until at least one worker is waiting (q.authCond.Wait()). Therefore the q.authCond.Signal() signals one of the goroutines to
run.

The size of the channel workersReady makes sure that every worker that passed q.authCond.L.Lock() before the main goroutine does not block at workersReady <- 1.

@michael-k michael-k mentioned this pull request Jun 15, 2015
To reproduce the deadlock, put `runtime.Gosched()` right after
`workersReady <- 1`.  The `q.authCond.Signal()` is the being sent
without any worker waiting at `q.authCond.Wait()`.  Therefore all
workers block at `q.authCond.Wait()` and the main goroutine blocks at
`wg.Wait()` (deadlock).

Use case 1, `q.transferables` is empty:

  `<-workersReady` is not (endlessly) blocking, because the additional
  `go func` closes the channel.  The `q.authCond.Signal()` does not do
  anything.

Use case 2, `q.transferables` is not empty:

  `<-workersReady` in combination with `q.authCond.L.Lock()` blocks
  until at least one worker is waiting (`q.authCond.Wait()`).
  Therefore the `q.authCond.Signal()` signals one of the goroutines to
  run.

  The size of the channel `workersReady` makes sure that every worker
  that passed `q.authCond.L.Lock()` before the main goroutine does not
  block at `workersReady <- 1`.
@michael-k
Copy link
Contributor Author

After the merge, all PR tests that timed out should be restarted.

@rubyist
Copy link
Contributor

rubyist commented Jun 15, 2015

Nice, that looks legit to me 👍

rubyist added a commit that referenced this pull request Jun 15, 2015
@rubyist rubyist merged commit 8b7e650 into git-lfs:master Jun 15, 2015
@michael-k michael-k deleted the race-condition branch June 15, 2015 22:15
@michael-k
Copy link
Contributor Author

Just for reference, here is the commit causing the deadlock: 5619c5c

@technoweenie technoweenie mentioned this pull request Jun 16, 2015
38 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants