Thread kill can be interrupted by another interrupt #8501

@headius

In ruby/net-http#197 I patched an issue in the net-http tests where a server thread could get stuck in a loop by being "double-interrupted": killed, and then having its IO closed before the kill could start to propagate. JRuby runs these operations in parallel, so internally the following can happen:

  • Main thread kills blocked thread.
  • Blocked thread wakes up and processes the kill request, clearing its interrupt queue but not yet removing itself from the IO blocked thread list.
  • The main thread proceeds to close the IO; the close sees that the thread is still listed as blocked and issues a second interrupt to raise IOError.
  • As the thread propagates the kill, it may run Ruby code and check for interrupts again; it sees the raise interrupt and propagates that instead of the kill.
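The sequence above can be sketched from the Ruby side roughly as follows. This is a minimal illustration, not the net-http test itself: the pipe, the `outcome` variable, and the rescue are assumptions added to make the result observable. Which interrupt wins depends on the runtime and on timing, so the sketch records what happened rather than predicting it.

```ruby
# Hypothetical reproduction sketch: kill a thread blocked on IO, then
# immediately close that IO from the main thread.
reader, writer = IO.pipe   # writer stays open so the read below blocks
outcome = nil              # stays nil if the kill wins

t = Thread.new do
  begin
    reader.read            # blocks until interrupted
  rescue IOError => e
    outcome = e            # the close's interrupt was delivered instead of the kill
  end
end

Thread.pass until t.status == "sleep"  # wait for the thread to block in read
t.kill                                 # first interrupt: kill
reader.close                           # may issue a second interrupt (IOError)
t.join

puts(outcome ? "close interrupt won: #{outcome.class}" : "kill won")
writer.close
```

If the IOError is delivered and rescued, as in a server loop that rescues and retries, the kill is lost and the thread keeps running, which is how the test thread got stuck.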

These are tricky things to coordinate because there's a lot of shared state here: the interrupt queue and test bits, the IO's list of blocked threads, and the cleanup logic after the blocked IO call gets interrupted.

I am unsure if this is a bug exactly.

CRuby does not hit this race when closing IO, because the kill and close happen in quick succession and by the time the blocked thread acquires the GVL, both interrupts are already queued. The kill gets seen first, the queue is cleared, and the raise never happens.

But a similar case can be simulated in CRuby by putting a sleep in an ensure block, since ensure blocks are run when a thread is killed:

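in_ensure = false
t = Thread.new do
  sleep                                # block until interrupted
ensure
  in_ensure = true
  sleep                                # stands in for an interrupt check during kill propagation
end
Thread.pass until t.status == "sleep"  # wait for the thread to block
t.kill                                 # first interrupt: kill
Thread.pass until in_ensure            # wait until the kill is running the ensure block
t.raise                                # second interrupt: RuntimeError, sent mid-kill
t.join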

When run on CRuby, the raise will interrupt the sleep in ensure (simulating interrupt checks that may happen as the kill propagates) and cause the thread to now raise an exception rather than quietly being killed.
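To make the outcome explicit, the same script can be extended to record how the thread actually died. This is a sketch under assumptions: the `died_by` variable, the exception message, and the `report_on_exception = false` setting are additions for illustration, and the result may differ between runtimes.

```ruby
in_ensure = false
t = Thread.new do
  sleep
ensure
  in_ensure = true
  sleep
end
t.report_on_exception = false          # keep stderr quiet if the raise wins

Thread.pass until t.status == "sleep"
t.kill
Thread.pass until in_ensure
t.raise(RuntimeError, "second interrupt")

# join returns normally for a killed thread and re-raises the
# exception for a thread that terminated with one.
died_by = begin
  t.join
  :kill
rescue RuntimeError
  :raise
end

puts "thread died by: #{died_by}"
```

Per the description above, CRuby is expected to report the raise here, meaning the second interrupt replaced the kill.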

This case forces a race for CRuby, but it is an open question as to whether a thread that has been killed or "raised" can be forced to die in a different way given a second "raise" or "kill" call before it dies.
