Skip to content

Avoid unsynchronized access to scheduledTaskQueue in GlobalEventExecutor#10890

Merged
normanmaurer merged 1 commit intonetty:4.1from
ejona86:globaleventexecutor-race
Dec 24, 2020
Merged

Avoid unsynchronized access to scheduledTaskQueue in GlobalEventExecutor#10890
normanmaurer merged 1 commit intonetty:4.1from
ejona86:globaleventexecutor-race

Conversation

@ejona86
Copy link
Copy Markdown
Member

@ejona86 ejona86 commented Dec 23, 2020

Motivation:

A race detector discovered a data race in GlobalEventExecutor present in
netty 4.1.51.Final:

  Write of size 4 at 0x0000cea08774 by thread T103:
    #0 io.netty.util.internal.DefaultPriorityQueue.poll()Lio/netty/util/internal/PriorityQueueNode; DefaultPriorityQueue.java:113
    #1 io.netty.util.internal.DefaultPriorityQueue.poll()Ljava/lang/Object; DefaultPriorityQueue.java:31
    #2 java.util.AbstractQueue.remove()Ljava/lang/Object; AbstractQueue.java:113
    #3 io.netty.util.concurrent.AbstractScheduledEventExecutor.pollScheduledTask(J)Ljava/lang/Runnable; AbstractScheduledEventExecutor.java:133
    #4 io.netty.util.concurrent.GlobalEventExecutor.fetchFromScheduledTaskQueue()V GlobalEventExecutor.java:119
    #5 io.netty.util.concurrent.GlobalEventExecutor.takeTask()Ljava/lang/Runnable; GlobalEventExecutor.java:106
    #6 io.netty.util.concurrent.GlobalEventExecutor$TaskRunner.run()V GlobalEventExecutor.java:240
    #7 io.netty.util.internal.ThreadExecutorMap$2.run()V ThreadExecutorMap.java:74
    #8 io.netty.util.concurrent.FastThreadLocalRunnable.run()V FastThreadLocalRunnable.java:30
    #9 java.lang.Thread.run()V Thread.java:835
    #10 (Generated Stub) <null>

  Previous read of size 4 at 0x0000cea08774 by thread T110:
    #0 io.netty.util.internal.DefaultPriorityQueue.size()I DefaultPriorityQueue.java:46
    #1 io.netty.util.concurrent.GlobalEventExecutor$TaskRunner.run()V GlobalEventExecutor.java:263
    #2 io.netty.util.internal.ThreadExecutorMap$2.run()V ThreadExecutorMap.java:74
    #3 io.netty.util.concurrent.FastThreadLocalRunnable.run()V FastThreadLocalRunnable.java:30
    #4 java.lang.Thread.run()V Thread.java:835
    #5 (Generated Stub) <null>

The race is legit, but benign. To trigger it requires a TaskRunner to
begin exiting and set 'started' to false, more work to be scheduled
which starts a new TaskRunner, that work then needs to schedule
additional work which modifies 'scheduledTaskQueue', and then the
original TaskRunner checks 'scheduledTaskQueue'. But there is no danger
to this race as it can only produce a false negative in the condition
which causes the code to CAS 'started' which is thread-safe.

Modifications:

Delete problematic references to scheduledTaskQueue. The only way
scheduledTaskQueue could be modified since the last check is if another
TaskRunner is running, in which case the current TaskRunner doesn't
care.

Result:

Data-race free code, and a bit less code to boot.

Motivation:

A race detector discovered a data race in GlobalEventExecutor present in
netty 4.1.51.Final:

```
  Write of size 4 at 0x0000cea08774 by thread T103:
    #0 io.netty.util.internal.DefaultPriorityQueue.poll()Lio/netty/util/internal/PriorityQueueNode; DefaultPriorityQueue.java:113
    netty#1 io.netty.util.internal.DefaultPriorityQueue.poll()Ljava/lang/Object; DefaultPriorityQueue.java:31
    netty#2 java.util.AbstractQueue.remove()Ljava/lang/Object; AbstractQueue.java:113
    netty#3 io.netty.util.concurrent.AbstractScheduledEventExecutor.pollScheduledTask(J)Ljava/lang/Runnable; AbstractScheduledEventExecutor.java:133
    netty#4 io.netty.util.concurrent.GlobalEventExecutor.fetchFromScheduledTaskQueue()V GlobalEventExecutor.java:119
    netty#5 io.netty.util.concurrent.GlobalEventExecutor.takeTask()Ljava/lang/Runnable; GlobalEventExecutor.java:106
    netty#6 io.netty.util.concurrent.GlobalEventExecutor$TaskRunner.run()V GlobalEventExecutor.java:240
    netty#7 io.netty.util.internal.ThreadExecutorMap$2.run()V ThreadExecutorMap.java:74
    netty#8 io.netty.util.concurrent.FastThreadLocalRunnable.run()V FastThreadLocalRunnable.java:30
    netty#9 java.lang.Thread.run()V Thread.java:835
    netty#10 (Generated Stub) <null>

  Previous read of size 4 at 0x0000cea08774 by thread T110:
    #0 io.netty.util.internal.DefaultPriorityQueue.size()I DefaultPriorityQueue.java:46
    netty#1 io.netty.util.concurrent.GlobalEventExecutor$TaskRunner.run()V GlobalEventExecutor.java:263
    netty#2 io.netty.util.internal.ThreadExecutorMap$2.run()V ThreadExecutorMap.java:74
    netty#3 io.netty.util.concurrent.FastThreadLocalRunnable.run()V FastThreadLocalRunnable.java:30
    netty#4 java.lang.Thread.run()V Thread.java:835
    netty#5 (Generated Stub) <null>
```

The race is legit, but benign. To trigger it requires a TaskRunner to
begin exiting and set 'started' to false, more work to be scheduled
which starts a new TaskRunner, that work then needs to schedule
additional work which modifies 'scheduledTaskQueue', and then the
original TaskRunner checks 'scheduledTaskQueue'. But there is no danger
to this race as it can only produce a false negative in the condition
which causes the code to CAS 'started' which is thread-safe.

Modifications:

Delete problematic references to scheduledTaskQueue. The only way
scheduledTaskQueue could be modified since the last check is if another
TaskRunner is running, in which case the current TaskRunner doesn't
care.

Result:

Data-race free code, and a bit less code to boot.
@normanmaurer normanmaurer merged commit 4f6e62d into netty:4.1 Dec 24, 2020
@normanmaurer normanmaurer added this to the 4.1.57.Final milestone Dec 24, 2020
normanmaurer pushed a commit that referenced this pull request Dec 24, 2020
…tor (#10890)

Motivation:

A race detector discovered a data race in GlobalEventExecutor present in
netty 4.1.51.Final:

```
  Write of size 4 at 0x0000cea08774 by thread T103:
    #0 io.netty.util.internal.DefaultPriorityQueue.poll()Lio/netty/util/internal/PriorityQueueNode; DefaultPriorityQueue.java:113
    #1 io.netty.util.internal.DefaultPriorityQueue.poll()Ljava/lang/Object; DefaultPriorityQueue.java:31
    #2 java.util.AbstractQueue.remove()Ljava/lang/Object; AbstractQueue.java:113
    #3 io.netty.util.concurrent.AbstractScheduledEventExecutor.pollScheduledTask(J)Ljava/lang/Runnable; AbstractScheduledEventExecutor.java:133
    #4 io.netty.util.concurrent.GlobalEventExecutor.fetchFromScheduledTaskQueue()V GlobalEventExecutor.java:119
    #5 io.netty.util.concurrent.GlobalEventExecutor.takeTask()Ljava/lang/Runnable; GlobalEventExecutor.java:106
    #6 io.netty.util.concurrent.GlobalEventExecutor$TaskRunner.run()V GlobalEventExecutor.java:240
    #7 io.netty.util.internal.ThreadExecutorMap$2.run()V ThreadExecutorMap.java:74
    #8 io.netty.util.concurrent.FastThreadLocalRunnable.run()V FastThreadLocalRunnable.java:30
    #9 java.lang.Thread.run()V Thread.java:835
    #10 (Generated Stub) <null>

  Previous read of size 4 at 0x0000cea08774 by thread T110:
    #0 io.netty.util.internal.DefaultPriorityQueue.size()I DefaultPriorityQueue.java:46
    #1 io.netty.util.concurrent.GlobalEventExecutor$TaskRunner.run()V GlobalEventExecutor.java:263
    #2 io.netty.util.internal.ThreadExecutorMap$2.run()V ThreadExecutorMap.java:74
    #3 io.netty.util.concurrent.FastThreadLocalRunnable.run()V FastThreadLocalRunnable.java:30
    #4 java.lang.Thread.run()V Thread.java:835
    #5 (Generated Stub) <null>
```

The race is legit, but benign. To trigger it requires a TaskRunner to
begin exiting and set 'started' to false, more work to be scheduled
which starts a new TaskRunner, that work then needs to schedule
additional work which modifies 'scheduledTaskQueue', and then the
original TaskRunner checks 'scheduledTaskQueue'. But there is no danger
to this race as it can only produce a false negative in the condition
which causes the code to CAS 'started' which is thread-safe.

Modifications:

Delete problematic references to scheduledTaskQueue. The only way
scheduledTaskQueue could be modified since the last check is if another
TaskRunner is running, in which case the current TaskRunner doesn't
care.

Result:

Data-race free code, and a bit less code to boot.
@ejona86 ejona86 deleted the globaleventexecutor-race branch January 5, 2021 18:15
raidyue pushed a commit to raidyue/netty that referenced this pull request Jul 8, 2022
…tor (netty#10890)

Motivation:

A race detector discovered a data race in GlobalEventExecutor present in
netty 4.1.51.Final:

```
  Write of size 4 at 0x0000cea08774 by thread T103:
    #0 io.netty.util.internal.DefaultPriorityQueue.poll()Lio/netty/util/internal/PriorityQueueNode; DefaultPriorityQueue.java:113
    netty#1 io.netty.util.internal.DefaultPriorityQueue.poll()Ljava/lang/Object; DefaultPriorityQueue.java:31
    netty#2 java.util.AbstractQueue.remove()Ljava/lang/Object; AbstractQueue.java:113
    netty#3 io.netty.util.concurrent.AbstractScheduledEventExecutor.pollScheduledTask(J)Ljava/lang/Runnable; AbstractScheduledEventExecutor.java:133
    netty#4 io.netty.util.concurrent.GlobalEventExecutor.fetchFromScheduledTaskQueue()V GlobalEventExecutor.java:119
    netty#5 io.netty.util.concurrent.GlobalEventExecutor.takeTask()Ljava/lang/Runnable; GlobalEventExecutor.java:106
    netty#6 io.netty.util.concurrent.GlobalEventExecutor$TaskRunner.run()V GlobalEventExecutor.java:240
    netty#7 io.netty.util.internal.ThreadExecutorMap$2.run()V ThreadExecutorMap.java:74
    netty#8 io.netty.util.concurrent.FastThreadLocalRunnable.run()V FastThreadLocalRunnable.java:30
    netty#9 java.lang.Thread.run()V Thread.java:835
    netty#10 (Generated Stub) <null>

  Previous read of size 4 at 0x0000cea08774 by thread T110:
    #0 io.netty.util.internal.DefaultPriorityQueue.size()I DefaultPriorityQueue.java:46
    netty#1 io.netty.util.concurrent.GlobalEventExecutor$TaskRunner.run()V GlobalEventExecutor.java:263
    netty#2 io.netty.util.internal.ThreadExecutorMap$2.run()V ThreadExecutorMap.java:74
    netty#3 io.netty.util.concurrent.FastThreadLocalRunnable.run()V FastThreadLocalRunnable.java:30
    netty#4 java.lang.Thread.run()V Thread.java:835
    netty#5 (Generated Stub) <null>
```

The race is legit, but benign. To trigger it requires a TaskRunner to
begin exiting and set 'started' to false, more work to be scheduled
which starts a new TaskRunner, that work then needs to schedule
additional work which modifies 'scheduledTaskQueue', and then the
original TaskRunner checks 'scheduledTaskQueue'. But there is no danger
to this race as it can only produce a false negative in the condition
which causes the code to CAS 'started' which is thread-safe.

Modifications:

Delete problematic references to scheduledTaskQueue. The only way
scheduledTaskQueue could be modified since the last check is if another
TaskRunner is running, in which case the current TaskRunner doesn't
care.

Result:

Data-race free code, and a bit less code to boot.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants