Description
Hello, I found a possible goroutine leak in the function clusterLeave():
moby/libnetwork/networkdb/cluster.go
Line 222 in ceefb7d
func (nDB *NetworkDB) clusterLeave() error {
	mlist := nDB.memberlist
	if err := nDB.sendNodeEvent(NodeEventTypeLeave); err != nil {
		log.G(context.TODO()).Errorf("failed to send node leave: %v", err)
	}
	if err := mlist.Leave(time.Second); err != nil {
		return err
	}
	// cancel the context
	nDB.cancelCtx()
	for _, t := range nDB.tickers {
		t.Stop()
	}
	return mlist.Shutdown()
}
If mlist.Leave() returns an error, the function returns early and the nDB.cancelCtx() call below it is never executed.
moby/libnetwork/networkdb/cluster.go
Lines 229 to 234 in ceefb7d
if err := mlist.Leave(time.Second); err != nil {
	return err
}
// cancel the context
nDB.cancelCtx()
As a result, every goroutine running triggerFunc() stays blocked forever on <-nDB.ctx.Done(), so those goroutines leak.
moby/libnetwork/networkdb/cluster.go
Line 176 in ceefb7d
go nDB.triggerFunc(trigger.interval, t.C, trigger.fn)
Blocking position:
moby/libnetwork/networkdb/cluster.go
Lines 251 to 258 in ceefb7d
for {
	select {
	case <-C:
		f()
	case <-nDB.ctx.Done():
		return
	}
}
Reproduce
I reproduced the bug with goleak.
First, I changed the condition from err != nil to err == nil. I don't know how to make mlist.Leave() actually return an error, so this change only forces the early return err path to run. I'm not sure whether the change has other side effects.
Original:
if err := mlist.Leave(time.Second); err != nil {
	return err
}
After the modification:
if err := mlist.Leave(time.Second); err == nil {
	return err
}
Then I ran goleak against the test functions that exercise this function.
moby/libnetwork/networkdb/networkdb_test.go
Line 180 in ceefb7d
func TestNetworkDBSimple(t *testing.T) {
Like this:

The result shows a leaked goroutine blocked at <-nDB.ctx.Done().
Expected behavior
No response
docker version
latest
docker info
latest
Additional Info
In short, I think the bug is that clusterLeave() can return without calling the cancel function (nDB.cancelCtx()). I have tried to describe it in detail.