fifo.Close(): prevent possible panic if fifo is nil#32
fifo.Close(): prevent possible panic if fifo is nil#32dmcgowan merged 1 commit intocontainerd:mainfrom
Conversation
|
@cpuguy83 @AkihiroSuda ptal |
| func (f *fifo) Close() (retErr error) { | ||
| for { | ||
| if f == nil { | ||
| return |
There was a problem hiding this comment.
This method can also be used before open(2) has returned and fifo was never opened.
Also not sure if we should
- return an error in this case (currently it would use
retErr, if set) - return
nil(which would unsetretErrif it was set before)
There was a problem hiding this comment.
I think the caller should check for nil.
There was a problem hiding this comment.
Could it become nil while we're in this for{} loop?
There was a problem hiding this comment.
I should hope not... if so then that would be the bug that should be fixed.
There was a problem hiding this comment.
Hmm... so I suspect that something like that actually happens (somehow); following that trace, the panic followed from this code; https://github.com/containerd/containerd/blob/269548fa27e0089a8b8278fc4fc781d7f65a939b/cio/io.go#L199-L206
func (c *cio) Close() error {
var lastErr error
for _, closer := range c.closers {
if closer == nil {
continue
}
if err := closer.Close(); err != nil {
lastErr = err
}
}
return lastErr
}So there's already a nil check there. So only other explanation would be there there's some concurrency happening, in which case do we need locking?
There was a problem hiding this comment.
could have been explain by the infamous interface is nil problem, either we make sure that we dont pass nil io.Closer to cio, or do a type check ( yuck), or we check inside fifo.Close for nil.
There was a problem hiding this comment.
Ah, sorry, missed your comment here, @dqminh - do you mean we should use a different fix, or is the current "fix" ok?
There was a problem hiding this comment.
i think the best case is that we need to make sure the caller doesn't pass a nil interface as that can be nasty too if we call anything other than Close ( which is fixed in this way ). The current check we are doing here is not the end of the world, but ideally i don't think we need it at this place.
|
OK, looks like this is failing; |
I'm not sure if this is the right approach, and synchronisation should probably
be added elsewhere to fix the underlying issue.
Trying to prevent a panic that was seen on container restore in th docker daemon:
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x5586c892a7a4]
goroutine 420 [running]:
github.com/docker/docker/vendor/github.com/containerd/fifo.(*fifo).Close(0x0, 0x0, 0x0)
/go/src/github.com/docker/docker/vendor/github.com/containerd/fifo/fifo.go:208 +0x44
github.com/docker/docker/vendor/github.com/containerd/containerd/cio.(*cio).Close(0xc000d06f60, 0x5586cb5654d0, 0xc000d8e9e8)
/go/src/github.com/docker/docker/vendor/github.com/containerd/containerd/cio/io.go:203 +0x90
github.com/docker/docker/libcontainerd/remote.(*client).Restore.func1(0xc0008bf820, 0xc0008a2040)
/go/src/github.com/docker/docker/libcontainerd/remote/client.go:86 +0x5a
github.com/docker/docker/libcontainerd/remote.(*client).Restore(0xc00098e5b0, 0x5586cb61c7c0, 0xc000052088, 0xc0011b6500, 0x40, 0xc0008bf810, 0x5586cb05cf00, 0xffffffffffffffff, 0x0, 0x0, ...)
/go/src/github.com/docker/docker/libcontainerd/remote/client.go:107 +0x923
github.com/docker/docker/daemon.(*Daemon).restore.func3(0xc00079d9e0, 0xc000a38230, 0xc00000c1e0, 0xc00079d9a8, 0xc000d84f00, 0xc000d84ed0, 0xc000d84ea0, 0xc00128a280)
/go/src/github.com/docker/docker/daemon/daemon.go:351 +0x48a
created by github.com/docker/docker/daemon.(*Daemon).restore
/go/src/github.com/docker/docker/daemon/daemon.go:319 +0x4b3
If the fifo is nil, there's nothing to be done in Close(), so returning early
in that situation.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
bb32f37 to
80cf87d
Compare
|
Hmm... looks like there's some flakiness; re-pushed, and now it passes; could it be related to Go 1.14/ Go 1.15? |
|
Ran Tibor's script from https://gist.github.com/tiborvass/eb0a4054679a43aaca22690a7c4452ed on this repository, in case it's useful; Details |
|
@cpuguy83 @AkihiroSuda ptal |
|
@AkihiroSuda any thoughts on the discussion above? #32 (comment) (i.e.; would fixes be needed elsewhere (as well?) |
kzys
left a comment
There was a problem hiding this comment.
Looks good to me since it would fix a production issue. I'm in "the caller should check that" camp though :)
|
@thaJeztah Did this make it into a release? I'm still seeing these issues occasionally on v20.10.12, with containerd 1.4.12 7b11cfaabd73bb80907dd23182b9347b4245eb5d. I can't seem to find a way to resolve the issue without restarting the entire server, which is very costly. |
|
Not this was not in a release yet; v1.0.0...39bc37d |
I'm not sure if this is the right approach, and synchronisation should probably
be added elsewhere to fix the underlying issue.
Trying to prevent a panic that was seen on container restore in th docker daemon:
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x5586c892a7a4]
goroutine 420 [running]:
github.com/docker/docker/vendor/github.com/containerd/fifo.(*fifo).Close(0x0, 0x0, 0x0)
/go/src/github.com/docker/docker/vendor/github.com/containerd/fifo/fifo.go:208 +0x44
github.com/docker/docker/vendor/github.com/containerd/containerd/cio.(*cio).Close(0xc000d06f60, 0x5586cb5654d0, 0xc000d8e9e8)
/go/src/github.com/docker/docker/vendor/github.com/containerd/containerd/cio/io.go:203 +0x90
github.com/docker/docker/libcontainerd/remote.(*client).Restore.func1(0xc0008bf820, 0xc0008a2040)
/go/src/github.com/docker/docker/libcontainerd/remote/client.go:86 +0x5a
github.com/docker/docker/libcontainerd/remote.(*client).Restore(0xc00098e5b0, 0x5586cb61c7c0, 0xc000052088, 0xc0011b6500, 0x40, 0xc0008bf810, 0x5586cb05cf00, 0xffffffffffffffff, 0x0, 0x0, ...)
/go/src/github.com/docker/docker/libcontainerd/remote/client.go:107 +0x923
github.com/docker/docker/daemon.(*Daemon).restore.func3(0xc00079d9e0, 0xc000a38230, 0xc00000c1e0, 0xc00079d9a8, 0xc000d84f00, 0xc000d84ed0, 0xc000d84ea0, 0xc00128a280)
/go/src/github.com/docker/docker/daemon/daemon.go:351 +0x48a
created by github.com/docker/docker/daemon.(*Daemon).restore
/go/src/github.com/docker/docker/daemon/daemon.go:319 +0x4b3
If the fifo is nil, there's nothing to be done in Close(), so returning early
in that situation.
Backport: <containerd/fifo#32>
SUSE-Bugs: bsc#1200022
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
I'm not sure if this is the right approach, and synchronisation should probably
be added elsewhere to fix the underlying issue.
Trying to prevent a panic that was seen on container restore in th docker daemon:
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x5586c892a7a4]
goroutine 420 [running]:
github.com/docker/docker/vendor/github.com/containerd/fifo.(*fifo).Close(0x0, 0x0, 0x0)
/go/src/github.com/docker/docker/vendor/github.com/containerd/fifo/fifo.go:208 +0x44
github.com/docker/docker/vendor/github.com/containerd/containerd/cio.(*cio).Close(0xc000d06f60, 0x5586cb5654d0, 0xc000d8e9e8)
/go/src/github.com/docker/docker/vendor/github.com/containerd/containerd/cio/io.go:203 +0x90
github.com/docker/docker/libcontainerd/remote.(*client).Restore.func1(0xc0008bf820, 0xc0008a2040)
/go/src/github.com/docker/docker/libcontainerd/remote/client.go:86 +0x5a
github.com/docker/docker/libcontainerd/remote.(*client).Restore(0xc00098e5b0, 0x5586cb61c7c0, 0xc000052088, 0xc0011b6500, 0x40, 0xc0008bf810, 0x5586cb05cf00, 0xffffffffffffffff, 0x0, 0x0, ...)
/go/src/github.com/docker/docker/libcontainerd/remote/client.go:107 +0x923
github.com/docker/docker/daemon.(*Daemon).restore.func3(0xc00079d9e0, 0xc000a38230, 0xc00000c1e0, 0xc00079d9a8, 0xc000d84f00, 0xc000d84ed0, 0xc000d84ea0, 0xc00128a280)
/go/src/github.com/docker/docker/daemon/daemon.go:351 +0x48a
created by github.com/docker/docker/daemon.(*Daemon).restore
/go/src/github.com/docker/docker/daemon/daemon.go:319 +0x4b3
If the fifo is nil, there's nothing to be done in Close(), so returning early
in that situation.
Backport: <containerd/fifo#32>
SUSE-Bugs: bsc#1200022
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
I'm not sure if this is the right approach, and synchronisation should probably
be added elsewhere to fix the underlying issue.
Trying to prevent a panic that was seen on container restore in th docker daemon:
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x5586c892a7a4]
goroutine 420 [running]:
github.com/docker/docker/vendor/github.com/containerd/fifo.(*fifo).Close(0x0, 0x0, 0x0)
/go/src/github.com/docker/docker/vendor/github.com/containerd/fifo/fifo.go:208 +0x44
github.com/docker/docker/vendor/github.com/containerd/containerd/cio.(*cio).Close(0xc000d06f60, 0x5586cb5654d0, 0xc000d8e9e8)
/go/src/github.com/docker/docker/vendor/github.com/containerd/containerd/cio/io.go:203 +0x90
github.com/docker/docker/libcontainerd/remote.(*client).Restore.func1(0xc0008bf820, 0xc0008a2040)
/go/src/github.com/docker/docker/libcontainerd/remote/client.go:86 +0x5a
github.com/docker/docker/libcontainerd/remote.(*client).Restore(0xc00098e5b0, 0x5586cb61c7c0, 0xc000052088, 0xc0011b6500, 0x40, 0xc0008bf810, 0x5586cb05cf00, 0xffffffffffffffff, 0x0, 0x0, ...)
/go/src/github.com/docker/docker/libcontainerd/remote/client.go:107 +0x923
github.com/docker/docker/daemon.(*Daemon).restore.func3(0xc00079d9e0, 0xc000a38230, 0xc00000c1e0, 0xc00079d9a8, 0xc000d84f00, 0xc000d84ed0, 0xc000d84ea0, 0xc00128a280)
/go/src/github.com/docker/docker/daemon/daemon.go:351 +0x48a
created by github.com/docker/docker/daemon.(*Daemon).restore
/go/src/github.com/docker/docker/daemon/daemon.go:319 +0x4b3
If the fifo is nil, there's nothing to be done in Close(), so returning early
in that situation.
Backport: <containerd/fifo#32>
SUSE-Bugs: bsc#1200022
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
I'm not sure if this is the right approach, and synchronisation should probably
be added elsewhere to fix the underlying issue.
Trying to prevent a panic that was seen on container restore in th docker daemon:
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x5586c892a7a4]
goroutine 420 [running]:
github.com/docker/docker/vendor/github.com/containerd/fifo.(*fifo).Close(0x0, 0x0, 0x0)
/go/src/github.com/docker/docker/vendor/github.com/containerd/fifo/fifo.go:208 +0x44
github.com/docker/docker/vendor/github.com/containerd/containerd/cio.(*cio).Close(0xc000d06f60, 0x5586cb5654d0, 0xc000d8e9e8)
/go/src/github.com/docker/docker/vendor/github.com/containerd/containerd/cio/io.go:203 +0x90
github.com/docker/docker/libcontainerd/remote.(*client).Restore.func1(0xc0008bf820, 0xc0008a2040)
/go/src/github.com/docker/docker/libcontainerd/remote/client.go:86 +0x5a
github.com/docker/docker/libcontainerd/remote.(*client).Restore(0xc00098e5b0, 0x5586cb61c7c0, 0xc000052088, 0xc0011b6500, 0x40, 0xc0008bf810, 0x5586cb05cf00, 0xffffffffffffffff, 0x0, 0x0, ...)
/go/src/github.com/docker/docker/libcontainerd/remote/client.go:107 +0x923
github.com/docker/docker/daemon.(*Daemon).restore.func3(0xc00079d9e0, 0xc000a38230, 0xc00000c1e0, 0xc00079d9a8, 0xc000d84f00, 0xc000d84ed0, 0xc000d84ea0, 0xc00128a280)
/go/src/github.com/docker/docker/daemon/daemon.go:351 +0x48a
created by github.com/docker/docker/daemon.(*Daemon).restore
/go/src/github.com/docker/docker/daemon/daemon.go:319 +0x4b3
If the fifo is nil, there's nothing to be done in Close(), so returning early
in that situation.
Backport: <containerd/fifo#32>
SUSE-Bugs: bsc#1200022
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
I'm not sure if this is the right approach, and synchronisation should probably
be added elsewhere to fix the underlying issue.
Trying to prevent a panic that was seen on container restore in th docker daemon:
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x5586c892a7a4]
goroutine 420 [running]:
github.com/docker/docker/vendor/github.com/containerd/fifo.(*fifo).Close(0x0, 0x0, 0x0)
/go/src/github.com/docker/docker/vendor/github.com/containerd/fifo/fifo.go:208 +0x44
github.com/docker/docker/vendor/github.com/containerd/containerd/cio.(*cio).Close(0xc000d06f60, 0x5586cb5654d0, 0xc000d8e9e8)
/go/src/github.com/docker/docker/vendor/github.com/containerd/containerd/cio/io.go:203 +0x90
github.com/docker/docker/libcontainerd/remote.(*client).Restore.func1(0xc0008bf820, 0xc0008a2040)
/go/src/github.com/docker/docker/libcontainerd/remote/client.go:86 +0x5a
github.com/docker/docker/libcontainerd/remote.(*client).Restore(0xc00098e5b0, 0x5586cb61c7c0, 0xc000052088, 0xc0011b6500, 0x40, 0xc0008bf810, 0x5586cb05cf00, 0xffffffffffffffff, 0x0, 0x0, ...)
/go/src/github.com/docker/docker/libcontainerd/remote/client.go:107 +0x923
github.com/docker/docker/daemon.(*Daemon).restore.func3(0xc00079d9e0, 0xc000a38230, 0xc00000c1e0, 0xc00079d9a8, 0xc000d84f00, 0xc000d84ed0, 0xc000d84ea0, 0xc00128a280)
/go/src/github.com/docker/docker/daemon/daemon.go:351 +0x48a
created by github.com/docker/docker/daemon.(*Daemon).restore
/go/src/github.com/docker/docker/daemon/daemon.go:319 +0x4b3
If the fifo is nil, there's nothing to be done in Close(), so returning early
in that situation.
Backport: <containerd/fifo#32>
SUSE-Bugs: bsc#1200022
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
relates to docker/for-linux#1186
I'm not sure if this is the right approach, and synchronisation should probably be added elsewhere to fix the underlying issue.
Trying to prevent a panic that was seen on container restore in th docker daemon:
If the fifo is nil, there's nothing to be done in Close(), so returning early in that situation.