cri/server/podsandbox: disable event subscriber #12400
Conversation
Review comment on the diff (after `eventMonitor.Start()` and `c.eventMonitor = eventMonitor`), about the line `// no need to subscribe exit event for pause container`:

Suggested change:
- `// no need to subscribe exit event for pause container`
+ `// no need to subscribe exit event for pause container, because ___`
Thanks! Updated.
@fuweid I have questions regarding the "ttrpc: closed" error.
Thanks!
Why not set the ID correctly in the code instead? In issue #12344 there was no leak in version 1.7, so I don't think this is the root cause; it had run well for a long time. In version 2.x, the cause of the error may have been missing information when deleting the task and shim, which resulted in the RPC not being sent. That is also why I submitted PR 12397.
@henry118 @yylt I need to run that test tonight to confirm that the change is valid. The root cause is two concurrent task deletions, which @ningmingxiao mentioned before. Why do two concurrent task deletions cause the leak? Updated (2025-10-24): please check my PR description.
It was introduced by the podsandbox change in 2.0.
This is the part I don't quite understand. Did the shutdown succeed?
Does the shutdown ever close the client-side ttrpc connection? I don't see that in the following code: containerd/core/runtime/v2/shim.go, lines 497 to 505 (commit 3d0df86).
I think I finally get it. The connection was closed by the following shim instance deletion: containerd/core/runtime/v2/shim.go, lines 566 to 578 (commit 3d0df86).
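To make the distinction concrete, here is a minimal sketch in Go. All names (`conn`, `shimClient`, `Shutdown`, `deleteInstance`) are hypothetical stand-ins, not the actual containerd code: the shutdown RPC only asks the shim to exit over the existing connection, while the client-side ttrpc connection is closed separately in the shim-instance deletion path, which is why a later call can fail with `ttrpc: closed`.

```go
package main

import (
	"context"
	"fmt"
)

// conn stands in for the client-side ttrpc connection.
type conn struct{ closed bool }

func (c *conn) Close() error { c.closed = true; return nil }

// shimClient stands in for the task/shim service client held by containerd.
// Illustrative only.
type shimClient struct{ conn *conn }

// Shutdown sends the shutdown RPC. It does not close the client-side
// connection; if the connection was already closed elsewhere, the call
// fails with a ttrpc-closed style error.
func (s *shimClient) Shutdown(ctx context.Context) error {
	if s.conn.closed {
		return fmt.Errorf("ttrpc: closed")
	}
	// ... send the shutdown request over the connection ...
	return nil
}

// deleteInstance mirrors the shim-instance deletion path: this is where
// the client-side connection is actually closed and state is cleaned up.
func (s *shimClient) deleteInstance(ctx context.Context) error {
	// ... send task.Delete, remove bundle/state dir ...
	return s.conn.Close()
}

func main() {
	s := &shimClient{conn: &conn{}}
	ctx := context.Background()
	_ = s.deleteInstance(ctx)    // closes the connection
	fmt.Println(s.Shutdown(ctx)) // prints "ttrpc: closed"
}
```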
This shutdown is never sent to the shim. The ttrpc-closed error may occur because the shim dir was deleted. @henry118
We have an individual goroutine for each sandbox container. If there is
any error in the handler, that goroutine will put the event into the
backoff queue. So we don't need the event subscriber for podsandbox;
otherwise, there would be two goroutines cleaning up the sandbox
container.
```
>>>> From EventMonitor
time="2025-10-23T19:30:59.626254404Z" level=debug msg="Received containerd event timestamp - 2025-10-23 19:30:59.624494674 +0000 UTC, namespace - \"k8s.io\", topic - \"/tasks/exit\""
time="2025-10-23T19:30:59.626301912Z" level=debug msg="TaskExit event in podsandbox handler container_id:\"22e15114133e4d461ab380654fb76f3e73d3e0323989c422fa17882762979ccf\" id:\"22e15114133e4d461ab380654fb76f3e73d3e0323989c422fa17882762979ccf\" pid:203121 exit_status:137 exited_at:{seconds:1761247859 nanos:624467824}"
>>>> If EventMonitor handles the task exit first, it closes the ttrpc
connection, and then waitSandboxExit can encounter the ttrpc-closed error
time="2025-10-23T19:30:59.688031150Z" level=error msg="failed to delete task" error="ttrpc: closed" id=22e15114133e4d461ab380654fb76f3e73d3e0323989c422fa17882762979ccf
```
If both task.Delete calls fail but the shim has already been shut down, it
could trigger a new task.Exit event sent by cleanupAfterDeadShim. This would
result in three events in the EventMonitor's backoff queue, which is unnecessary
and could cause confusion due to duplicate events.
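For illustration, here is a minimal, self-contained sketch of the duplicated cleanup. The names (`fakeTask`, the "eventMonitor"/"waitSandboxExit" labels) are hypothetical stand-ins, not the CRI plugin code: two goroutines react to the same exit and both call Delete; only one can succeed, and the loser ends up requeueing an event.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errNotFound = errors.New("cannot delete a deleted process: not found")

// fakeTask lets Delete succeed exactly once, like a real task record.
type fakeTask struct {
	mu      sync.Mutex
	deleted bool
}

func (t *fakeTask) Delete() error {
	t.mu.Lock()
	defer t.mu.Unlock()
	if t.deleted {
		return errNotFound
	}
	t.deleted = true
	return nil
}

func main() {
	task := &fakeTask{}
	var wg sync.WaitGroup

	// Two cleanup paths reacting to the same task exit:
	// the event subscriber and the per-sandbox wait goroutine.
	for _, who := range []string{"eventMonitor", "waitSandboxExit"} {
		wg.Add(1)
		go func(who string) {
			defer wg.Done()
			if err := task.Delete(); err != nil {
				// The losing path would requeue the event into its
				// backoff queue, producing the extra/duplicate events.
				fmt.Printf("%s: delete failed: %v\n", who, err)
				return
			}
			fmt.Printf("%s: delete succeeded\n", who)
		}(who)
	}
	wg.Wait()
}
```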
The worst-case scenario caused by two concurrent task.Delete calls is a shim
leak. The timeline for this scenario is as follows:
| Timestamp | Component | Action | Result |
| ------ | ----------- | -------- | -------- |
| T1 | EventMonitor | Sends `task.Delete` | Marked as Req-1 |
| T2 | waitSandboxExit | Sends `task.Delete` | Marked as Req-2 |
| T3 | containerd-shim | Handles Req-2 | Container transitions from stopped to deleted |
| T4 | containerd-shim | Handles Req-1 | Fails - container already deleted<br>Returns error: `cannot delete a deleted process: not found` |
| T5 | EventMonitor | Receives `not found` error | - |
| T6 | EventMonitor | Sends `shim.Shutdown` request | No-op (active container record still exists) |
| T7 | EventMonitor | Closes ttrpc connection | Clean container state dir |
| T8 | containerd-shim | Finishes handling Req-2 | Removes container record from memory |
| T9 | waitSandboxExit | Receives error | Error: `ttrpc: closed` |
| T10 | waitSandboxExit | Sends `shim.Shutdown` request | Fails (connection already closed) |
| T11 | waitSandboxExit | Closes ttrpc connection | No-op (already closed) |
The containerd-shim is still running: shim.Shutdown was sent at T6,
before Req-2 finished at T8, so the shim still had an active container
record and treated the shutdown as a no-op. And because the container's
state dir was deleted at T7, containerd is unable to clean the shim up
after it restarts.
We should avoid concurrent task.Delete calls here.
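As a general illustration of the single-owner idea, here is a sketch of the pattern only, not the code changed by this PR (which instead removes the duplicate subscriber). The `sandboxCleanup` type is hypothetical; it funnels cleanup through `sync.Once` so the delete/shutdown sequence runs at most once no matter how many exit notifications arrive.

```go
package main

import (
	"fmt"
	"sync"
)

// sandboxCleanup guards the delete/shutdown sequence so it runs at most once.
type sandboxCleanup struct {
	once sync.Once
	err  error
}

func (c *sandboxCleanup) run(cleanup func() error) error {
	c.once.Do(func() { c.err = cleanup() })
	return c.err
}

func main() {
	c := &sandboxCleanup{}
	var wg sync.WaitGroup
	for i := 0; i < 2; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			err := c.run(func() error {
				// In the real flow this would be task.Delete followed by
				// shim.Shutdown and closing the ttrpc connection.
				fmt.Println("cleaning up sandbox container (runs once)")
				return nil
			})
			fmt.Printf("caller %d observed err=%v\n", i, err)
		}(i)
	}
	wg.Wait()
}
```

Removing the event subscriber for podsandbox achieves the same outcome here: only the per-sandbox goroutine owns the cleanup, so no guard against a second caller is needed.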
I also added a `shutdown` subcommand to `ctr shim` for debugging.
Fixes: containerd#12344
Fixes: containerd#12404
Signed-off-by: Wei Fu <fuweid89@gmail.com>
@henry118 @yylt @ningmingxiao I've updated the description with a timeline table. I hope this addresses your questions.
Thanks @fuweid. Good job on the investigation and explanation of the problem. It all makes sense now. Also agreed that the task struct is not designed to be reentrant, so avoiding concurrent access to the object is a better approach. LGTM.
mikebrow left a comment:
LGTM.. like the new shutdown command..