Fix disconnects on service updates #2074
Related to moby/moby#30321: swarm removes the network while the container is still shutting down, without honouring `stop_grace_period`.

Signed-off-by: Yarek Tyshchenko <yarek.tyshchenko@awin.com>
Codecov Report
Coverage diff (master vs. #2074):
Coverage: ? vs. 40.43%
Files: ? vs. 138
Lines: ? vs. 22198
Branches: ? vs. 0
Hits: ? vs. 8975
Misses: ? vs. 11906
Partials: ? vs. 1317
@YarekTyshchenko we moved away from this approach to address the problem. Can you explain your use case?
@abhi The issue is that we see connections being cut during swarm service updates. This is what I think is happening:
The few connections that were being processed during the switchover failed, but what we expect is that connections won't be cut until the old container actually shuts down. The timeout for container shutdown honours `stop_grace_period`. I want to mention that this has nothing to do with what's happening inside the container, because if we trap the stop signal and do nothing, the connections still get cut off.
@YarekTyshchenko This was the original change I had proposed, and we have seen a lot of race conditions with this approach.
@abhi Hmm, I see. It was one of your patches that I took this change from. It fixes the issue, but I understand that this isn't the way now, so it may be better to close this PR. What is your view on how this problem should be addressed? I think this is more serious than people realise, as it doesn't show up in benchmarks where requests are returned instantly, which means that as soon as people deploy this with real workloads they will start seeing failures.
As the swarm router is acting as a load balancer, it needs to understand three states for each backend: Enabled, Draining, Disabled. Perhaps there is a way to get moby to do the idiomatic thing here and set containers to drain during shutdown. We can't be the only ones facing this problem? Thanks for looking at this.
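To make the three states concrete, here is a small sketch of a backend pool where a Draining backend receives no new connections but keeps its established ones. This is only an illustration of the idea, not how libnetwork or IPVS implements it; the type names, the round-robin policy, and the addresses are all assumptions.

```go
package main

import (
	"errors"
	"fmt"
)

// State is the lifecycle of a backend from the load balancer's view.
type State int

const (
	Enabled  State = iota // receives new connections
	Draining              // no new connections; established ones continue
	Disabled              // receives nothing; established connections are dropped
)

// Backend is a hypothetical load balancer target.
type Backend struct {
	Addr   string
	State  State
	Active int // established connections currently routed here
}

// Pool is a hypothetical round-robin set of backends.
type Pool struct {
	backends []*Backend
	next     int
}

var ErrNoBackend = errors.New("no enabled backend")

// Add registers a new backend in the Enabled state.
func (p *Pool) Add(addr string) *Backend {
	b := &Backend{Addr: addr, State: Enabled}
	p.backends = append(p.backends, b)
	return b
}

// Pick chooses the next Enabled backend for a NEW connection.
func (p *Pool) Pick() (*Backend, error) {
	for i := 0; i < len(p.backends); i++ {
		b := p.backends[(p.next+i)%len(p.backends)]
		if b.State == Enabled {
			p.next = (p.next + i + 1) % len(p.backends)
			b.Active++
			return b, nil
		}
	}
	return nil, ErrNoBackend
}

// Forward reports whether traffic on an ESTABLISHED connection to b
// is still delivered.
func (b *Backend) Forward() bool {
	return b.State != Disabled && b.Active > 0
}

// Disable removes the backend and drops whatever is still connected.
func (b *Backend) Disable() {
	b.State = Disabled
	b.Active = 0
}

func main() {
	p := &Pool{}
	old := p.Add("10.0.0.2:80") // assumed address of the old task
	p.Add("10.0.0.3:80")        // assumed address of the new task

	conn, _ := p.Pick()  // a long-running request lands on the old task
	old.State = Draining // the update starts: stop sending new work

	fmt.Println("established connection still forwarded:", conn.Forward())
	next, _ := p.Pick()
	fmt.Println("new connection goes to:", next.Addr)

	old.Disable() // only after the old task has actually stopped
	fmt.Println("forwarded after disable:", conn.Forward())
}
```

The behaviour reported in this thread corresponds to jumping straight from Enabled to Disabled, which is why in-flight requests fail during an update.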
@YarekTyshchenko we do recognize this is an issue. With the current design we will probably end up breaking one of the scenarios. Thank you for bringing this up. We will post an update when we have a cleaner solution that covers all 3 scenarios. Stay tuned.
@YarekTyshchenko just to give you an update on this. The issue is in the IPVS behavior: when the service is removed, it looks like IPVS also stops forwarding packets for already established connections.
@fcrisciani Fantastic! I'm looking forward to testing the patch.