Graceful shutdown is not working as expected with default setup. #4002

@davem-git

Description:
I'm working on implementing envoy-gateway as a replacement for our nginx controller. I have a basic test setup: a pod that returns a JSON block when an endpoint is hit. Using k6 as the test suite, I set up the following test.

import http from 'k6/http';
import { sleep } from 'k6';

export const options = {
  stages: [
    { duration: '2m', target: 50 }, // ramp-up to 50 users
    { duration: '6m', target: 50 }, // stay at 50 users
    { duration: '2m', target: 0 },  // ramp-down to 0 users
  ],
};

export default function () {
  http.get('<url>/'); // <url> is a redacted placeholder for the gateway address
  sleep(1);
}

When I run this test and start a rollout restart of the envoy pods, I get the following errors:

WARN[0070] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49547-><valid public address>:443: read: connection reset by peer"
WARN[0070] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49570-><valid public address>:443: read: connection reset by peer"
WARN[0071] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49587-><valid public address>:443: read: connection reset by peer"
WARN[0080] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49573-><valid public address>:443: read: connection reset by peer"
WARN[0080] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49601-><valid public address>:443: read: connection reset by peer"
WARN[0080] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49594-><valid public address>:443: read: connection reset by peer"
WARN[0080] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49555-><valid public address>:443: read: connection reset by peer"
WARN[0080] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49595-><valid public address>:443: read: connection reset by peer"
WARN[0080] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49589-><valid public address>:443: read: connection reset by peer"
WARN[0082] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49572-><valid public address>:443: read: connection reset by peer"
WARN[0082] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49563-><valid public address>:443: read: connection reset by peer"
WARN[0083] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49566-><valid public address>:443: read: connection reset by peer"
WARN[0488] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49619-><valid public address>:443: read: connection reset by peer"
WARN[0489] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49636-><valid public address>:443: read: connection reset by peer"
WARN[0489] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49623-><valid public address>:443: read: connection reset by peer"
WARN[0489] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49667-><valid public address>:443: read: connection reset by peer"
WARN[0489] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49611-><valid public address>:443: read: connection reset by peer"
WARN[0489] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49604-><valid public address>:443: read: connection reset by peer"
WARN[0489] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49637-><valid public address>:443: read: connection reset by peer"
WARN[0489] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49649-><valid public address>:443: read: connection reset by peer"
WARN[0498] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49642-><valid public address>:443: read: connection reset by peer"
WARN[0498] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49624-><valid public address>:443: read: connection reset by peer"
WARN[0498] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49598-><valid public address>:443: read: connection reset by peer"
WARN[0498] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49830-><valid public address>:443: read: connection reset by peer"
WARN[0498] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49634-><valid public address>:443: read: connection reset by peer"
WARN[0498] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49617-><valid public address>:443: read: connection reset by peer"
WARN[0502] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49643-><valid public address>:443: read: connection reset by peer"
WARN[0502] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49651-><valid public address>:443: read: connection reset by peer"
WARN[0503] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49630-><valid public address>:443: read: connection reset by peer"
WARN[0503] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49660-><valid public address>:443: read: connection reset by peer"
WARN[0503] Request Failed                                error="Get \"<url>": read tcp 192.168.1.99:49613-><valid public address>:443: read: connection reset by peer"


When I do this on nginx I do not get these errors.

I added these to my custom proxy config and it seemed to fix the issue
```yaml
shutdown:
  drainTimeout: 600s
  minDrainDuration: 60s
```
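
For context, here is a rough sketch of where those fields appear to sit in a full manifest, assuming the custom proxy config is an `EnvoyProxy` resource referenced from the `GatewayClass` via `parametersRef`. The resource names (`custom-proxy-config`, `eg`) and the namespace are placeholders I made up for illustration, and the comments on the two fields reflect my reading of the field descriptions rather than any official guide.

```yaml
# Sketch only: names and namespace are hypothetical placeholders.
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: EnvoyProxy
metadata:
  name: custom-proxy-config        # hypothetical name
  namespace: envoy-gateway-system  # assumed namespace
spec:
  shutdown:
    drainTimeout: 600s     # upper bound on how long a terminating proxy keeps draining
    minDrainDuration: 60s  # minimum time to keep draining before the proxy exits
---
apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
  name: eg  # hypothetical name
spec:
  controllerName: gateway.envoyproxy.io/gatewayclass-controller
  parametersRef:
    group: gateway.envoyproxy.io
    kind: EnvoyProxy
    name: custom-proxy-config        # must match the EnvoyProxy above
    namespace: envoy-gateway-system
```

Presumably `drainTimeout` also needs to stay below the pod's `terminationGracePeriodSeconds`, otherwise the kubelet would kill the proxy before the drain completes.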

However, there's no documentation on this. I happened to find it with `kubectl explain`.

I'm on v1.0.1
