Envoy stops using dns_resolvers from cluster after network restart #13794

@rumanzo

Description

Description:
We use standalone Envoy on CentOS 8 with a Kubernetes installation. When the network is restarted on a node (for example, systemctl restart network.service), Envoy stops querying the resolvers defined in the dns_resolvers section and starts using the system resolver (the DNS servers defined in /etc/resolv.conf). From the documentation, it looks like the behavior resets to the default resolver:
https://www.envoyproxy.io/docs/envoy/latest/api-v2/api/v2/cluster.proto

dns_resolvers

If this setting is not specified, the value defaults to the default resolver, which uses /etc/resolv.conf for configuration.

After a network restart, Envoy no longer uses the defined resolvers (until the Envoy instance is restarted), and after pod state changes in k8s (e.g. a deploy), upstreams are no longer available.
Is there a way to prevent this behavior?
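For reference, the default resolver mentioned in the quoted documentation takes its servers from /etc/resolv.conf, so the fallback targets can be listed directly. A minimal sketch (the sample file contents below are an assumption for illustration, not our real node config; on a real node, run the awk line against /etc/resolv.conf itself):

```shell
# Hypothetical /etc/resolv.conf contents, written to a temp file for the demo
cat <<'EOF' > /tmp/resolv.conf.sample
search namespace.svc.cluster.local svc.cluster.local
nameserver 192.168.1.1
nameserver 192.168.1.2
EOF

# List the nameservers the default resolver would fall back to
awk '/^nameserver/ {print $2}' /tmp/resolv.conf.sample
```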

Repro steps:
systemctl restart network
capture DNS queries with tcpdump, then try to change the upstream count/address
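Spelled out, the repro above looks like this (service and interface names are assumptions for a typical CentOS 8 host; tcpdump needs root):

```shell
# Restart networking on the node
systemctl restart network.service

# Watch which DNS servers Envoy actually queries afterwards; with the
# config below it should keep targeting 10.0.0.1-3 on port 53, but after
# the restart the queries go to the servers from /etc/resolv.conf instead
tcpdump -n -i any port 53
```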

Config:

  clusters:
    - name: somename
      alt_stat_name: somename
      connect_timeout: 1s
      common_http_protocol_options: {idle_timeout: 5s}
      type: STRICT_DNS
      dns_lookup_family: V4_ONLY
      use_tcp_for_dns_lookups: true
      load_assignment:
        cluster_name: somename
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: somename-headless.namespace.svc.cluster.local.
                      port_value: 80
      dns_resolvers:
        - {socket_address: {address: "10.0.0.1", port_value: 53}}
        - {socket_address: {address: "10.0.0.2", port_value: 53}}
        - {socket_address: {address: "10.0.0.3", port_value: 53}}
      dns_refresh_rate: 1s
      health_checks:
        - tcp_health_check: {send: {binary: ""}}
          interval:
            nanos: 300000000
          timeout:
            seconds: 1
          unhealthy_threshold:
            value: 3
          healthy_threshold:
            value: 1

Logs:

[debug][upstream] [source/common/upstream/strict_dns_cluster.cc:167] DNS refresh rate reset for somename-headless.namespace.svc.cluster.local., refresh rate 1000 ms
[debug][upstream] [source/common/upstream/upstream_impl.cc:286] transport socket match, socket default selected for host with address 10.0.1.1:80
[debug][upstream] [source/common/upstream/upstream_impl.cc:286] transport socket match, socket default selected for host with address 10.0.1.2:80
[debug][upstream] [source/common/upstream/upstream_impl.cc:286] transport socket match, socket default selected for host with address 10.0.1.3:80
[debug][upstream] [source/common/upstream/strict_dns_cluster.cc:167] DNS refresh rate reset for somename-headless.namespace.svc.cluster.local., refresh rate 1000 ms
[debug][main] [source/server/server.cc:190] flushing stats
[debug][upstream] [source/common/upstream/strict_dns_cluster.cc:174] DNS refresh rate reset for somename-headless.namespace.svc.cluster.local., (failure) refresh rate 1000 ms
[debug][upstream] [source/common/upstream/strict_dns_cluster.cc:174] DNS refresh rate reset for somename-headless.namespace.svc.cluster.local., (failure) refresh rate 1000 ms
[debug][upstream] [source/common/upstream/strict_dns_cluster.cc:174] DNS refresh rate reset for somename-headless.namespace.svc.cluster.local., (failure) refresh rate 1000 ms
[debug][upstream] [source/common/upstream/strict_dns_cluster.cc:174] DNS refresh rate reset for somename-headless.namespace.svc.cluster.local., (failure) refresh rate 1000 ms
[debug][upstream] [source/common/upstream/strict_dns_cluster.cc:174] DNS refresh rate reset for somename-headless.namespace.svc.cluster.local., (failure) refresh rate 1000 ms
[debug][main] [source/server/server.cc:190] flushing stats

Versions:
OS: CentOS 8.2.2004
Envoy: 1.16
