Setting `respect_dns_ttl` to `true` may result in excessive DNS requests when a host's/endpoint's DNS lookup returns NXDOMAIN

After enabling `respect_dns_ttl` on all clusters globally in our org, we noticed that when a DNS lookup for a host returns NXDOMAIN, Envoy appears to retry the lookup indefinitely and at an excessive rate.

I think that the following config is sufficient to reproduce the issue:

```yaml
static_resources:
  clusters:
  - name: test
    connect_timeout: 5s
    type: STRICT_DNS
    lb_policy: ROUND_ROBIN
    respect_dns_ttl: true
    load_assignment:
      cluster_name: test
      endpoints:
        - lb_endpoints:
            - endpoint:
                address:
                  socket_address:
                    address: does-not-exist-asdkjashdaskfhasalsfhas.com
                    port_value: 8080
```

I think this logic starts somewhere here:

https://github.com/envoyproxy/envoy/blob/e0e7628c3bc4227245f15c4f047ddad04912351c/source/common/upstream/strict_dns_cluster.cc#L14

I think it would be good if Envoy fell back to the default DNS refresh rate when an NXDOMAIN or similar error leaves the response without a TTL.
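
To make the suggested fallback concrete, here is a minimal sketch of the decision I have in mind. It is not Envoy's actual code: `nextRefreshInterval` and its parameters are hypothetical names, and I'm assuming the resolver can report "no TTL available" (modeled here as an empty `std::optional`) when the response carries no records.

```cpp
#include <cassert>
#include <chrono>
#include <optional>

// Hypothetical helper (not part of Envoy): pick the delay before the next
// DNS resolution for a cluster.
//
// min_ttl is the smallest TTL among the returned records, or std::nullopt
// when the response had no records at all (for example on NXDOMAIN).
std::chrono::milliseconds nextRefreshInterval(
    bool respect_dns_ttl,
    std::optional<std::chrono::seconds> min_ttl,
    std::chrono::milliseconds default_refresh_rate) {
  // Only honor the TTL when the option is on and a usable, non-zero TTL exists.
  if (respect_dns_ttl && min_ttl.has_value() && min_ttl->count() > 0) {
    return std::chrono::duration_cast<std::chrono::milliseconds>(*min_ttl);
  }
  // Otherwise fall back to the configured refresh rate, so a failed lookup
  // is retried at a bounded pace instead of immediately.
  return default_refresh_rate;
}
```

With this shape, an NXDOMAIN result (no TTL) and a zero TTL both resolve to the configured refresh rate, while a successful lookup with a positive TTL still uses that TTL.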

Setting respect_dns_ttl to `true` may result in excessive dns requests when hosts/endpoints DNS returns an NXDOMAIN #7808