Setting respect_dns_ttl to true may result in excessive DNS requests when a host/endpoint DNS lookup returns NXDOMAIN #7808

@cetanu

After enabling respect_dns_ttl on all clusters across our org, we've noticed that when a host's DNS lookup returns NXDOMAIN, Envoy appears to retry the lookup indefinitely and at an excessive rate.

I think that the following config is sufficient to reproduce the issue:

static_resources:
  clusters:
  - name: test
    connect_timeout: 5s
    type: STRICT_DNS
    lb_policy: ROUND_ROBIN
    respect_dns_ttl: true
    load_assignment:
      cluster_name: test
      endpoints:
        - lb_endpoints:
            - endpoint:
                address:
                  socket_address:
                    address: does-not-exist-asdkjashdaskfhasalsfhas.com
                    port_value: 8080

I think this logic starts somewhere here.

It would be good if this fell back on the default DNS refresh rate when an NXDOMAIN or similar error leaves the response without a TTL.
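
To make the suggested fallback concrete, here is a rough sketch of the decision I have in mind. It is written in Python purely for illustration and is not Envoy's actual C++ code; the function name `next_refresh_interval` and the 5-second `DEFAULT_DNS_REFRESH_RATE` are hypothetical stand-ins for whatever the cluster is configured with.

```python
from datetime import timedelta
from typing import Optional

# Hypothetical default; stands in for the cluster's configured DNS refresh rate.
DEFAULT_DNS_REFRESH_RATE = timedelta(seconds=5)


def next_refresh_interval(
    respect_dns_ttl: bool,
    ttl: Optional[timedelta],
    default_rate: timedelta = DEFAULT_DNS_REFRESH_RATE,
) -> timedelta:
    """Pick the delay before the next DNS resolution attempt."""
    # Honour the TTL only when the option is on and the response carried a positive one.
    if respect_dns_ttl and ttl is not None and ttl > timedelta(0):
        return ttl
    # NXDOMAIN or a similar failure yields no records and so no usable TTL: fall back.
    return default_rate
```

With something along these lines, a failed lookup (passed here as `ttl=None` or a zero TTL) would be retried at the default rate instead of re-arming the timer immediately.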
