Description
When using EDS to update the endpoint configuration dynamically, DNS resolution is not allowed for hostnames in the EDS response, as the comment says:
```
// The form of host address depends on the given cluster type. For STATIC or EDS,
// it is expected to be a direct IP address (or something resolvable by the
// specified :ref:`resolver <envoy_api_field_core.SocketAddress.resolver_name>`
// in the Address). For LOGICAL or STRICT DNS, it is expected to be hostname,
// and will be resolved via DNS.
```
and DNS cannot be used as the custom resolver:
```
// The name of the custom resolver. This must have been registered with Envoy. If
// this is empty, a context dependent default applies. If the address is a concrete
// IP address, no resolution will occur. If address is a hostname this
// should be set for resolution other than DNS. Specifying a custom resolver with
// STRICT_DNS or LOGICAL_DNS will generate an error at runtime.
```
When adding external services to the load-balancing endpoints via EDS, it would be impractical to use the endpoints' IP addresses, since service VMs can and will go up and down, and the endpoints' IP addresses will change frequently. Supporting hostnames in the EDS response seems like a reasonable solution.
I'm not sure whether the lack of DNS resolution for hostnames is by design or an Envoy restriction. Please let me know, and I can help work on this feature if Envoy needs it.
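To make the request concrete, here is a sketch of the kind of `ClusterLoadAssignment` we would like an EDS server to be able to return. The hostname `external.example.com` is purely illustrative; today Envoy expects a concrete IP address at this position for EDS clusters:

```yaml
# Hypothetical EDS response (sketch) -- the hostname below is illustrative.
# Envoy currently expects a resolvable IP here for STATIC/EDS clusters.
version_info: "1"
resources:
- "@type": type.googleapis.com/envoy.api.v2.ClusterLoadAssignment
  cluster_name: external-service
  endpoints:
  - lb_endpoints:
    - endpoint:
        address:
          socket_address:
            address: external.example.com  # desired: resolved via DNS
            port_value: 443
```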
In Istio, we use the priority field to implement failover logic: when the endpoints at a higher priority are down, the load balancer selects the endpoints at a lower priority. This assumes that all endpoints share the same settings (e.g. TLS context), but sometimes they differ. For example, an external fallback service may not require mTLS, while mTLS is required inside the service mesh. If the external service endpoints and the internal service endpoints are placed in one cluster, traffic to the external endpoints will be broken.
Downstream Envoy setting:
```yaml
clusters:
- name: proxy
  type: strict_dns
  lb_policy: round_robin
  load_assignment:
    cluster_name: proxy
    endpoints:
    - priority: 1
      lb_endpoints:
      - endpoint:
          address:
            socket_address:
              address: proxy
              port_value: 80
    - priority: 0
      lb_endpoints:
      - endpoint:
          address:
            socket_address:
              address: proxy
              port_value: 443
  tls_context: {}
```
Upstream listener setting:
```yaml
listeners:
- address:
    socket_address:
      address: 0.0.0.0
      port_value: 80
  ...
- address:
    socket_address:
      address: 0.0.0.0
      port_value: 443
  ...
  tls_context:
    common_tls_context:
      tls_certificates:
      - certificate_chain:
          filename: /etc/cert.pem
        private_key:
          filename: /etc/key.pem
      validation_context: {}
```
When proxy:443 is down, traffic to proxy:80 will be broken as well, because proxy:80 doesn't support mTLS.
Thanks @PiotrSikora for the solution. Allowing the load balancer to fall back to another cluster would solve the problem: for the case above, split the configuration into two clusters, and the load balancer can select the other cluster when one cluster is down, using that cluster's own settings.
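A sketch of what that split might look like, assuming a cluster-level fallback mechanism exists (the cluster names `proxy-mtls` and `proxy-plain` are hypothetical, and the fallback wiring itself is not shown):

```yaml
# Sketch only: each cluster carries its own TLS settings, so a fallback
# from proxy-mtls to proxy-plain would use the correct (plaintext) config.
clusters:
- name: proxy-mtls           # preferred: inside the mesh, mTLS required
  type: strict_dns
  lb_policy: round_robin
  load_assignment:
    cluster_name: proxy-mtls
    endpoints:
    - lb_endpoints:
      - endpoint:
          address:
            socket_address:
              address: proxy
              port_value: 443
  tls_context: {}            # mTLS settings for this cluster only
- name: proxy-plain          # fallback: external service, no mTLS
  type: strict_dns
  lb_policy: round_robin
  load_assignment:
    cluster_name: proxy-plain
    endpoints:
    - lb_endpoints:
      - endpoint:
          address:
            socket_address:
              address: proxy
              port_value: 80
```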